You Have to See the Cyber Criminal to Catch the Criminal

35 min read

You have to see the cyber criminal to catch the criminal. The most relevant data to monitor, ranked.

[VIDEO] Watch "You have to see the criminal to catch the criminal. The most relevant data to monitor, ranked" webinar with the SANS Institute, or jump to your preferred topic below.

Teams working on network monitoring and workstation security need to know what data to monitor and what to prioritize.

In this previously recorded webinar with the SANS Institute, Critical Insight CTO Mike Simon and SANS Security Instructor Brandon McCrillis provide actionable take-aways to make network monitoring routines more efficient.

With these tips, InfoSec professionals can learn not only WHAT priority data sources to collect, but WHY these data sources are higher priority for monitoring cybersecurity-related activities on the network.

In Mike’s presentation (starting at 00:36 min), he elaborates upon the following:

How to best use each log type to identify and prioritize the most critical data sources:
- A prioritized list of Data Sources, often specific to a product or operating system (e.g., Active Directory logs)
- A prioritized list of Log Types, which can often answer the why questions (e.g., user logins)
How to rank that information from most critical to least critical
How teams can work together to improve network monitoring and workstation security

Mike also covers why this type of prioritization is so important for those managing the day-to-day tasks related to network monitoring. Mike also offers guidance on how to handle de-prioritized sources that may actually hinder network monitoring.

Presentation by Brandon McCrillis, SANS Instructor

Presentation by Mike Simon, CTO, Critical Insight

WEBINAR TRANSCRIPT
Editor's Note: The following features CTO Mike Simon’s transcript from the webinar.

Mike Simon: 37:23

Thank you, Brandon, and I have to sort of interject here. One of the most important things you had to say in all of this, I think, was essentially that you do have to prioritize. We're going to launch here into a little bit of how to have the conversation with the rest of your network team, with the rest of your organization, about what you collect and why you do so. I'm also going to poke a little bit of fun at you by saying that you simultaneously said, "Oh yeah, don't collect all the things, that's the wrong attitude," and then you named all the things. Truly, what happened there was he didn't name all the things. There are so many data sources available to a modern network administrator or security team that the seemingly somewhat exhaustive list Brandon went through is actually scratching the surface. What I'm going to do here quickly is walk through how to categorize those things in a way that lends itself to great conversations about why you're collecting something and what you expect to be able to do with it. Your data collection, your use of that data, and your incident response all have to be purposeful. So let's dive into that just a little bit.

Mike Simon: 38:48

One of my great... actual, not-quotes is this one from Willie Sutton. If you're not familiar with Willie, he was a bank robber who evaded, I believe, the FBI for nearly 40 years, gained some notoriety, and had a couple of books written, supposedly by or with him. "Why do you rob banks?" "Well, because that's where the money is." Why do we look at logs? Why are we talking about the things we're talking about today? It's because that's where the evidence of what's happening on your network is. These are the sources of truth regarding what's happening on your network. If you roll all of those things up into the most important, the most clear source of truth...

Mike Simon: 39:41

Brandon talked a bit about packet capture. Packet capture is truth: whatever happened in a packet capture happened on your network. We saw it, it went past us. Those bits are true, but it's extraordinarily voluminous. Brandon mentioned, I think a couple of times, that he'd love to be able to capture 90 days of packet capture for the edge of most networks. In the solution that CI puts in place, we capture 14 days. That's our baseline guarantee; if we have more space on a collector, we collect more. But for the very reason that Brandon pointed out, among all of the things happening, packet capture is complete. It is a complete view of what happened at the point of capture.

Mike Simon: 40:36

Let's see, some quick disclaimers. We talked a lot about logs and events and data and things like NetFlow. NetFlow is actually a structured data source. It arrives from the network not truly as a log. If we wanted to have a half-hour gunfight regarding what we really should call all these things, that's enjoyable and for a different podcast or a different cast. I'm going to refer to all of these things pretty typically as logs. They are that atomic thing that you can collect from a data source that tells you something about what's going on. So you've got all of those things that Brandon mentioned. You've got network devices and endpoints, servers, and stuff that generate a ton of information. If you're wondering how big that ton of information is, turn on all the logging from all your stuff, centralize it, and see how fast your disk goes away. Packet capture, again, is the most voluminous of these. On an average week we're looking at a little bit over two petabytes of packet capture, just in general, and extracting artifacts from that packet capture.

Mike Simon: 42:08

Logs, on the other hand, don't take up all that much space, especially as you compress them for archive and so on. I'll talk a little bit about what that looks like later on. Properly analyzed, though, you get useful data. Unfortunately, if you collect all of the things, there's too much of it, and by volume most of it's useless. Of all of the things that your system is logging, all of the things that you might collect, the greater bulk of it is actually not useful in terms of analysis for events in information security.

Mike Simon: 42:51

I don't know about you, but I don't have infinite time, storage, or patience to look through everything. So Brandon had a pretty good slide regarding this. I'm not going to have to spend a ton of time on it. In general, what you're trying to do with all the information you're collecting has a lot of different dimensions. You want to track operational attributes. How is your network doing? Are you running out of disk space? All kinds of stuff that your systems are telling you about.

Mike Simon: 43:20

You are trying to meet regulatory requirements, depending on what regulatory environment you're living in. Healthcare is one of the most interesting ones, in terms of the fact that HIPAA has been around for quite a long time and has a lot of prescriptive things about what you'll collect and what you'll keep track of and so on and so forth. My other favorite, actually, is the North American Electric Reliability Council CIP requirements for electric utilities. Very prescriptive: you will keep the following things.

Mike Simon: 43:50

There is some overlap between those regulatory required things and security monitoring, but it's not 100%. You're also, in a lot of cases, creating a forensic record. So, for example, watching file system changes on an endpoint can be interesting, and in fact is extremely interesting when a ransomware event occurs: you start to see after the fact what the extent of that attack was, what they were trying for first, potentially even detecting it in its early stages using that kind of monitoring. But 99.9999999, some extraordinary number of nines, percent of the time, it's your users accessing files.

Mike Simon: 44:40

So forensically, very, very useful in that case; detectively, occasionally useful. And then monitoring for security threats, kind of the focus of what we're talking about now. When you're having this conversation with your ops team, if you're in information security yourself, and they say, "Yeah, I really want to know the uptime on the firewall. We want to keep track of that." Yes, they need to know that. And if you give Brandon, I think, or myself any bit of information, we can tell you how it could be used to monitor security situations on your network.

Mike Simon: 45:27

I'll diverge briefly for a short story here: the CPU load on a firewall was once used to exfiltrate data via ping timing from the outside. It was an unusual circumstance, a very high-profile target. Firewall CPU load was actually being modulated such that the ping times were modulated, such that you could, literally in this case through Morse code, send data from the inside to the outside without actually sending any real data from the inside to the outside. So yes...

Mike Simon: 46:03

From the inside to the outside. So yes, operational data can be and is used for security monitoring. For the most part, though, you want to focus on those things that you can most directly attribute to an attack.

Mike Simon: 46:21

Regulatory requirements. If you're within a regulatory environment, you must do these things or you have the potential of being fined. One of my other favorite things about NERC CIP requirements is that their fining authority is, I believe, still $1 million per day, per incident, backdated to whenever you started screwing up.

Mike Simon: 46:41

That particular regulation, people pay a fair amount of attention to. I think it goes back to the "why aren't you turning off your NT 4.0 boxes" from Brandon's talk. You don't turn those boxes off if they each generate $1 million a day. You make sure you meet your regulatory requirements because they might cost you $1 million per day. So it usually comes with a very prescriptive list and overlaps with operational and security as well.

Mike Simon: 47:10

So the last category here: outage, failure, and exception logs. When things break, when things aren't working, when your monitoring of them shows that they're either degrading or completely unacceptable or unavailable. There's a greater overlap in some cases with security. So, for example, when things start going offline, that could be because, from an IT perspective, they broke, or it could be through an attack, either a denial of service attack of some sort or another kind of degradation of that service by an attacker. So there is a fair amount of overlap there.

Mike Simon: 47:47

There's a tremendous amount of overlap with forensic capture for failures and exceptions. Again, for ransomware, as things start to fail, you want as much detail about how and why after the fact to be able to figure out what that attack vector was to be able to figure out how you might've prevented it and in the context of this discussion to figure out what you should have been monitoring to tell you about it as early as possible.

Mike Simon: 48:20

The temptation is to just start with a list of stuff and justify the existence of your list and say, here's this big uncategorized pile. If you start with categories and talk about why (why do I want this, what will it do for me?), and that "why" bit is actually what we're going to walk through in the rest of this presentation, you have a much better conversation with those people that you're interacting with in the organization.

Mike Simon: 48:53

You're able to say, I need this, and need is an important aspect here, I need this in order to see when an attacker is performing a password spray for example. Of all of the types of sources that Brandon mentioned in his presentation, you heard login come up time and time again and it's not because Brandon has a particular affinity for login, it's because it's one of the most important, if not the most important sources that you can use to understand what's going on from a security perspective in your network.
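Editor's Note: As a rough illustration of the password spray example above, the Python sketch below flags any source that fails logins against many distinct accounts within a short window. The record layout, the function name `find_password_sprays`, the sample addresses, and the thresholds (5 accounts, 10 minutes) are assumptions made for this example; none of them come from the webinar.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_password_sprays(failures, min_accounts=5, window=timedelta(minutes=10)):
    """Return {source: most distinct accounts failed within any one window}.

    `failures` is an iterable of (timestamp, source, account) tuples for
    failed logins. The thresholds are illustrative, not recommendations.
    """
    by_source = defaultdict(list)
    for ts, source, account in failures:
        by_source[source].append((ts, account))

    suspects = {}
    for source, events in by_source.items():
        events.sort()  # order each source's failures by timestamp
        start = 0
        for end in range(len(events)):
            # Advance the left edge until the span fits inside `window`.
            while events[end][0] - events[start][0] > window:
                start += 1
            accounts = {acct for _, acct in events[start:end + 1]}
            if len(accounts) >= min_accounts:
                suspects[source] = max(len(accounts), suspects.get(source, 0))
    return suspects


# Hypothetical sample data: one source trying six accounts in under three
# minutes, and one user mistyping the same password ten times.
t0 = datetime(2024, 1, 1, 9, 0, 0)
sample = [(t0 + timedelta(seconds=30 * i), "203.0.113.7", f"user{i}") for i in range(6)]
sample += [(t0 + timedelta(seconds=10 * i), "198.51.100.20", "alice") for i in range(10)]

sprays = find_password_sprays(sample)
print(sprays)
```

Counting distinct accounts per source, rather than raw failures, is what separates a spray from a single user repeatedly mistyping a password.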

Mike Simon: 49:29

If you're a Windows shop, there are some very specific aspects of Windows logging that will give you that, but one of the reasons to categorize is to be able to also say, well, I also want these TACACS login streams from my network infrastructure. I also want RADIUS from, let's say, some of my edge devices. Those are also login streams, also critically important to understanding attacks against that infrastructure.

Mike Simon: 49:58

So why are we going to go down this path? Well, one of the things I always suggest to people is, if somebody has a framework that you can use for this kind of categorization, use it. There are a few out there. Inventing your own means you'll have to maintain a framework and try to keep track. Basically, if you're not a security professional whose job it is to maintain one of these frameworks, don't do it. Pick an attack categorization framework.

Mike Simon: 50:30

Because what we're really trying to look for here is evidence of attacks. In some cases, evidence of the successful attack and the subsequent actions on that attack. But an attack framework works really well. Actually, when Brandon was raised as the co-presenter by SANS for this presentation, I was really overjoyed to see what his background was. When I'm having these conversations, either internally for my company as we build out detection systems, or with customers or consulting clients in the field...

Mike Simon: 51:04

That attack scenario, that red team view of things is one of the most important interesting conversations you can have with respect to, "Hey Brandon, what terrifies you in terms of monitoring? What is the thing that you least want that target to be doing? Because that's what I want to focus on first."

Mike Simon: 51:24

So you want the high-level stuff of things to be looking for from that framework. You want a lot of detail about what that looks like when you see it. So what does a login failure look like? Well the characteristics of a login failure are usually, there's a source that initiated it. There is an account that was attempted to be logged into.

Mike Simon: 51:51

There was a destination, the host or the system that was attempted to be logged into. But between domain controller logs, RADIUS, and TACACS, the formats are all different. And that, again, is why you do this high-level categorization, so you can kind of say, I don't care what the format is of the source, I just know I want lots of login stuff. So we can move on a little bit.
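Editor's Note: The point about differing formats can be sketched as a small normalization layer that maps each login stream onto the three characteristics described above: source, account, and destination. The Python below is a minimal sketch that assumes records have already been parsed into dictionaries. The field names are loosely modeled on common Windows, RADIUS, and TACACS+ fields but should be treated as assumptions, as should the `LoginFailure` type and the sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginFailure:
    source: str       # where the attempt came from
    account: str      # account that was attempted
    destination: str  # host or device being logged into
    stream: str       # which log stream reported it


# Each function maps one already-parsed record onto the common shape.
# Field names are assumptions; real formats vary by product and version.
def from_domain_controller(rec):
    return LoginFailure(rec["IpAddress"], rec["TargetUserName"].lower(),
                        rec["Computer"], "domain_controller")


def from_radius(rec):
    return LoginFailure(rec["Calling-Station-Id"], rec["User-Name"].lower(),
                        rec["NAS-IP-Address"], "radius")


def from_tacacs(rec):
    return LoginFailure(rec["rem_addr"], rec["user"].lower(),
                        rec["device"], "tacacs")


NORMALIZERS = {
    "domain_controller": from_domain_controller,
    "radius": from_radius,
    "tacacs": from_tacacs,
}


def normalize(stream, rec):
    """Convert a parsed record from a named stream into a LoginFailure."""
    return NORMALIZERS[stream](rec)


# Hypothetical records: the same account attacked through three streams.
raw = [
    ("domain_controller", {"IpAddress": "203.0.113.7", "TargetUserName": "JSmith",
                           "Computer": "dc01.example.test"}),
    ("radius", {"Calling-Station-Id": "203.0.113.7", "User-Name": "jsmith",
                "NAS-IP-Address": "192.0.2.1"}),
    ("tacacs", {"rem_addr": "203.0.113.7", "user": "jsmith", "device": "core-sw-1"}),
]

failures = [normalize(stream, rec) for stream, rec in raw]
accounts_targeted = {f.account for f in failures}
print(accounts_targeted)
```

Once every stream lands in the same shape, a detection rule can be written against "login failures" as a category instead of against each product's format.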

Mike Simon: 52:22

There are lots of possibilities for the frameworks you could choose. Lockheed Martin, years and years and years ago, came out with a very useful tool called the intrusion kill chain, often referred to as the Cyber Kill Chain. That is essentially a mechanism for describing those things that must happen for an intrusion to be successful.

Mike Simon: 52:44

The idea is that if you can interrupt certain aspects of that kill chain, or kill a piece of that chain, the attack itself will be unsuccessful. You know, starting with reconnaissance, where the attackers are trying to figure out what resources to attack, what you have, what your edge looks like, how to get in through that edge, and so on and so forth.

Mike Simon: 53:09

MITRE's Adversarial Tactics, Techniques, and Common Knowledge, the ATT&CK framework, spelled correctly there with the ampersand, is the one I tend to use internally. It's modern, it's complete. I'll show you a quick screenshot of how complete it is, I think in the next slide. Point being, pick one that makes sense to you. Pick one that works and that you understand, and use that.

Mike Simon: 53:37

Use it internally, consistently, and I think almost anything published is going to be useful for you in this case. Last one is actually one I've not used: the cybersecurity academies' combination of the kill chain and the ATT&CK framework. Sounds interesting; I've never found a need to actually use those as a combined resource. Okay. That chart view, at the bottom, is the complete list of essentially all of the super categories in the ATT&CK framework. As long as you all have the next 16 hours to walk through this, we'll go through each and every one of them. I'm looking at the clock, though, thinking maybe we don't have that much time, so we won't. What I will do is talk a little bit about the highest levels of each of these categories. Going to blow past this.

Mike Simon: 54:36

This is a representation of these highest levels. What I will do is dive into those examples of using categorization with this tactics list from the ATT&CK framework, how that applies to you, and how you might be able to justify and explain how you would use a data source with this framework.

Mike Simon: 54:58

So probably the easiest thing to do is start looking at the physical sources that you have: domain controller logs, endpoint security logs from your desktops if you have some level of endpoint security on desktops, your firewall logs, your NetFlow, and so on. And then, looking at the ATT&CK framework, say, if I had my domain controller logs (and by the way, this is not exhaustive; I'm giving examples here, and you would probably have a lot more in each of these categories), what would I see in those domain controller logs with respect to the ATT&CK framework tactics list?

Mike Simon: 55:38

Well, pretty obviously you would see attempts or success at initial access. You would see very likely privilege escalation depending on how it's accomplished. If it's accomplished through an operating system bug that doesn't actually create a login event, you wouldn't. If it's a login event because of an operating system bug, you would see that. And again, I'm not going to exhaustively walk through this slide, but from firewall logs, again, initial access, but more interestingly from a firewall, you would, at least forensically, and hopefully detectively see exfiltration. You would see the event where an attacker who has gained a foothold started moving data through your perimeter to a third party.
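Editor's Note: The tagging exercise described here can be recorded as a simple lookup. The Python sketch below uses only the two sources and three tactics named in this part of the talk; the variable and function names are assumptions, and a real inventory would carry many more entries.

```python
from collections import defaultdict

# Source -> ATT&CK tactic tags, limited to the examples given in the talk.
SOURCE_TACTICS = {
    "domain controller logs": ["Initial Access", "Privilege Escalation"],
    "firewall logs": ["Initial Access", "Exfiltration"],
}


def sources_by_tactic(source_tactics):
    """Invert the mapping: for each tactic, which sources can show it?"""
    index = defaultdict(list)
    for source, tactics in source_tactics.items():
        for tactic in tactics:
            index[tactic].append(source)
    return {tactic: sorted(sources) for tactic, sources in index.items()}


by_tactic = sources_by_tactic(SOURCE_TACTICS)
for tactic in sorted(by_tactic):
    print(f"{tactic}: {', '.join(by_tactic[tactic])}")
```

Reading the table by tactic rather than by source is what answers the question "what would I use to see this kind of attacker activity?"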

Mike Simon: 56:29

So you get the idea of how you can take that ATT&CK framework and categorize physical sources here. There's another level of abstraction that I like to walk through that allows for almost management-level discussions of this, so that you're not talking about something that kind of makes your fellow managers' eyes glaze over. In terms of domain controller logs, you're talking about access control as a topic. So within access control, if you hand me access control logs, not too surprisingly I see the same things as I would in the domain controller logs, because what do domain controllers do?

Mike Simon: 57:08

They, among other things, control access. So this gives you another sort of abstraction type to use in terms of categorizing the various data sources you might have. A little bit more of the same, where we talk about the highest level for security analysis, as opposed to operations, as opposed to just doing monitoring of uptime and availability, network capabilities, and so on. For security analysis, a quick cut on the high-level abstractions is endpoint security, access control, network devices, network device information in the form of flow logs or ACL violations, and security controls themselves: firewalls, intrusion detection systems, and so on. If we were to break down a little bit on the endpoint bits here, you start to see the actual output of this abstraction categorization process. So in endpoint security, there are more categories than I have listed here, but let's say AV, UEBA, and host intrusion detection systems. Under AV, you may have one or more of these actual sources, and not all of them are the same.

Mike Simon: 58:32

And they don't even produce exactly the same kinds of information. But from an attack tactics perspective, they may actually give you the same view, the same kind of information that you can then use in a rule to detect an event happening at that level. So what we're arming you with here is the answers to why: why this source? If in fact I can get this source and I can record it, and I can record it for some period of time, which is important to me, what am I going to do with it?

Mike Simon: 59:13

I think if you can't answer the why or the what, as Brandon alluded to earlier... and actually, one of the quotes that I cannot recall verbatim in my own head, one of the quotes Brandon used earlier, was that if you're collecting a thing and you are not able to watch it or use it, in some ways that's actually worse. You might have a false sense of security, and there are actually other issues with, well, you had this evidence in front of you and yet this harm occurred. You can actually incur some liability in that case.

Mike Simon: 59:49

So disk space is relatively cheap. Compressed logs (not packet capture, but logs) are pretty tiny, and from a forensics standpoint it makes sense to collect more. From a detection standpoint, you need to have a reason. You have to have a specific detection-based reason for collecting, subjecting that thing to an analysis, and then actions that are implied based on your analysis and what you detected.

Mike Simon: 01:00:15

Sorry for the pause. I'm trying to get the computer systems to cooperate. There we go. And my coworkers, of course, will be familiar with this. It gives us the ability to say, "No, I don't need that. I can't use it. Maybe somebody else can use it and that's great for them, but I don't need it myself." So the "nope" is goal-based. It's not, "Oh, that's a silly data source. Why did you even bring it up?" It's, "None of us can figure out how this would tell us about an attacker in our network. If you can come back to me with how this would explain that to me, I'm happy to use this data source."

Mike Simon: 01:01:19

And again, I'll reiterate: the best conversation I have about this particular topic is with our in-house red team, and really any red team that is actually creating the artifacts, kindly, for me. So what to do with this? Inventory everything you might get. Pick the adversary model, like we explained. Select those high-level abstractions so that you can actually have the conversation internally with stakeholders and with management, so that they actually understand what you're trying to accomplish here.

Mike Simon: 01:01:52

And why, for example, you're asking your network or systems team to give you these data sources, because the converse conversation is also true. The conversation about, "Hey, if you don't give me this, I cannot possibly detect the following list of attack vectors." Also a very interesting and useful conversation to have. Walk through that inventory. Do the classification and walk through your classified buckets with the adversary model tagging things with that source.
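Editor's Note: The "if you don't give me this, I cannot detect the following" conversation can be backed by a small calculation over the tagged inventory. The Python sketch below is illustrative only: the inventory reuses the two examples from earlier in the talk, the `REQUIRED` set is a hypothetical choice of tactics a team might care about, and all function names are assumptions.

```python
# Tagged inventory: source -> set of ATT&CK tactics it can reveal.
# Entries reuse the examples from the talk; a real inventory is much larger.
INVENTORY = {
    "domain controller logs": {"Initial Access", "Privilege Escalation"},
    "firewall logs": {"Initial Access", "Exfiltration"},
}

# Hypothetical selection of tactics the team has decided it must be able to see.
REQUIRED = {"Initial Access", "Privilege Escalation", "Exfiltration", "Lateral Movement"}


def covered(inventory):
    """All tactics visible through at least one source."""
    return set().union(*inventory.values())


def gaps(inventory, required):
    """Required tactics that no current source can show."""
    return required - covered(inventory)


def lost_without(inventory, source):
    """Tactics that become invisible if `source` is not provided."""
    remaining = {s: t for s, t in inventory.items() if s != source}
    return covered(inventory) - covered(remaining)


current_gaps = gaps(INVENTORY, REQUIRED)
firewall_loss = lost_without(INVENTORY, "firewall logs")
dc_loss = lost_without(INVENTORY, "domain controller logs")

print("Not covered by any source:", sorted(current_gaps))
print("Lost without firewall logs:", sorted(firewall_loss))
print("Lost without domain controller logs:", sorted(dc_loss))
```

The first result is the list to take to a red team or penetration tester; the other two are the lists to take to the network or systems team when asking for a source.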

Mike Simon: 01:02:25

I guarantee you, if you walk through these steps, that list that you end up with is something that a company like the one Brandon works for that does red teaming, anybody that's doing a penetration test, anybody that's doing detection on your systems, can have a really interesting conversation with you about why. You know, what it is that you might be missing: "Hey, do you have this source? Because I don't see it in your categorized list."

Mike Simon: 01:02:54

And here's kind of some of those bits about how you might approach this. If you have an enterprise SIEM, take a look at the rule set and what you just created from that last slide. Make sure the SIEM is applying correlative and detective rules against the data as you expect. So is the SIEM, in its rule set, able to actually do the things that you expected from the ATT&CK framework classification?

Mike Simon: 01:03:20

If the answer is no, you've got some digging to do. If you are rolling your own, the same thing applies, except it's all on you. And if you have a managed service like what CI provides, that's the conversation to have with us. Here are the sources I'm providing you. What are you able to do with those sources? Are there gaps?

Mike Simon: 01:03:47

I'm not really good at dumping reams of information on folks, and this wasn't intended to be that. This was intended to be structure around the conversation. So I'll walk through this really quickly. High-level sources, ranked: not surprisingly, access control data at the top; security control data from firewalls, WAFs, IDSs; endpoint data from those various UEBA and AV systems and so on; correlative streams. And actually, I would include in correlative streams what Brandon talked a little bit about: open source.

Mike Simon: 01:04:29

One of the things that I love to see, and we do ourselves, is take open source information and then, as we see an event, as we see something in the data stream that is related to that event, either by source, by URL, by IP, subnet, and so on and so forth, correlate it. If there's anything in that open source intel that I can correlate to an event stream, I like to be able to put it in front of analysts so that they can make better decisions. And then network content data, obviously: the network-based AV, ACLs, reputation hits, and so on. And then device data, primarily for forensic purposes, in terms of network streams.
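Editor's Note: Correlating open source intelligence with an event stream, as described above, might look roughly like the Python sketch below. It matches events against intel entries by IP address, subnet, or URL and attaches any hits for an analyst to review. The intel entries, event fields, notes, and function names are all hypothetical; the only library used is Python's standard `ipaddress` module.

```python
import ipaddress

# Hypothetical open source intel entries (documentation address ranges only).
INTEL = [
    {"type": "subnet", "value": "203.0.113.0/24", "note": "reported scanning range"},
    {"type": "ip", "value": "198.51.100.20", "note": "reported C2 address"},
    {"type": "url", "value": "http://example.test/payload", "note": "reported malware URL"},
]


def intel_hits(event, intel):
    """Return the notes of every intel entry that matches this event."""
    hits = []
    src = ipaddress.ip_address(event["src_ip"]) if "src_ip" in event else None
    for entry in intel:
        kind, value = entry["type"], entry["value"]
        if kind == "subnet" and src is not None and src in ipaddress.ip_network(value):
            hits.append(entry["note"])
        elif kind == "ip" and src is not None and src == ipaddress.ip_address(value):
            hits.append(entry["note"])
        elif kind == "url" and event.get("url") == value:
            hits.append(entry["note"])
    return hits


def enrich(event, intel=INTEL):
    """Copy the event and attach matching intel so an analyst sees both."""
    return {**event, "intel_hits": intel_hits(event, intel)}


# Hypothetical events from a normalized stream.
events = [
    {"src_ip": "203.0.113.45", "action": "login_failure"},
    {"src_ip": "192.0.2.10", "url": "http://example.test/payload"},
    {"src_ip": "192.0.2.11", "action": "login_success"},
]

enriched = [enrich(e) for e in events]
for e in enriched:
    print(e["src_ip"], e["intel_hits"])
```

Exact URL matching is a deliberate simplification here; real feeds usually need normalization of scheme, host case, and path before comparison.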

Mike Simon: 01:05:18

And then, I think, the final kind of dump screen of the lists: what would be on those lists? Your domain controller logs, VPN logs, firewalls, switches, databases, and IDS and HIDS. These are, by the way, both of these screens, this one and this one, ranked, at least in my opinion, in terms of level of importance. And then, of course, in this screen you also have the ATT&CK framework classification for each of these source types.

Mike Simon: 01:05:52

And I'm not going to spend a lot of time on this because I know we're out of time, but let's see... Yeah, I think each of you will get this deck, or at least have seen this deck. You can read this one yourself. With that, I'll conclude and hand it back to Carol.