How Can We Fix Alert Fatigue?

Useful alerts are critical in cybersecurity. But getting inundated with useless alerts wastes resources and our attention. How do we build out an alerting system that actually works?

Check out this post for the discussion that is the basis of our conversation on this week’s episode, co-hosted by me, David Spark (@dspark), the producer of CISO Series, and Steve Zalewski. Joining us is our sponsored guest, Itai Tevet, CEO, Intezer.

Got feedback? Join the conversation on LinkedIn

Huge thanks to our sponsor, Intezer

Intezer’s AI-driven solution automates alert triage and investigations, cutting through the noise to highlight serious threats. By integrating with your security tools, it escalates only 4% of alerts for fast remediation, helping SOC teams focus on what matters. Learn more at intezer.com today!

Full Transcript

Intro

0:00.000

[David Spark] Useful alerts are critical in cybersecurity, but getting inundated with useless alerts wastes resources and our attention. So, how do we build out alerting that actually works?

[Voiceover] You’re listening to Defense in Depth.

[David Spark] Welcome to Defense in Depth. My name is David Spark, I’m the producer of the CISO Series. And joining me for this very episode, all the way from Boston, Massachusetts, which is my hometown as well, it’s Steve Zalewski. Steve, say hello to the audience.

[Steve Zalewski] Hello, audience.

[David Spark] More from Steve in just a moment. Our sponsor for today’s episode is Intezer. Extend your security team with AI. And we’re going to be talking about that today. And in fact, our guest comes from Intezer, more on that in just a moment. But first, Steve, let’s talk about alert fatigue. It is a real problem.

No one wants to miss a critical alert. So, by default, we cast an overly wide net. This ensures we get all the alerts we need, but also a lot of noise we’d otherwise like to ignore. Automation was supposed to get us to the promised land, but these systems inevitably need humans in the loop, which means someone is still getting alerts.

So, Steve, you said on LinkedIn that there didn’t seem to be any consistent agreement on the problem itself, let alone a solution. So, how do we start defining the problem in order to solve it?

[Steve Zalewski] And I would say that was the whole reason for the post and why we’re talking today.

[David Spark] We got some good answers on that question too.

[Steve Zalewski] We really did. This was another one where I got almost more comments than likes, and so clearly a topic of high interest.

[David Spark] So, that means they like to comment on your post, but they didn’t like your post.

[Steve Zalewski] That could be, see, you know what I mean? But I’ll take active comments over liking any day, okay? Because now we’re going to solve the problem, right? But that was just it, which was what’s the problem? And it was so fascinating, and we’re going to talk about this, to see all the comments as to, it’s not that we’re getting all the alerts, it’s not that we’re getting the right alerts.

It’s that humans are generating more alerts. It just seemed like, once again, we’re trying to measure the wrong thing.

[David Spark] We will come to some very interesting solutions in the discussion because of the clear understanding of the problem, and the person to help us with that very discussion is our sponsor guest from our sponsor Intezer. It is the CEO, the man who started it all over at Intezer, none other than Itai Tevet.

Itai, thank you so much for joining us.

[Itai Tevet] Great to be here, David.

What would a successful engagement look like?

2:42.947

[David Spark] Nathan A. Larson said, “Ignore what the vendors tell you, 100% accuracy does not exist any more than 100% secure. Trust the architects and vulnerability engineers. Focus on the irreplaceable assets the business cannot lose. Protect against threats that are most likely, easiest, automated. Review, revise, repeat.” And Subbarayudu Darisipudi of Optiv said, “Generating 1,000 alerts a day when capacity can only handle 100 is begging for trouble. Business and SecOps is not aligned. This assumes that you have the basics taken care of without which this is all meaningless.”

So, I want to land on Mr. Darisipudi’s comment right here. First is, let’s just put this aside. Let’s just assume we have the basics in place because if we don’t, then this really muddies up this whole discussion right now. So, I just threw that line in there for that purpose. But that line of getting 1,000 alerts a day when you have capacity for 100, I think that boils it all down.

That is the problem right there. Steve, yes?

[Steve Zalewski] Yes. Because it’s not actually a capacity problem, right? The goal is not to hire more resources to then be able to handle 1,000 alerts because if you do, you’re just going to generate 2,000 alerts. Because I mean, what he’s getting at here is, are you alerting on the right thing, not just generating noise?

And I think a lot of the SOC historically has been a noise generator, not a value generator.

[David Spark] That’s a very good point. All right. Itai, you were very much nodding your head at Darisipudi’s comment about 1,000 alerts when you can only have capacity to handle 100. I mean, it gets to the thing, well, we need to capture everything. It’s like, well, you’re just asking for trouble. Let’s say even if you do it, how are you going to solve the problem?

You’re not.

[Itai Tevet] Yeah, I mean, every company has eventually a lot of different problems, but it comes down to one common denominator, which is, “I feel I need more people on my team.” Now, that is really the common feeling. Now, we can talk about whether that’s the right thing to do to add more people, but essentially the talent shortage is by far the most common problem that I hear from almost every CISO that I talk to.

[David Spark] Let me also comment on Nathan Larson here is, Nathan said this is the problem. You got to lean on the engineers to look at it. If things are not tuned well, you got to fix it and keep fixing it and keep fixing it. I mean, it’s rarely perfect out of the box, is it, Itai?

[Itai Tevet] It’s never perfect out of the box, and I would argue that even after working very hard to tune and fine-tune, most organizations are struggling to get to that point where it’s okay, right? It’s even manageable. I think that it’s always a challenge, even for those who spend a lot of resources to continuously fine-tune.

[David Spark] So, even the fine-tuning is coming too short, and is that just because we’re just still dealing with too intense a volume?

[Itai Tevet] Yeah, that’s true. So, basically, the more that we progress as an industry, the better the detection tools become, and they generate more noise because there’s a lot of anomalies out there, and they are doing a good job. But then the SOC team now handles a lot of abnormal things that are happening in the network, which makes sense, but obviously, it’s a very big challenge to handle the volume.

[David Spark] So, it just becomes a higher volume of “abnormal.” You’ve got to fine-tune that issue as well.

[Itai Tevet] Exactly.

How can we automate this?

6:32.869

[David Spark] Bil Harmer, CSO over at Craft Ventures, said, “Stop treating them as alerts and start treating them as data points. Some data point will cross a threshold either individually or in combination. Where in the world of finance do we send the data? To the traders, not customer support. This is why I firmly believe that some form of AI/ML needs to be put in so this data is consumed, correlated with other business relevant data points, this is key, and risk-based decisions are made.”

And Brian A. of Tanium Cloud said, “Alerts are just another set of data, just like logging. Treat alerts as observables from your various tools, make searchable, build your detections, time windows, and attributions on top of those datasets to create actionable events for your automation and humans to investigate. While an alert may not be actionable in initial detection, it may have value as you build your attribution to a possible event as you build out your timeline.”

So, Steve, I throw this to you. This is saying just stop treating them as alerts alone and stop treating them as anomalies. They have to be correlated to other behavior in the business.

[Steve Zalewski] Yeah, but even that’s not good enough because if I look at what Bil Harmer said, right, and what we were talking about just earlier, which was if you think your job is to secure your company, then you’re trying to put a protective envelope around everything that’s happening, and therefore you want to see everything, and you never want to forget anything.

And that just turns into this giant mass of data and a lot of noise. And yet what you’re hearing now is, “So, let’s not protect the company, let’s protect the business.” And let’s talk about what’s important to the business. And now let’s talk about alerting relative to how the business will react to the alert, not how security will investigate the alert.

Now that’s a lot that I just said, but what you’re hearing is let’s do less with less. Let’s pull in less logs, let’s protect the business, not secure the company, and let’s make sure that the business understands what to do when we find something to tell them about.

[David Spark] Itai, you were literally nodding at everything that was just said, from the quotes to Steve. So, I don’t know where to begin. I’ll let you begin. [Laughter]

[Itai Tevet] Wow, it’s hard to pick exactly where to begin. Well, generally speaking, both from the quote of Brian and Bil, they’re basically saying, “Hey, don’t treat every alert as if it’s a big red siren running off, right? Treat them as logs.” I think we’re actually a bit past that as an industry.

I think most people treat those items, logs coming out of those solutions as just logs. And then they use something like a SIEM in order to create rules on top of the initial logs. So, there’s always an opportunity to refine the rules and detections, but the problem is that eventually, what comes down to humans at the end of the day, after refining, after fine tuning, is still a lot.

So, I think that for the question of how we can automate this, for me, the way that automation can actually work, it has to simulate somehow the human decision-making process. Otherwise we’ll be in the same kind of race to chase after ghosts and after logs.

[Steve Zalewski] I’m going to dovetail on this, right? Which was, we use the word “logs.” We’re not interested in logs anymore. Logs tell us what happened. What we’re obligated to do now is to have the context to identify what’s happening, and that’s not a log, right? That’s an action. And so a lot of this conversation I see is we need to automate finding the right context to action when something’s happening, not just alert on something that’s happened.

So, there’s this philosophical change that’s going on as well to what’s important to the business, and how do I report it in a business term? So, has a business user been compromised? Has business data been compromised? So that I can demonstrate to the business I’m taking an action that actually demonstrates the value of the security organization.
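
As a rough illustration of the approach Bil Harmer and Brian A. describe in this segment, here is a minimal Python sketch of treating alerts as data points that only become an actionable event when their combined weight crosses a threshold inside a time window. It is not code from the episode or from any vendor. The `DataPoint` fields, the `actionable_events` function, the 30-minute window, and the 1.0 threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataPoint:
    """One low-level alert treated as an observable, not a page to a human."""
    asset: str         # hypothetical field: host or user the alert refers to
    source: str        # hypothetical field: tool that produced it (EDR, email, ...)
    score: float       # hypothetical risk weight assigned by the detection
    seen_at: datetime  # when the tool observed it


def actionable_events(points, window=timedelta(minutes=30), threshold=1.0):
    """Return {asset: data points} for assets whose combined score within
    a sliding time window reaches the threshold."""
    by_asset = defaultdict(list)
    for p in points:
        by_asset[p.asset].append(p)

    flagged = {}
    for asset, items in by_asset.items():
        items.sort(key=lambda p: p.seen_at)
        start = 0
        total = 0.0
        for end, p in enumerate(items):
            total += p.score
            # Drop the oldest points until the span fits inside the window.
            while p.seen_at - items[start].seen_at > window:
                total -= items[start].score
                start += 1
            if total >= threshold:
                flagged[asset] = items[start:end + 1]
                break
    return flagged
```

In this sketch, three weak signals on one host within ten minutes would be escalated together, while two weak signals hours apart on another host would stay in the data set as searchable observables. Real scoring would also need the business context discussed above, which this example does not model.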

Sponsor – Intezer

11:18.087

[David Spark] Before I go any further, I do want to tell you about Intezer because if you’re interested in this topic, you’re going to be interested in what I’m about to say. So, alert triage and investigations are time-consuming for security teams. That’s what we’ve been talking about. It’s not like I’m telling you anything new.

But it does not have to be that way. That’s why you want to listen. Smart security teams are using AI to automatically investigate alerts from their security tools 24/7 with an average triage time of just two minutes.

So, how does this work? So, Intezer is an innovative platform that integrates with your security tools to monitor alerts, collect evidence and investigate every artifact. When Intezer uncovers evidence of a serious threat, it escalates the findings to the SOC analysts. Intezer also reduces noise, correlates alerts, and automatically resolves over 90% of false positives.

This means even low severity and informational alerts get investigated. You aren’t wasting time on false alarms, and you have actionable recommendations for serious threats. Intezer extends your team by emulating an experienced SOC analyst with an AI framework backed by years of industry research. It’s designed to be cost effective and an easy-to-set-up platform that provides detailed, transparent results that SOC analysts can trust.

No playbooks, no chatbots, no engineering to set up. With Intezer supporting your SOC analysts, your team can eliminate alert fatigue, uncover hidden threats, and stay focused on what matters most. AI won’t replace your SOC team, but it can be a game changer. Go see how it can happen. Go to their site, intezer.com.

What’s the most critical issue?

13:15.824

[David Spark] Andrew Wilder, CSO over at Vetcor, said, “The problem with alert fatigue is all about fear.” By the way, he nails this one. “Fear that something bad will happen and you’ll discover that you had an alert for it, but you didn’t pay enough attention to it. There are numerous examples of 20/20 hindsight related to this dilemma.

So, the key thing here is, whether generative AI is involved or not, is trust. Trust that the tuning or the prioritization will tell you simply which alerts do I need to drop everything and address, and which can I ignore completely, and everything else in between.”

Jason Keirstead of Simbian said, “Alerts come from detections. So, you cannot discuss how to triage alerts without looking at the source of the problem in the first place. Detections can be either specific or generic. The more specific a detection is, the less likely it is to create false positives that bog down the SOC.

However, the more generic, the more likely it is to detect novel threats. So, this is an eternal trade-off, which is never going to go away. The problem is a program management issue. Every time an alert is closed as a false positive, an action, either human or AI based, should be being taken ‘such that specific detection never creates a false positive for that reason again.’”

All right, that’s what I want you to lean on right there, Itai, because we talked about the need to constantly tweak your system. I know you have an AI solution. Is it addressing that very problem right there?

[Itai Tevet] Yes, generally speaking, I agree that security operations has to be an iterative process. Constantly fine-tune detections based on new learnings. It’s very similar to software development. You always fix bugs, and when you develop a new feature, you need to see how it performs in production.

And at Intezer, we give this quite a lot of attention. Actually, we just released a new feature, I think last month, that highlights the lessons that we learned from your environment in our dashboard. So, it’s not just about automating alert triage, but also giving you insights out of it so that next week you’ll have maybe less alerts to deal with.

And maybe just one point about the first quote, the issue is trust, right? He said it’s all about the fear. There is a saying that there’s a difference between your fear and phobia. Phobia is irrational, right? So, at the end of the day, it’s a rational fear that they’re going to miss something from the detections that they put in place, and that’s really the challenge.

[David Spark] All right, Steve, this is a thorny thing. And I like Jason’s comment about the fact that the more specific it is, the fewer false positives, but the more generic it is, the more novel threats you’re going to detect. And Jason is saying, “This is never going to go away.” And the fear is, like what Andrew points out, that there’ll be a 20/20 hindsight moment of like, “Oh, they saw this alert. They should have known.”

[Steve Zalewski] Yeah. So, the problem with fear and the fear of the unknown is you’re always looking over your shoulder as to what happened and how you’re accountable, okay? And what Jason is saying is, “Never mind that. Embrace the fear,” okay? You can’t solve it. So, let’s focus on what we can do, right?

And let’s position that we’re not here to secure the entire company. We’re here to protect the business against certain types of threats, and let’s embrace and celebrate the fact that we’re doing that, as opposed to just living in fear for all the things that we don’t know. And I’d say that shift is great.

But the other problem we have with that is time is not our friend, it’s our enemy. Because even when we get the right alerts, we only have a certain amount of time before the next one comes in, okay? So, it’s kind of like the emergency room doctors, which is some cases are easy, some are hard, but there’s always the next one coming, and so you never feel like you have all the time you need to fully address the challenge before the next one comes in.

So, I think both of them are right. But what we’re walking away from here is focus on the right problems and embrace the fear for what it is, which is there just isn’t much we can do about it. Let’s focus on what we can do.

[David Spark] Getting to the what we can focus on doing, Itai, we want you to close this segment up, specifically focusing on what Intezer is doing. Can you boil it down? Like this is the aspect of the problem of alert fatigue that we have been able to manage better using AI. Walk us through that.

[Itai Tevet] Of course. So, fundamentally, our company is using a combination of both AI and other technologies because AI is not a silver bullet for everything, but a combination of deterministic and AI technologies in order to basically simulate the decision-making process and alert investigation process that human analysts go through.

[David Spark] I get the sense there’s a stratum that AI can handle and a stratum it can’t handle, and you’re essentially pulling that out to reduce the fatigue.

[Itai Tevet] Exactly. More than 90%, if not 95% of the workload that SOC teams need to handle are eventually those mundane tasks, repetitive tasks that they need to do, and that’s probably, again, the vast majority of their job. The cases where they actually need to use their top-notch intelligence to really solve a forensic case is fairly rare compared to the day-to-day, which is mostly chasing after ghosts and false positives.

So, that is exactly what can be automated and should be automated.
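
The feedback loop Jason Keirstead describes in this segment, where every false-positive closure results in an action so the same detection does not fire for the same reason again, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Detection` class, its `exclusions` set, and the alert fields used below are hypothetical and do not represent Intezer’s product or any real tool’s API.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """A detection rule plus the exclusions learned from closed false positives."""
    name: str
    exclusions: set = field(default_factory=set)  # hypothetical (field, value) pairs

    def fires(self, alert: dict) -> bool:
        # Suppress the alert if any of its attributes matches a learned exclusion.
        return not any((k, v) in self.exclusions for k, v in alert.items())

    def close_false_positive(self, reason_field: str, reason_value: str) -> None:
        # Record why the alert was benign so the same reason is not raised again.
        self.exclusions.add((reason_field, reason_value))
```

A short usage example: an analyst closes an alert as benign because the binary was signed by an internal build system, and later alerts with that same signer no longer reach a human.

```python
rule = Detection("unsigned-or-unknown-binary")
alert = {"host": "build-01", "signer": "internal-ci"}
print(rule.fires(alert))                        # True before the closure
rule.close_false_positive("signer", "internal-ci")
print(rule.fires(alert))                        # False after the closure
```

Each exclusion makes the detection more specific, so it sits on the same specific-versus-generic trade-off raised in the quote: an exclusion that is too broad can also hide a real threat.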

What’s the optimal approach?

19:27.050

[David Spark] Mihir Mohanty of Stellar Cyber said, “We are fatigued with alerts because the solution to an alerting problem requires a system approach whereas we try to solve it as a point problem. The detection mechanism that generated the alert did so because some kind of rule match but couldn’t see beyond its slice of view to the context…” is kind of what we talked about earlier, “…both organization and threat side.

Where did it come from, the threat? Is it relevant? What’s the impact, etc.? It’s like solving a multinominal system with half of the variables. Another part of the challenge is the feedback loop that could throttle the volume down,” and we talked about this as well, “We don’t have any because we don’t have a system.” So, essentially, Mihir’s comment is really kind of summing up everything we discussed here, Steve, saying, “Look, these are all a series of problems, but if you don’t have the whole system on board, you’re going to still have this series of problems.” Steve?

[Steve Zalewski] So, I equate this to the difference between playing chess and playing checkers.

[David Spark] That’s a good point.

[Steve Zalewski] Where we are right now is we’re pretty good at playing checkers, right? For the run books that we have, we know the moves, we find a few interesting things, and we miss a few interesting things, and we find a lot of uninteresting things. But it’s discrete. We kind of know the game. And what we’re seeing here with the systemic exercise is we have to get better at playing chess, and that’s where AI is coming in.

Where we’re having to be able to look at what we have, know what additional things we need, so that we’re playing the long game, right? To be able to ultimately win. But at any point in time, the chessboard is kind of a stalemate.

[David Spark] All right, Itai. Your take on this whole system view of this problem, not the isolation of alerts and SOC engineers. Because that is, I think, the narrow view most of us see.

[Itai Tevet] Yeah, I agree. I have a few thoughts on that, and I actually want to challenge that a bit even. So, I agree that a lot of alerts, if you look at it as a point problem, you’re not going to get to the right conclusion. Because you have to have additional context and a wider view about the impact and what did you see at other places and so on.

However, I have to say that from actually viewing many different organizations’ security operations and their alerts, you can see that about half, at least, if not more, are a point problem.

I’ll give you a simple example. Let’s say a file was flagged as ransomware by your EDR solution, and it’s not. There’s very little additional context or correlation that needs to be done from a wider point of view in order to get to that conclusion. So, the way that I’m seeing how we can actually solve alert fatigue, we must find a way to cover both.

Cover the point problems that are being alerted to us, as well as the way to correlate, see things in a wider context and so on and so forth. I think that you have to have both in order to really make progress from an automation perspective.

Closing

22:50.536

[David Spark] Well, that brings us to the end of the show and the portion of the show where I ask both Itai and Steve to tell me which quote was their favorite and why. And I will begin with you, Itai. There were a lot of good quotes here, brought up a lot of challenging discussions. I’ll ask you which quote was your favorite and why.

And you already told me, and it’s the second gentleman, Subbarayudu Darisipudi of Optiv.

[Itai Tevet] Yes, I think his quote catches the essence of the problem here. You can’t expect to have thousands of alerts while your team capacity is 5% of that. You have to either retune the systems or automate the investigation or a combination of both.

[David Spark] Yeah, I mean, I liked his thing. It really pointed out to everybody, it’s like, yeah, you can keep throwing alerts and 90% of them are never going to be dealt with. [Laughter] Yeah, knock yourself out. Then when we get hit and they say, “Well, they knew about it.” Yeah, it was probably part of the 90% we could never deal with.

Steve?

[Steve Zalewski] There were so many good ones for this episode. You were right, you said in the beginning, the quotes and the different facets of the problem. So, like Nathan Larson was good and Bil Harmer was really good. But at the end of the day, I’m going to go with Andrew Wilder. The problem with alert fatigue is all about fear, and I just think the alerts and everything else, the underlying concern is we got to get over the fear that we’re going to miss something, and we really have to look at what it is that we’re trying to find.

And if we can embrace that fear and just get better at being emergency room doctors, I think a lot of the questions, a lot of the challenges, start to simplify.

[David Spark] And by the way, what I love about his quote too, and let me double down on this is, we’ve heard the, “Oh, we’re going to gather it all because we don’t want to miss anything.” Like, we’ve heard that line umpteen times. But he really points out the real fear, which is when it does happen and it does get pointed out, who is to blame?

No one goes, “Well, they knew about this alert. Obviously, it was the system process that is to blame because they’re collecting too many alerts.” Nobody says it. They said, “No, they knew about it, and they should have dealt with it.” No one says that.

[Steve Zalewski] When you’re an emergency room doc and the patient comes in and you’ve got to be able to decide in four minutes, do I give him an X-ray? What do I need to be able to determine the course of action because he stopped breathing? Okay? That’s the point. Oftentimes the patient stopped breathing for us, and so we’re trying to figure out what to do.

And afterwards saying, “Well, wait a minute. If you had done the blood test, or the blood test said something, but that wasn’t what you could focus on.” It’s just the realization that that’s the environment and the fear that we are all working to address. It is not the normal course of business for most people to have to do something like this.

[David Spark] That brings us to the end of the show. I want to thank your company, Itai – Intezer, extend your security team with AI. Remember, go to their website, intezer.com. If you listened to this episode and you’re concerned about this topic – who isn’t? – then do yourself a favor. Go check out what they’re doing at Intezer.

All right, thank you again, Itai, and I’m going to let you have the very final word here. Do you have like a free offer? Do you have a demo? Are you hiring too? Let us know. Please tell our audience.

[Itai Tevet] Yes, yes, and yes. Everybody can actually just sign up for our product without talking to any salesperson. But if you are a fan of salespeople, you can definitely watch a demo with our team. So, looking forward to speaking with everybody.

[David Spark] By the way, they’re very nice salespeople over at Intezer. Thank you so much, Itai, and thank you so much for sponsoring the CISO Series. Thank you, Steve, and thank you, our audience, as well. We greatly appreciate your contributions and listening to Defense in Depth.

[Voiceover] We’ve reached the end of Defense in Depth. Make sure to subscribe so you don’t miss yet another hot topic in cybersecurity. This show thrives on your contributions. Please write a review, leave a comment on LinkedIn or on our site CISOseries.com where you’ll also see plenty of ways to participate, including recording a question or a comment for the show.

If you’re interested in sponsoring the podcast, contact David Spark directly at David@CISOseries.com. Thank you for listening to Defense in Depth.

David Spark
David Spark is the founder of CISO Series where he produces and co-hosts many of the shows. Spark is a veteran tech journalist having appeared in dozens of media outlets for almost three decades.