How Can AI Provide Useful Guidance from Fragmented Security Data?

July 31, 2025

How badly fragmented is our security data? We’ve got a lot of it, but connecting it and getting one source to make sense of another, especially as we’re using AI, is quite daunting. But if it were all connected, we could gain greater insight into what’s happening in our environments.

Check out this post for the discussion that is the basis of our conversation on this week’s episode co-hosted by me, David Spark, the producer of CISO Series, and Steve Zalewski. Joining us is our sponsored guest, Matt Eberhart, CEO, Query.

Got feedback? Join the conversation on LinkedIn.

Huge thanks to our sponsor, Query

Query is a Federated Search and Analytics platform that builds a security data mesh, giving security teams real-time context from all connected sources. Analysts move faster and make better decisions with AI agents and copilots that handle the grunt work and guide each step. Learn more at query.ai

Full Transcript

Intro

0:00.000

[David Spark] How poorly fragmented is our security data? We’ve got a lot of it, but connecting it, having it understand from each other, especially as we’re using AI, is quite daunting. But if it were all connected, we could gain greater insights as to what’s happening in our environment.

[Voiceover] You’re listening to Defense in Depth.

[David Spark] Welcome to Defense in Depth. My name is David Spark. I am the producer and host of the CISO Series, and joining me as my co-host, it is none other than Steve Zalewski. Steve, say hello to the audience.

[Steve Zalewski] Hello, audience.

[David Spark] That is his brand. That is his Johnny Carson golf swing of this show, if you will.

[Steve Zalewski] [Laughter]

[David Spark] Our sponsor for today’s episode is Query AI. Security data is everywhere. Put yours to work, and in fact, they will connect all your security data so you can actually make sense of it and understand what’s going on. And in fact, this is what we’re going to be talking about on the show, and we actually have a guest from Query AI. But first, before we make that introduction, Steve, let’s set up today’s topic. As you pointed out on LinkedIn, security data lives everywhere. So, it’s in endpoints, identity systems, cloud platforms, threat intel feeds, business applications, vendor services. It’s fragmented, siloed, and in formats that do not play well together. Therefore, complete context is elusive, even when we’re not talking about all this AI data. So, the result is AI tools, their LLMs are only seeing part of the picture, which misses the mark and the value of these tools. So, I have to ask you, Steve, do you think we’re kind of shooting ourselves in the foot because we are not getting this complete context? Like we see this value of AI and all this wonderful data that’s coming in, but if it’s not managing it all, looking at it all, it’s kind of a giant missed opportunity, yes?

[Steve Zalewski] And that’s just it, right? Is what we’re appreciating is not just the trust in the data, but the breadth of the data and the velocity of the data because we need to see it to make better decisions. But if we don’t see the right amount of it, we’re also making wrong decisions. And I think that is where I was coming from when I posted that is, this is not about making your SOC better. This is actually about transitioning from how am I efficient in running a security organization to how am I effective at protecting my company, and the location and the type and the velocity of data that now has a security moniker has just increased dramatically. And now I reached the “so what, now what” stage.

[David Spark] To help us with this discussion, and by the way, I was very impressed with the comments that came in. And so, you’re all going to hear it and hear the insight from Steve and our guest, who is the CEO of Query AI, our sponsored guest, none other than Matt Eberhart. Matt, thank you for joining us again on Defense in Depth.

[Matt Eberhart] Thank you, gentlemen, very much for having me. It’s my second time. I don’t have a normal entry, but I’ll work on it for next time.

[David Spark] Oh, yeah, yeah. You need your equivalent of the golf swing. Yeah, we’ll work on that for you.

[Matt Eberhart] Thank you for having me back.

What are the elements that make a great solution?

3:28.122

[David Spark] Daniel Gorecki of NGC Risk said, “It’s not educated guesses versus trusted facts, but an informed decision. While the data is everywhere, does it all add value to your decision-making process?” I mean, this is right in line with what Steve was saying up front. “I’d prefer to have less data that is accurate to work off of to make an informed decision. This is the case whether you have AI and LLMs in the picture or not. So, spending time understanding your threat model and risk appetites will allow you to focus on the data that matters to feed into your security AI initiatives. I may not need all those data points to make a high certainty decision. It’s about optimizing that decision path and making the best-informed decision based on the data you have at hand at the time.” And then also let me mention Ezra Ortiz who’s a consultant said, “If humans pick up the phone and ask a partner, are you seeing X? Do the different AIs communicate with each other before making an informed decision?” So, this is just like what you were saying, like are we all seeing the same thing? Would we all make the same decision here? What do you think, Steve?

[Steve Zalewski] The way Daniel characterized that I thought was really, really good because it comes back to the what are we trying to accomplish with the data now, as opposed to simply trying to be able to answer the question, do I see all the data? And as part of that, now we’re asking about what is good enough data? What type of decision do I need? But we realize to a certain extent, we need to go get the data when we need it at the speed that we need it for the context that we need it. And that when you think about the querying facets is true for a human, is true for AI, is true for pretty much any kind of decision-making function that from a security perspective, we’re trying to provide.

[David Spark] I throw this to you, Matt. This is so darn basic, but this really basic concept is a little elusive, isn’t it?

[Matt Eberhart] It is. In many ways, the security industry has been chasing the right data and the ability to understand that data for decades. And I think Daniel makes some really great points and Steve touched on this as well. When you’re trying to orient to a situation, you need the right data in order to make the best decision. And when you really get underneath the problems in cyber, and I think this extends to problems all around us, simply getting the right data is often the hardest part, both for people and AI systems. And working on problems like this, we’ve learned that teams need to ask different questions of data in order to really arrive at whatever the answer that they’re looking for. But that’s what they’re looking for, David, is an answer.

[Steve Zalewski] And I’m going to spin on this because of what Ezra said because he brings up the AI. Here’s the challenge. We were solving it for a human. A human has a certain velocity with which they can absorb information. They have a certain speed with which they can determine insights and a certain speed to action. And what we’re doing now with agentic workflows is putting huge multipliers on that, which means we need that data from those three axes even better.

How can we go faster?

7:06.337

[David Spark] Matt Muller, a field CISO over at Tines, said, “Are we applying AI to the right SecOps processes in the first place? Now, I can have an AI model close all the false positive alerts I want, but if I only have it close the false positives, then there’s a good chance I’m masking a flawed detection engineering process. In practice, I agree with you that it seems like many organizations are going through a decision-making process of model, data, and process, but I think they’d get better results if they flipped it around.” Then that would be process, data, model. Good point. And Evan Powell of Deep Tempo said, “Even step function improvements must serve existing workflows that formed in part to address the poor fidelity and lock-in based business models of prior cybersecurity systems.” So, the last part of Matt’s quote and Evan’s quote here is… Matt, this is a really good point of, you don’t all of a sudden revolutionize your entire security program just because AI showed up, right? Your processes, if they’re still broken, are going to be broken with AI.

[Matt Eberhart] Yeah. [Laughter] In fact, AI is kind of like automation in that regard. You might just be able to break a lot more stuff faster.

[David Spark] Exactly. And I think Evan’s comment of, you can’t all of a sudden do this massive overhaul. So, I mean, I’m assuming you do this with your customers as well. What are often good first steps?

[Matt Eberhart] Yeah, well, I think one of Matt’s points is the starting point, which is you really need to use AI to solve the right problems. And so, like you said, foundationally, where do you start with that? Well, what we hear from our customers and users is that security teams are looking for a desire to apply AI to the tedious high-frequency work. Like, we’ve all heard versions of the quote, like, “Have AI do the dishes for me,” not do the things that I like or love doing. And in security, the jobs to be done require a lot of detailed work, sifting through high volumes of alerts, looking at what may seem like the same events over and over again with just tiny variations, where those variations could mean nothing or frankly, they could mean everything. And so, that’s how we look at, first, how can we help make sure the right data is being used to give a data-driven answer to the question? And then second, how can you orient to that data as quickly as possible so that you can make the best decisions possible?

[David Spark] Steve, your take on the first step of organizing this data that you want to look… It doesn’t have to be physically in the one place, but it has to be able to come to one place so it can be searched together. So, what’s your sort of first step?

[Steve Zalewski] And it’s when you said, “Come to one place,” which means I don’t need to aggregate it in one place. I need to be able to access it at a policy information or policy decision point. Leave it where it is. Get what I need for the metadata decision-making and stop trying to move a lot of data and move the right context. Now, that said, in Steve’s simple mind, the real value here is, how do I remove the human drudgery from the efficiency of a complex workflow process? Because what we’ve done over 15 years, like if you look at the SOC as an example, is we’re building ever more complex workflows. That is a lot of drudgery of trying to find that data, aggregate it, right, and build the context. And so, the huge need we have is how do I remove a lot of that drudgery by being able to ask a more complex question, where the data that I need to answer it just is much more extensive. And I’m like, that’s stage one. That’s where everybody can go because you’re trying to do more for less around the human aspect, right? That’s where you start. If you can get that traction, then you can go talk about AI and whatever permutation. But to me, this context really is, heck, for most companies, we’re still at the human drudgery level of what this is trying to address. Never mind go ahead and try to go faster.

[Matt Eberhart] Yeah, I think that’s a great point. At Query, we didn’t set out initially to build agents or even use LLMs to help people understand data. We started out with the simple mission of security users have questions. The answers are always data-driven. That data is now spread all around the organization. And so, Query is a data gateway that brings your questions to your data and returns you an answer. And so, it’s a great point about the world is shifting that if we continue to try to rely on always centralizing data and then only using what was centralized to get our answers, we’re going to miss things and we’re going to miss a lot of things.
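To make the idea of bringing the question to the data concrete, here is a minimal sketch of a federated search fan-out. It is an illustration under stated assumptions, not Query’s implementation: the connector functions, source names, and records are hypothetical in-memory stand-ins for calls to each tool’s own API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for remote sources.
# Real connectors would call each tool's API instead of filtering a list.
EDR_EVENTS = [{"host": "laptop-7", "user": "alice", "event": "process_start"}]
IDP_EVENTS = [
    {"user": "alice", "event": "login_failed"},
    {"user": "bob", "event": "login_ok"},
]


def search_edr(user):
    return [e for e in EDR_EVENTS if e["user"] == user]


def search_idp(user):
    return [e for e in IDP_EVENTS if e["user"] == user]


# Hypothetical connector registry: source name -> search function.
CONNECTORS = {"edr": search_edr, "idp": search_idp}


def federated_search(user):
    """Send the same question to every source in parallel and merge the answers."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, user) for name, fn in CONNECTORS.items()}
        results = []
        for name, future in futures.items():
            for record in future.result():
                # Tag each record with the source it came from.
                results.append({"source": name, **record})
    return results


if __name__ == "__main__":
    for row in federated_search("alice"):
        print(row)
```

Nothing is copied into a central store here; each source answers from where its data already lives, and only the matching records travel back to the caller.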

Sponsor – Query

12:40.906

[David Spark] Who’s our sponsor this week? Well, our sponsor this week is, well, it’s Matt’s company, it’s Query AI. And let me explain some things if you’re not aware. So, first of all, for decades, we all know that security teams have been chasing context. That’s not new to anybody here. But trying to piece together the right data at the right time to make the right calls, that’s a big problem. I mean, that data lives everywhere in different tools, formats, and silos. And we’ve been talking about this. And more often than not, it’s analysts who are left trying to stitch it all together manually.

And this is exactly where Query AI comes in. Query AI is a federated search and analytics platform that connects directly to your distributed data. So, no ETL pipelines, no centralization required, just API connections to the tools and systems you already use. It creates a security data mesh, bringing the power of your existing stack together to deliver real-time context without the heavy lift. Even better, being more efficient with your data leads to lower SIEM costs. Look, mission-specific AI agents and copilots, they can handle the tedious parts, triaging alerts, pulling in contextual data, enriching results, and surfacing next steps. So, teams make better decisions faster. Look, security data is everywhere. Put yours to work and you can learn more if you go to query.ai, and when you go there, let them know that the CISO Series sent you.

What’s the optimal approach?

14:18.182

[David Spark] Kunal Pachauri of Amazon said, “AI in security is only as smart as the data and relationships it can see. That’s where I think graph-based security models can shift the paradigm. So, graphs excel at connecting siloed data across identity systems, endpoints, cloud assets, threat intel, and more, and presenting that as a single queryable security context. Now, instead of forcing AI to ‘guess in isolation,’ we give it a relationship-aware foundation that represents real-world attack paths and trust boundaries. Without that connective tissue, even the best LLMs are just blind copilots.” Valentina Brysina of Digital Pipl said, “Without unified high-quality context, even the best LLM is just guessing,” just like what Kunal said. But Valentina goes on to say, “I’d argue the next real innovation isn’t another copilot, it’s solving interoperability across the stack.” And lastly, Matt Svensson of GetReal said, “Centralized ingestion of data won’t scale to meet this change. Data needs to be searched where it natively lives.” All right, Matt, all three of these are kind of speaking your language here, aren’t they? I just want everyone to know we did not pay anybody anything. Correct me if I’m wrong, this is in line with Query’s theory or business model or what you’re offering is as sort of straight as possible. Am I right here?

[Matt Eberhart] A hundred percent. I mean, this is the problem that we’re solving and the problem that we’ve founded the company to solve.
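Kunal’s point about a relationship-aware foundation can be illustrated with a small graph search. This is only a sketch: the entities and edges are hypothetical, and a plain breadth-first search stands in for whatever a real graph-based security model would use to surface attack paths.

```python
from collections import deque

# Hypothetical relationships stitched together from identity, endpoint and cloud data.
# An edge means "can reach" or "has access to".
EDGES = {
    "user:alice": ["host:laptop-7", "role:dev"],
    "host:laptop-7": ["cred:svc-deploy"],
    "role:dev": ["bucket:build-artifacts"],
    "cred:svc-deploy": ["db:customer-records"],
}


def shortest_path(start, target):
    """Breadth-first search; returns the shortest chain of relationships or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(shortest_path("user:alice", "db:customer-records"))
```

Each hop in the returned path comes from a different silo in this toy data, which is the connective tissue the quote describes: no single source on its own shows that alice’s laptop leads to the customer database.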

[David Spark] So, people understand because Kunal, Valentina, and Matt, they all brought up the issue of, look, you can’t bring all the data in one place. Because by the way, many tools have tried to do this. Let’s just copy it and throw it in a big area. And by the way, there are solutions through data lakes that some have found quite valuable as well. But it’s also talking about the context as well, which we’ve been talking about through the show. So, explain how Query AI operates here.

[Matt Eberhart] Yeah, absolutely. The experience on top is very similar to using a SIEM or a data lake or a copilot or agent-driven SOC tool. What’s very different is how the data is sourced. So, instead of first centralizing all of your data and having to build pipelines and make decisions on what data gets ingested and all the things that come with that, think of Query as an API gateway. We reach into the data that’s located all across your environment and let you ask questions of that data and ultimately get a cohesive answer from all of the interconnected sources. And so, Matt Svensson, I mean, his quote definitely had me search the data where it lives, right? I mean, that’s so much more efficient. Like you have all this data in different places, searching it where it lives, you’re getting a real time answer, but you also don’t have to deal with the cost and challenges of centralizing data into one place or what’s starting to become today is now people are centralizing data into five or six different places, and then that just creates even more data silos.

So, the other piece of this that I think this hits on is you really do have to have high quality context. In order to have that, you have to have the ability to understand data that may be in lots of different formats. And we do that by normalizing to a model called the Open Cybersecurity Schema that lets you basically have a schema that can be used to understand data that’s going to be in different formats. Different vendors use different formats for their data. And so, by presenting a normalized data set to a normalized answer to your question, that’s really what’s required for LLMs to perform. If LLMs have to make a bunch of decisions on how to switch between different data formats and all of that, they tend to either hallucinate or just produce a lot of errors.
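A small sketch may help show what normalizing to one schema means in practice. The vendor names, raw field names, and common field names below are hypothetical placeholders chosen for illustration; they are not actual attribute names from the Open Cybersecurity Schema Framework (OCSF) or from any vendor’s format.

```python
from datetime import datetime, timezone

# Hypothetical per-vendor field mappings: common field -> vendor field.
FIELD_MAPS = {
    "vendor_a": {"user": "userName", "src_ip": "clientIp", "time": "ts"},
    "vendor_b": {"user": "actor", "src_ip": "ip_address", "time": "timestamp"},
}


def to_utc_iso(value):
    """Accept epoch seconds or an ISO 8601 string and return an ISO 8601 UTC string."""
    if isinstance(value, (int, float)):
        dt = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        dt = datetime.fromisoformat(value)
        if dt.tzinfo is None:
            # Assumption for this sketch: naive timestamps are already in UTC.
            dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).isoformat()


def normalize(vendor, record):
    """Rename vendor fields to the common names and unify value formats."""
    mapping = FIELD_MAPS[vendor]
    out = {common: record[raw] for common, raw in mapping.items()}
    out["user"] = out["user"].lower()
    out["time"] = to_utc_iso(out["time"])
    out["source"] = vendor
    return out


if __name__ == "__main__":
    print(normalize("vendor_a", {"userName": "Alice", "clientIp": "10.0.0.5", "ts": 0}))
    print(normalize("vendor_b", {"actor": "ALICE", "ip_address": "10.0.0.5",
                                 "timestamp": "1970-01-01T00:00:00+00:00"}))
```

After normalization the two records agree on field names, user casing, and time format, so a person or an LLM comparing them no longer has to switch between vendor formats.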

[David Spark] One of the things that came up here is it’s one thing to ingest. Let’s ingest all these data sources. But it’s also another thing, Steve, to understand, okay, I got data source A, data source B, data source C. What the heck do they have all to do with each other? That’s key. Who’s making that decision, Steve?

[Steve Zalewski] [Laughter] The way you said that, I don’t think you realize what you said, okay, is that the way I see this is that we’re looking at an hourglass, and where this ability to characterize the context and ask the questions is actually the choke point in the hourglass now. Because you need this, not just to be able to go see all the data below you, for a human to ask an interesting question, but if you’re thinking about it now with AI, AI is actually the top part of the hourglass, and it needs that query engine to be able to support its ability to provide context and decision making. So, what we did was this security mesh went from being the lowest level of a decision tree to actually assuming a role in the middle where the decision tree forcing itself down for visibility and up for decision making, this just became table stakes. So, this is not a conversation of we need it. This is a conversation of we have to just have it because we have to move on to the next level of problem.

[Matt Eberhart] I think that’s right. And I think there’s a number of different conversations happening in the market right now about like which schema should be adopted. Listen, I mean, I think that’s interesting, but at the end of the day, if you’re building a product or if you’re building a program, you need to adopt a model. And really, the ability to use that model to then build your capabilities and systems around is I believe what’s really important. And I think that comes out in Valentina’s quote, right? That’s the only way you can get high quality context. It’s like you could pick any language to translate to, but you need to pick one.

No one said it would be easy.

20:46.054

[David Spark] Ashish Popli of Defendermate said, “Not only is it hard to gather all relevant contexts in one store for an AI model’s prompt context,” essentially what you were just talking about, Steve, “It’s a completely different and much harder task to align, normalize, deduplicate, and schematize it such that hallucinations and best effort guesses can be minimized. Context isn’t cheap, nor is inference, but results when those two are combined promise to be mind-blowing, liberating, and offer a chance to reduce the asymmetry between offense and defense.” And I think what Ashish is just saying here is, oh, it’s worth it to do this. Yes, Steve?

[Steve Zalewski] What he’s really saying is, if we don’t do this, how do I weaponize my defense to be able to work at the same speed that we’ve seen the bad guys weaponize this for offense? It’s not, no one said it would be easy. It’s, we have to stop making this hard because if we don’t, we’re losing as it is, and we’re just going to lose even more.
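Of the steps Ashish lists, deduplication is the easiest to show in a few lines. The sketch below assumes records have already been normalized to shared field names; those names and the sample sources are hypothetical.

```python
def dedupe(records, key_fields=("user", "src_ip", "action", "time")):
    """Keep one record per key and remember every source that reported it."""
    merged = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in merged:
            # First sighting: copy the record and start a list of reporting sources.
            entry = {k: v for k, v in rec.items() if k != "source"}
            entry["sources"] = [rec["source"]]
            merged[key] = entry
        elif rec["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


if __name__ == "__main__":
    sample = [
        {"user": "alice", "src_ip": "10.0.0.5", "action": "login_failed",
         "time": "2025-07-31T12:00:00+00:00", "source": "siem"},
        {"user": "alice", "src_ip": "10.0.0.5", "action": "login_failed",
         "time": "2025-07-31T12:00:00+00:00", "source": "edr"},
        {"user": "bob", "src_ip": "10.0.0.9", "action": "login_ok",
         "time": "2025-07-31T12:01:00+00:00", "source": "siem"},
    ]
    for row in dedupe(sample):
        print(row)
```

Exact-match keys only work once normalization has been done; if timestamps or usernames still differ in format across sources, the same event will slip through as two records, which is part of why the steps are hard to separate.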

[David Spark] That’s a good point. Let me go over these things where he says, align, normalize, deduplicate, and schematize it. What do you think’s the hardest part, Matt?

[Matt Eberhart] I mean, I think it’s all hard. If we back up a second, I think the term context, like if we could replay RSA conferences for the last 15 years, the term context has been used so many times.

[David Spark] Yes. Well, also, there’s the, “Say it, now go solve it.” There’s a big leap between the two [Laughter] of them.

[Matt Eberhart] Completely. And so, like to me, context is the ability to understand the data or what you’re looking at, and then do something useful with it. And the do something useful with it part is key. And so, when we think about context at Query, we think about it in a few ways that I think really align to his response. First, look, we normalize and enrich results at time of search, not after centralizing them. So, we can give you an answer from 50 different places that are normalized and enriched together so that when you view it, you can easily see and understand what you’re looking at. Second, like where the data comes from these days, like that’s increasingly challenging. Most teams are adding more and more tools and data sources every week, every month. You need more data now to make decisions. You need to understand identity and asset data deeply. And even if you just pull the thread on identity, of course, it’s not just human identities anymore. Now it’s machine and system identities. And when you have agents that are running around doing things on your behalf, that makes that problem even more difficult or creating context from that even more difficult. And so, by providing a lot of contextual insights about what’s happening in the situation, who is this user? What have they done? What assets do they have? Where else have they touched? What do we know about the attacker? The list goes on and on, right? When you put all that together in a way people can understand it, that’s really where I think context gets delivered.
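The questions in that answer (who is this user, what assets do they have) amount to joining an alert with identity and asset context at the moment of search. A minimal sketch, assuming hypothetical lookup tables in place of live queries to identity and asset systems:

```python
# Hypothetical lookup tables; in practice these would be live queries
# to identity and asset systems made at search time.
IDENTITIES = {
    "svc-deploy": {"kind": "machine", "owner": "platform-team"},
    "alice": {"kind": "human", "owner": "alice"},
}
ASSETS = {
    "laptop-7": {"criticality": "low"},
    "db-01": {"criticality": "high"},
}


def enrich(alert):
    """Return a copy of the alert with identity and asset context attached."""
    enriched = dict(alert)
    enriched["identity"] = IDENTITIES.get(alert["principal"], {"kind": "unknown"})
    enriched["asset"] = ASSETS.get(alert["host"], {"criticality": "unknown"})
    return enriched


if __name__ == "__main__":
    print(enrich({"principal": "svc-deploy", "host": "db-01", "action": "bulk_read"}))
```

The same bulk read looks very different once it is clear that a machine identity touched a high-criticality asset, which is the kind of context a bare alert does not carry.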

[David Spark] And I just want to stress, this is what you’re dealing with in your own product and when you’re working with customers. So, this is a good conversation to have with Matt. And we’re going to get to that in just a second.

Closing

24:23.299

[David Spark] But first, Matt, I’m going to ask you this question. A lot of good quotes. A lot of them speak to what you’re doing in Query AI. Tell me, which quote was your favorite and why?

[Matt Eberhart] I always struggle to pick favorites. There were so many great quotes, and the conversation has been great. So, if I had to pick just one, which you’re forcing me to do, I will pick Ashish’s because I think it ties a lot of it together. That it’s really about having the right data at the right time and the ability to understand it. He uses the term context quite a few times. And it’s the ability to not just have the context, but to present the right context to the AI model, to the LLM, to the agent at the right time so that it can do its job and can produce something that a person can actually understand and act on. All of that is fairly easy to say, very difficult to do, and really goes right at the core of what we’re doing at Query. And I thought that was a great quote.

[David Spark] All right, excellent. Steve, your favorite quote and why.

[Steve Zalewski] I’m going to go with Dan Gorecki. And the reason why is while we’re talking about context and we’re talking about data and velocity and insights and speed, what I don’t want people to do is to become overwrought in trying to solve what is in essence an almost impossible problem because data is constantly changing, evolving, creating. What I liked is it said, while the data is everywhere, does it all add value to your decision-making process? That’s where you want the context is what type of decision are you making and what context do you need? I’d prefer to have less data that is accurate to work off of to make an informed decision. So, my takeaway from this show is while the context and the need for the data is incredibly important now, let’s just make informed decisions, not right decisions, and get the right data where we need it, and we’ll continue to evolve in this journey of having the right data at the right time for an informed decision.

[David Spark] Thank you very much, Steve. Thank you very much, Matt. And huge thanks to Matt’s company, that’s Query AI. How do you find it? You just go to query.ai and then you’re there. Now, Matt, two questions – A, are you hiring over there? And B, any special offer for our audience?

[Matt Eberhart] Yes, we are hiring. We’ll have some more hiring coming. So, keep an eye on us. The best way to do that is to follow us on LinkedIn. It’s Query AI on LinkedIn. We post all kinds of great content out there. And so, please give us a follow there.

[David Spark] And was there any special offer for our audience?

[Matt Eberhart] Yes, so I do have a special offer, and we’ll see how well received it is. So, if you mention CISO Series, for better or worse, I will give you a demo myself.

[David Spark] You get the white glove treatment.

[Matt Eberhart] You do.

[David Spark] All other potential customers get thrown to the gutter. If you mention CISO Series, they walk you through the special entrance.

[Matt Eberhart] There you go.

[David Spark] Go through the kitchen, and you get front row seating.

[Matt Eberhart] The proverbial red carpet, yes. I’ll give you a demo and we do trials. So, it usually takes about 45 minutes to onboard yourself to Query and integrate five or six different connectors. And we will do a special trial offer for mentioning CISO Series.

[David Spark] So, go do it. Take advantage of it. I always say our sponsors do offer some really nice freebie type things. And there’s this fear of, ah, I don’t know why, I want to learn about it. I don’t want to get trapped. Look, something to learn about your environment, it’s always worth it.

[Matt Eberhart] Yeah, absolutely.

[David Spark] It’s always worth it to learn about your own environment, always.

[Matt Eberhart] And when it comes to enterprise security software, I mean, it’s a very simple implementation. We don’t centralize your data. We leave it where it is. It’s read-only API access. So, really, 45 minutes will have you up and using the product.

[David Spark] Thank you very much, Matt. Thank you very much, Steve. Thank you to Query AI and thank you to our audience. I always say it and I do mean it. And that is, we greatly appreciate your contributions and for listening to Defense in Depth.

[Voiceover] We’ve reached the end of Defense in Depth. Make sure to subscribe so you don’t miss yet another hot topic in cybersecurity. This show thrives on your contributions. Please write a review, leave a comment on LinkedIn or on our site, CISOseries.com, where you’ll also see plenty of ways to participate, including recording a question or a comment for the show. If you’re interested in sponsoring the podcast, contact David Spark directly at David at CISOseries.com. Thank you for listening to Defense in Depth.
