The Future of Modern Threat Hunting is Streaming Graph
Towards a new model of threat hunting
The continuous expansion of threat vectors and attack techniques requires a modern threat hunting architecture capable of large-scale operations, real-time complex event processing to identify Indicators of Behavior (IoBs), and programmable automation to make the best use of scarce SOC expertise. Central to the evolution from after-the-fact Indicators of Compromise (IoCs) to IoBs is the need to embrace an event-driven architecture.
Many industry initiatives aim to codify the intersection points between data sources, analysis systems, and remediation solutions. These efforts are centered around two characteristics that align with thatDot software in significant ways.
A focus on behavior analysis – The evolution from Indicators of Compromise (IoCs) to Indicators of Behavior (IoBs) has been driven by the desire to move from seeking static artifacts of a completed attack (a file hash or an IP address) to understanding how an attack happens. This change in perspective creates the opportunity to find attacks earlier, and with more flexibility.
Use of graph data modeling – Representing behavior and relationships is a natural fit for graph data modeling techniques. Graph data structures excel at expressing the relationships between entities, which simplifies analysis and infrastructure, so much so that STIX Indicators and the Kestrel threat hunting language assume the use of graph systems for their operation.
New Standards Reduce Friction
The cybersecurity industry is active on many fronts, defining standards to reduce the friction between data sources, analysis engines, SIEMs, and automated response systems. These standards include:
STIX™ Indicators – Indicators convey specific observable patterns combined with contextual information intended to represent artifacts and/or behaviors of interest within a cyber security context. [Read more here.]
Kestrel – Kestrel threat hunting language provides an abstraction for threat hunters to focus on the high-value and composable threat hypothesis development instead of specific realization of hypothesis testing with heterogeneous data sources, threat intelligence, and public or proprietary analytics. [Read more here.]
CACAO – defines the schema and taxonomy for collaborative automated course of action operations (CACAO) security playbooks and how these playbooks can be created, documented, and shared in a structured and standardized way across organizational boundaries and technological solutions. [Read more here.]
These standards fit well with thatDot’s approach to a modern threat hunting stack: Quine streaming graph detects and instantly alerts on known patterns, and thatDot Novelty Detector identifies newly emerging threat behaviors, both in real time.
Highly Scalable IoB Pattern Recognition
The evolution from a reactive IoC threat hunting model to a real-time IoB-based approach requires a new set of technical capabilities along with the tools to deliver them. Fortunately, IoB threat hunting, new standards, and streaming graph technology are all emerging to meet the need.
As shown below, thatDot’s open source Quine streaming graph aligns with the requirement to ingest multiple data streams, natively process IoBs encoded in a graph data model, and generate events that invoke predefined remediation actions. The workflow looks as follows:
- Event sources are ingested from any common event stream queue, including Apache Kafka, AWS Kinesis, AWS SQS, or Apache Pulsar/DataStax Astra Streaming.
- STIX-defined IoBs are loaded into Quine using Kestrel graph objects via API, or entered manually, as Quine standing queries.
- Quine standing queries continuously analyze newly arriving events for matches against IoB pattern definitions. Partial matches are identified and stored for any desired period of time to accommodate threat behaviors that occur incrementally over longer time frames.
- Upon a full IoB pattern match, Quine generates a new event that is associated with a pre-defined CACAO Playbook action, for use by SOAR or analysts.
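To make the four steps concrete, here is a minimal Python sketch of the same flow. It is an illustration only, not Quine's API: the `Event` and `Alert` types, the `merge_sources` and `detect` functions, the feed labels, and the playbook identifier are all hypothetical, and a simple password-spraying rule stands in for a STIX-defined IoB.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Set


@dataclass(frozen=True)
class Event:
    ts: int       # event time in seconds
    source: str   # which feed produced the event (hypothetical labels)
    src_ip: str
    user: str
    outcome: str  # "fail" or "success"


@dataclass(frozen=True)
class Alert:
    pattern: str
    playbook_id: str  # hypothetical reference to a CACAO playbook
    src_ip: str


# Hypothetical mapping from a pattern name to the playbook it should trigger.
PLAYBOOK_FOR = {"password-spray": "playbook--block-source-ip"}


def merge_sources(*streams: Iterable[Event]) -> Iterator[Event]:
    """Step 1: interleave several time-ordered feeds into one stream."""
    return heapq.merge(*streams, key=lambda e: e.ts)


def detect(events: Iterable[Event], threshold: int = 3) -> Iterator[Alert]:
    """Steps 2-4: evaluate each event as it arrives, keep per-IP state,
    and emit one alert when an IP has failed logins for `threshold` users."""
    failed_users: Dict[str, Set[str]] = defaultdict(set)
    alerted: Set[str] = set()
    for e in events:
        if e.outcome != "fail":
            continue
        failed_users[e.src_ip].add(e.user)
        if len(failed_users[e.src_ip]) >= threshold and e.src_ip not in alerted:
            alerted.add(e.src_ip)
            yield Alert("password-spray", PLAYBOOK_FOR["password-spray"], e.src_ip)


vpn = [Event(1, "vpn", "10.0.0.9", "alice", "fail"),
       Event(4, "vpn", "10.0.0.9", "carol", "fail")]
sso = [Event(2, "sso", "10.0.0.9", "bob", "fail"),
       Event(3, "sso", "10.0.0.7", "bob", "success")]

alerts = list(detect(merge_sources(vpn, sso)))
for alert in alerts:
    print(alert)
```

No single feed contains enough failures to trip the rule; the alert only appears once the two feeds are evaluated as one stream, and it carries the playbook reference a SOAR tool or analyst would act on.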
The Problems Quine Solves
Quine solves some hard problems in this role. Let’s take a look at a few of the major points:
Multiple Event Sources
Modern threat detection requires data – lots of data – usually from multiple sources. This brings with it a number of interesting data engineering challenges, especially when we want to materialize that data into a single view and execute analysis in a timely and cost-effective manner.
Combining threat intelligence, EDR, XDR, and cloud logs is an increasingly common requirement for building baseline behavior models against which real-time data is assessed for known and new threats. thatDot’s Quine streaming graph is a new and powerful software tool for resolving many of the data engineering challenges associated with handling large volumes of data from multiple sources.
Scale For Costs – Scale graph event processing from 1,000s to 1,000,000s of events per second on commodity cloud VMs, more efficiently than nested joins.
Out-of-Order Data Arrival – Quine standing queries evaluate each event as it arrives and store partial results until the data that completes them arrives.
Entity Resolution – Graph data models are known for leveraging the additional context gained by understanding the relationships between data points in events.
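One way to picture entity resolution in a graph is to derive each node's ID deterministically from the values that identify the entity, so that events from different feeds land on the same node. Quine's Cypher dialect offers an `idFrom` function built around the same idea; the sketch below is a plain Python stand-in, and the `node_id` and `link` helpers, the field names, and the normalization rule are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


def node_id(kind: str, *key_parts: str) -> str:
    """Derive a stable node ID from an entity's identifying values."""
    normalized = "|".join([kind, *(p.strip().lower() for p in key_parts)])
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


# Adjacency map: node ID -> set of (relationship, target node ID) edges.
graph: DefaultDict[str, Set[Tuple[str, str]]] = defaultdict(set)


def link(src: str, rel: str, dst: str) -> None:
    graph[src].add((rel, dst))


# Two feeds describe the same host with different field names and casing.
edr_event = {"host": "WS-042", "process": "powershell.exe"}
cloud_event = {"hostname": " ws-042 ", "api_call": "GetSecretValue"}

host_a = node_id("host", edr_event["host"])
host_b = node_id("host", cloud_event["hostname"])

link(host_a, "RAN", node_id("process", edr_event["process"]))
link(host_b, "CALLED", node_id("api", cloud_event["api_call"]))

print(host_a == host_b, len(graph[host_a]))
```

Because both events resolve to one host node, the process execution and the API call become neighbors in the graph, which is the added context a pattern match can then use.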
Finding Threat Behaviors
IoBs are patterns of behavior expressed as actions taken by users or systems. Identifying the end-to-end pattern of an IoB across events generated by disparate systems aligns naturally with the Quine graph data model.
Quine evaluates every newly arriving event for a partial or full match against defined IoB patterns. This incremental approach to evaluating data is paired with a highly efficient mechanism for persisting partial matches. The result is a threat detection solution that tracks millions or billions of suspect actions until there is a complete pattern match, at which point an event is generated to serve as an alert or to trigger an automated workflow.
Incremental Evaluation Of Events For IoB Patterns Across Event Sources
Image source: Quine Streaming Graph White Paper (PDF)
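The incremental evaluation described above can be sketched in a few lines of Python. This is a conceptual model, not how Quine implements standing queries: the `PartialMatchStore` class, the step names, and the host labels are hypothetical, and the pattern is simplified to "all required steps seen for the same entity, in any order".

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass
class PartialMatchStore:
    required: FrozenSet[str]
    partial: Dict[str, Set[str]] = field(default_factory=dict)

    def observe(self, entity: str, step: str) -> Optional[str]:
        """Fold one event in; return the entity when its pattern completes."""
        if step not in self.required:
            return None  # irrelevant to this pattern, nothing to store
        seen = self.partial.setdefault(entity, set())
        seen.add(step)
        if seen == self.required:
            del self.partial[entity]  # match complete, free the stored state
            return entity
        return None


store = PartialMatchStore(frozenset({"phish_click", "macro_exec", "c2_beacon"}))

# Events arrive out of order and interleaved across hosts.
arrivals = [
    ("host-1", "c2_beacon"),
    ("host-2", "phish_click"),
    ("host-1", "phish_click"),
    ("host-1", "dns_lookup"),
    ("host-1", "macro_exec"),
]

matches: List[str] = []
for entity, step in arrivals:
    completed = store.observe(entity, step)
    if completed is not None:
        matches.append(completed)

print(matches, store.partial)
```

Each event is examined exactly once, the late-arriving `macro_exec` completes the match for `host-1` even though the beacon was seen first, and `host-2` stays on record as a partial match awaiting further events.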
Automated Responses
CACAO provides a graph-based data model. As such, CACAO implementations should implement protections against graph queries that can potentially consume a significant amount of resources and prevent the implementation from functioning in a normal way.
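As a rough illustration of that kind of protection, the Python sketch below walks a playbook's workflow graph with a hard step budget and a visited set, so a cyclic or very large workflow cannot run away. The workflow here is a simplified adjacency map rather than the CACAO schema, and `reachable_steps`, `TraversalBudgetExceeded`, and the step names are hypothetical.

```python
from collections import deque
from typing import Dict, List


class TraversalBudgetExceeded(Exception):
    """Raised when a workflow walk would exceed its step budget."""


def reachable_steps(workflow: Dict[str, List[str]], start: str,
                    max_steps: int = 1000) -> List[str]:
    """Breadth-first walk that visits each step once and enforces a budget."""
    visited = {start}
    order: List[str] = []
    queue = deque([start])
    while queue:
        if len(order) >= max_steps:
            raise TraversalBudgetExceeded(f"more than {max_steps} steps")
        step = queue.popleft()
        order.append(step)
        for nxt in workflow.get(step, []):
            if nxt not in visited:  # guards against cycles
                visited.add(nxt)
                queue.append(nxt)
    return order


# Simplified workflow with a cycle back to the start step.
workflow = {
    "start": ["isolate-host"],
    "isolate-host": ["notify-soc", "start"],
    "notify-soc": ["end"],
    "end": [],
}

steps = reachable_steps(workflow, "start")
print(steps)
```

The visited set handles the loop back to `start`, and lowering `max_steps` turns an unexpectedly large playbook into an explicit error instead of unbounded work.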
Identifying Novel New Behaviors
Of course, the most difficult part of threat hunting is identifying new threat vectors as close as possible to the time they first appear. This is especially difficult since attackers intentionally work to obscure their illicit behavior in large volumes of events. Systematic approaches built on traditional anomaly detection have largely failed to detect sophisticated attacks without also generating a significant number of false positives, forcing reliance on manual evaluation based on intuition and increasingly scarce security expertise.
thatDot Novelty Detector brings a fresh approach to the problem of detecting illicit behavior. Novelty Detector is a new graph AI technique built on the Quine streaming graph. As such, Novelty Detector natively uses categorical data in events, such as IP addresses, file names, file paths, and API call types, to fully understand the context of user and system actions. This rich context is used to evaluate behaviors via information theory analysis to identify novel behaviors in real time, with a very low incidence of false positives.
Once a novel behavior is identified, it can be encoded as a new IoB and fed into a running Quine streaming graph system for immediate use on newly arriving data, or applied to historical data if desired.
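For intuition about scoring categorical data with information theory, the Python sketch below computes surprisal, -log2(p), for each observation from smoothed frequencies seen so far. This is a generic textbook technique and not Novelty Detector's algorithm, which the post does not describe; the `SurprisalScorer` class and the sample observations are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable


class SurprisalScorer:
    """Scores categorical observations by surprisal, -log2(p), using
    add-one smoothed frequencies learned so far."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.total = 0

    def score(self, observation: Hashable) -> float:
        # One extra bucket reserves probability mass for unseen values.
        vocab = len(self.counts) + 1
        p = (self.counts[observation] + 1) / (self.total + vocab)
        return -math.log2(p)

    def learn(self, observation: Hashable) -> None:
        self.counts[observation] += 1
        self.total += 1


scorer = SurprisalScorer()
common = ("svc-backup", "read", "/data/archive")
rare = ("alice", "read", "/home/alice")
for _ in range(98):
    scorer.learn(common)
for _ in range(2):
    scorer.learn(rare)

unseen = ("svc-backup", "write", "/etc/shadow")
for obs in (common, rare, unseen):
    print(obs, round(scorer.score(obs), 2))
```

Frequent behavior scores close to zero bits while a never-before-seen combination of user, action, and path scores highest, which is the basic sense in which categorical context can separate novel behavior from routine noise.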
Separately, Quine streaming graph and Novelty Detector software offer unique capabilities for organizations and service providers: real-time processing of categorical data to find known IoB patterns (Quine) and emerging new threat patterns (Novelty Detector).
When combined as a single platform that uses industry standards for IoB definitions and intersystem communications, the result is a comprehensive modern threat hunting and remediation stack.
thatDot Streaming Graph Delivers Scalable Threat Hunting
Quine is available in both open source and enterprise (thatDot Streaming Graph) editions. Novelty Detector is available either in the AWS Marketplace or under license as part of thatDot Streaming Graph.
Streaming Graph offers large organizations and managed security service providers (MSSPs) both the clustered, resilient version of Quine and Novelty Detector. It is meant for production applications where resilience, query performance, and throughput matter. Resilient clustering includes support for hot spares and distribution across multiple availability zones.
We recently shared reproducible tests demonstrating both scale (thatDot Streaming Graph easily processed one million 4-node graph events/second) and resilience in the face of node failure. You can read about the tests here.
Try It Yourself
If you want to try it on your own, here are some resources to help:
- Download Quine – JAR file | Docker Image | GitHub
- Check out the Ingest Data into Quine blog series, covering everything from Kafka ingest to CSV data
- Password Spraying Attack Detection – this recipe provides an example of detecting brute force attack patterns in authentication logs
Header image adapted from photo by Lianhao Qu on Unsplash.