The Known Security Challenge of the Unknown

Jack Danahy

Lacking Categorical Data, Enterprises Are Vulnerable to Cyber Attacks

Destructive attack campaigns like WannaCry, NotPetya, and the Mirai DDoS family succeed because they integrate new techniques or new hardcoded credentials to access and victimize their targets. Once used, though, they rapidly lose much of their sting as they are understood and mitigated: organizations patch exploitable vulnerabilities, change credentials, and block known-hostile traffic and executables with application firewalls and gateways. Traditional security is most effective when called upon to identify and block what it already understands.

Unfortunately, profit motive and system complexity breed a seemingly endless stream of new threats. Some are trivially reconstituted versions of older attacks, like repackaged malware, and some are ingeniously constructed, like multi-component credential-theft and ransomware campaigns. There will always be latency between the arrival of a new threat and the corresponding protection that can recognize and disrupt it.

To address this gap, security practitioners and vendors have long attempted to identify new threats by first learning what good traffic or artifacts look like, then identifying what is new and applying some type of logic to decide whether that new event or artifact is good or bad. This has been done with machine learning-based analysis of executable objects to identify malware and with network-based anomaly detection to find hostile behavioral patterns. Both approaches have struggled with a poor signal-to-noise ratio, requiring additional effort and expertise to distill real threats from the vagaries of a dynamic environment.

These approaches are often used in tandem to minimize the likelihood of false negatives. The resulting security events can be ascribed to one of two detection techniques: Signature Detection or Anomaly Detection.

Signature Detection

When a pattern is available in an object or in a series of activities, signatures can be used to describe and then identify what is known. This approach produces few false positives, but it puts the user on an unending treadmill of effort to research, identify, and deploy new protections through near-continuous updates.
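At its core, signature detection is a lookup against a curated list of known-bad indicators. The sketch below is a minimal illustration: the hash and byte pattern are invented placeholders, not real indicators, and real signature sets contain thousands of entries that must be refreshed continuously, which is exactly the treadmill described above.

```python
import hashlib
import re

# Placeholder indicators, invented purely for illustration.
KNOWN_BAD_SHA256 = {
    "9f2feb0f1ef25a377f2cf27480909b3b2ae8f3f4fbbc9cbd87559ed9d6a2fc9e",
}
KNOWN_BAD_PATTERNS = [
    re.compile(rb"EVIL_MARKER_[0-9]{4}"),  # stand-in for a byte-pattern rule
]

def matches_signature(payload: bytes) -> bool:
    """Return True if the payload matches any known-bad hash or pattern."""
    if hashlib.sha256(payload).hexdigest() in KNOWN_BAD_SHA256:
        return True
    return any(pattern.search(payload) for pattern in KNOWN_BAD_PATTERNS)
```

Anything not yet on the list sails through, which is why a genuinely new threat always enjoys a window of free operation until its signature is researched and distributed.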

Anomaly Detection

For anomaly detection to work, a baseline needs to be established or learned. After that, new behaviors, connections, users, or services are surfaced as concerns for resolution by security analysts. This produces few false negatives but generates unmanageable quantities of false positives in any dynamic or user-facing environment. Anomalous events happen all the time, making them poor providers of conclusive data.
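The sketch below shows the core mechanic under a deliberately naive assumption: any event whose feature tuple was never seen during the baseline period is flagged. The field names are hypothetical; real detectors model frequencies and seasonality, but the failure mode is the same.

```python
from collections import Counter

class BaselineAnomalyDetector:
    """Flag any event whose (user, action, target) tuple is absent
    from the learned baseline."""

    def __init__(self):
        self.baseline = Counter()

    @staticmethod
    def _key(event: dict) -> tuple:
        return (event["user"], event["action"], event["target"])

    def train(self, events) -> None:
        # Learn the baseline by counting every tuple observed in training.
        for event in events:
            self.baseline[self._key(event)] += 1

    def is_anomalous(self, event: dict) -> bool:
        # Anything never seen before is flagged, regardless of how
        # ordinary the rest of its context is.
        return self.baseline[self._key(event)] == 0
```

In a living environment, users legitimately touch new files, hosts, and services every day, so every one of those first-time tuples becomes an alert for an analyst to chase down.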

So, in short, signatures are too specific and time-delayed, while anomalies are too general and time-consuming.

Enter the Novelty Detector

When discovering and describing anomalous events, much of the event context is commonplace. It is the combination of elements that makes an event an anomaly, and the characteristic of an element that creates an anomalous event is that element's novelty. This more granular attribute of an event can be analyzed to disambiguate the troubling from the simply unusual. As an example, think of an anomaly created by an unexpected file access. That operation is characterized by multiple elements, including the user, user IP address, target file, target file system, time of day, network, geography, filetype, user role, and others, which together form a behavior context. If the access is anomalous because it is a first-time file access (the filename is novel), that class of detection will generate a flood of false positives: the operation is new, but the context is quite common.

In contrast, consider a case where the characteristic that makes the event anomalous is the user. If multiple events are flagged and the user's ID is the novel characteristic, that is more concerning: the context of the detection is a pattern of file accesses made unusual because the user doesn't ordinarily access those files. Add a novel geography, time of day, or user/role combination, and that novelty transforms a first-time access into a likely security event.
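A drastically simplified sketch of this idea follows. The element names and weights here are invented for illustration, and Novelty Detector itself evaluates each element against the full behavioral context rather than using fixed per-element weights, but the sketch shows why a novel user scores very differently from a novel filename.

```python
# Hypothetical element weights, assumed for illustration: a novel user
# or geography is treated as more alarming than a first-time filename.
WEIGHTS = {
    "user": 0.30,
    "geo": 0.25,
    "hour": 0.15,
    "role": 0.15,
    "src_ip": 0.10,
    "file": 0.05,
}

class ElementNoveltyScorer:
    """Score an event by which of its elements are novel,
    not merely by whether the event as a whole is unusual."""

    def __init__(self):
        self.seen = {name: set() for name in WEIGHTS}

    def observe(self, event: dict) -> tuple[float, list[str]]:
        """Return (score, novel_elements), then fold the event
        into the observed baseline."""
        novel = [name for name in WEIGHTS if event[name] not in self.seen[name]]
        for name in WEIGHTS:
            self.seen[name].add(event[name])
        return sum(WEIGHTS[name] for name in novel), novel
```

With these assumed weights, a first-time filename in an otherwise familiar context scores 0.05, while the same access from a novel user in a novel geography at a novel hour scores 0.70 even when the file itself has been touched before.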

Collecting and training on novelty data defines a path to evolving anomaly analysis from high-volume, low-confidence detections to low-volume, high-confidence events. It represents a deeper level of modeling that simulates the second-order investigation and clarification typically performed by analysts. The resulting reductions in analyst time, alert fatigue, and missed events make novelty analysis a foundational improvement for threat and active-attack detection.

To learn more about novelty and its revolutionary impact on threat detection, visit the Novelty Detector overview page.