Equipping threat hunters: Advanced analytics and AI – Part 2

Mari Pospelova

September 24, 2024 | 7 minute read

If you work in cybersecurity, you cannot ignore the warning signs of imminent threats. New cyberthreats, adversaries, and hacking tools emerge daily, each more advanced than the last. The volume of cybersecurity data is skyrocketing, overwhelming professionals with noise. Meanwhile, the number of defenders is decreasing, while the dark side is growing stronger with support from organized crime and state funding. What could save the world? Superheroes, of course, but behind every successful superhero is a tech sidekick. For cyber threat hunters, that sidekick is a data scientist.

This is the twelfth post in our ongoing “The Rise of the Threat Hunter” blog series. By now you are well acquainted with our superheroes, the threat hunters. To learn more about the series and find previous posts, check out our series introduction or read last week’s post, “Equipping threat hunters: Advanced analytics and AI part 1.”

Data scientists among threat hunters 

While at first glance, data scientists may seem out of place among threat hunters, their complementary skills can elevate a threat-hunting team from good to outstanding. Cyber threat hunting involves more than just technical skills and following procedures. It requires an analytical, investigative mindset and a creative approach. Interestingly, these are the same skills that are essential for a successful data scientist. This is one of the reasons why data scientists and threat hunters work so well together. 

Data scientists spend their days engaging with data, uncovering hidden patterns and insights that cannot be found through traditional methods. They are professionals with diverse expertise in computer science, mathematics, statistics, and machine learning. Collaboration between threat hunters and data scientists is crucial for effective cyber threat investigations. Threat hunters identify what to look for, while data scientists figure out how to extract those signals from vast and complex data. A well-stated problem, combined with thoroughly cleaned and prepared data, can surface insights that direct searching and querying of the original data cannot reach. This can make the difference between missing a compromise and stopping it in its tracks before any damage is done.

Skilled operator or simple button presser 

Cybersecurity tools have changed. Continued advancements have drastically improved both their speed and efficiency by incorporating technologies such as machine learning, smart asset discovery, and entity resolution. However, with these advancements we have also seen an increase in the sophistication and complexity of stopping threats.

This in turn has made security tools less intuitive and self-explanatory. With the increased complexity comes the risk that already busy threat hunters will go from being skilled operators of advanced tools to simple button pressers. This is not because they don’t want to learn the tools, but because most threat hunters don’t have the time to specialize in every individual tool in their security stack.

Pairing data scientists and threat hunters in a cyber defence team might just solve this issue. Data scientists not only know a variety of methods to extract information from the data but also deeply understand the weaknesses and common pitfalls of these methods. More importantly, they understand the weaknesses of the data to which these methods are applied and how to overcome them.

For example, let’s look at prompt engineering. Grammar, length, tone, and sentence structure can make the difference between getting an insightful answer and getting no answer at all. Even worse than getting no answer is getting a confidently wrong one, often called a hallucination. Data scientists can work with threat hunters to iteratively design effective prompts using their combined domain and data knowledge. As a team, threat hunters and data scientists can generate high-quality results.

The entity resolution problem 

Another example of effective teamwork between threat hunters and data scientists is entity resolution, which involves associating every event and behaviour with the single entity that produced it. This problem consists of two parts. First, we need to define what constitutes an entity, including determining the best level of granularity and separation methods. Second, we need to maximize the association between an entity and all its activities recorded across various data sources and entity representations. While this may sound simple, how both problems are solved will directly impact the quality of the results produced from the data.
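The second part of the problem, tying different representations of one entity together, can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any particular product: the `EntityResolver` class, the alias pairs, and the sample events are all hypothetical, and it assumes that a trusted source (for example, a directory export) already tells us which identifiers belong together.

```python
from collections import defaultdict


class EntityResolver:
    """Union-find over identifiers; each cluster is one resolved entity."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        self._parent.setdefault(identifier, identifier)
        while self._parent[identifier] != identifier:
            # Path halving keeps later lookups short.
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def link(self, a, b):
        """Record that identifiers a and b refer to the same entity."""
        root_a, root_b = self._find(a.lower()), self._find(b.lower())
        if root_a != root_b:
            # The smaller string becomes the root, so the canonical ID is deterministic.
            self._parent[max(root_a, root_b)] = min(root_a, root_b)

    def entity_of(self, identifier):
        """Return the canonical identifier of the entity this identifier belongs to."""
        return self._find(identifier.lower())


# Hypothetical alias pairs, e.g. taken from a directory export.
resolver = EntityResolver()
resolver.link("CORP\\jsmith", "jsmith@example.com")
resolver.link("jsmith@example.com", "S-1-5-21-1004")

# Hypothetical events from three data sources, each using a different representation.
events = [
    ("vpn", "jsmith@example.com", "login"),
    ("endpoint", "CORP\\jsmith", "powershell.exe started"),
    ("fileshare", "S-1-5-21-1004", "read finance.xlsx"),
    ("vpn", "adoe@example.com", "login"),
]

# Build one activity timeline per resolved entity.
timeline = defaultdict(list)
for source, identifier, action in events:
    timeline[resolver.entity_of(identifier)].append((source, action))

for entity, actions in timeline.items():
    print(entity, actions)
```

In this sketch the three representations of the same user collapse into one timeline, while the unrelated account stays separate. Real data rarely comes with such clean alias pairs, which is where the threat hunter's knowledge of the environment matters.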

Let’s say an organization has a system admin with two accounts. One regular user account is used for day-to-day tasks and the other is their privileged administrative account. Should we join these two accounts under a single entity or keep them separate? From a threat hunter’s perspective, both accounts belong to the same employee, hence it seems natural to join them. However, from the data science and behavioural analysis perspective, these two accounts have different functions, use different processes, have different permissions, transfer different volumes of data, and even have different activity patterns. Moreover, when their behaviour is compared to peers, for the best results, the administrative account should be compared to other administrative accounts and the regular user account should be compared to other regular user accounts. Running PowerShell scripts that add or remove users or escalate user privileges is quite normal for an admin, but extremely anomalous for a regular user.
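One way to reconcile the two perspectives is to keep the accounts as separate behavioural entities while still recording that they share an owner. The sketch below assumes exactly that: the account list, the weekly counts of user-management PowerShell commands, and the `peer_zscore` helper are all invented for illustration, and a plain z-score stands in for whatever anomaly model a real team would choose.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

# (account, owner, role): hypothetical directory data.
ACCOUNTS = [
    ("alice", "alice", "user"),
    ("alice-adm", "alice", "admin"),
    ("bob", "bob", "user"),
    ("bob-adm", "bob", "admin"),
    ("carol", "carol", "user"),
    ("dave-adm", "dave", "admin"),
]

# Hypothetical weekly counts of user-management PowerShell commands per account.
COUNTS = {
    "alice": 0, "bob": 1, "carol": 0,
    "alice-adm": 12, "bob-adm": 10, "dave-adm": 14,
}


def peer_zscore(value, peer_values):
    """How many standard deviations `value` sits from the peer-group mean."""
    mu, sigma = mean(peer_values), pstdev(peer_values)
    if sigma == 0:
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma


peers_by_role = defaultdict(list)      # behavioural baseline per peer group
accounts_by_owner = defaultdict(list)  # lets an investigator pivot by person
for account, owner, role in ACCOUNTS:
    peers_by_role[role].append(COUNTS[account])
    accounts_by_owner[owner].append(account)

# The same observation, 11 commands in a week, scored against each peer group.
observed = 11
for role, peer_counts in peers_by_role.items():
    print(role, round(peer_zscore(observed, peer_counts), 1))
```

Scored against the admin group, eleven commands sits within one standard deviation of the mean. Scored against regular users, the same number is far outside the baseline. The `accounts_by_owner` mapping keeps the threat hunter's view intact: both of Alice's accounts can still be pulled up together during an investigation.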

Integrating data science into the threat hunting process 

At this stage, I hope you are convinced that your threat-hunting team needs data scientists. There are several options to acquire one: you could hire a new data science team member, use external data science consultants or paid services, or nurture the talent among your current cybersecurity team members. Once you have your multidisciplinary team assembled, they can begin by tackling one use case at a time. The diverse expertise within your team will provide an advantage, as ideas for solutions and use cases will come from both threat hunters and data scientists on your cyber defence team. 

In my experience working with our threat hunting team, hands-on threat-hunting professionals commonly bring in specifics from use cases they see out in the field along with the methods that they use to tackle them. Once threat hunters have described the intricacies of the use case, the methods used to detect it, and the limitations of the currently available tools, data scientists can delve into the data. They consider various aspects of the problem and the data at hand, sometimes approaching it as a mathematical or logical problem. This creates a foundation for brainstorming new detection methods, testing and validating them, and ultimately integrating these new solutions permanently into the arsenal of front-line defenders. 

A data science team member can also come up with a new use case based on new algorithms, computational methods, or data sources. When this happens, the data scientists share their findings with the threat hunters and together they brainstorm to determine if the findings have value and how they can improve current solutions. For instance, the data science team could notice that command line entries have similar patterns to natural language and suggest that threat hunters use language-based models to uncover reconnaissance attacks. Through close collaboration with the threat hunters, a new effective method for discovering command line reconnaissance is developed, with the idea being tested and evaluated through multiple iterations. 
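To make the command-line idea concrete, here is a deliberately small sketch. It is not the method described above, whose details are not given here. It assumes a simple bigram model with add-one smoothing as a stand-in for "language-based models", and the baseline commands and the `BigramModel` class are hypothetical. The model learns which tokens usually follow which on a host, then scores new command lines by how surprising they are.

```python
import math
from collections import Counter


def tokenize(command):
    """Lower-case, split on whitespace, and add start/end markers."""
    return ["<s>"] + command.lower().split() + ["</s>"]


class BigramModel:
    """Bigram language model over command-line tokens with add-one smoothing."""

    def __init__(self, commands):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for command in commands:
            tokens = tokenize(command)
            self.vocab.update(tokens)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))

    def avg_surprisal(self, command):
        """Average bits per token transition; higher means less familiar."""
        tokens = tokenize(command)
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen tokens
        total_bits = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            prob = (self.bigram_counts[(prev, cur)] + 1) / (
                self.context_counts[prev] + vocab_size
            )
            total_bits -= math.log2(prob)
        return total_bits / (len(tokens) - 1)


# Hypothetical baseline of routine commands seen on a developer workstation.
baseline = [
    "git status", "git pull", "git commit -m update", "git push",
    "ls -la", "ls -la", "cd projects", "python app.py",
]
model = BigramModel(baseline)

# Score a routine command and two commands often associated with discovery activity.
for command in ["git status", "whoami /all", "net group /domain"]:
    print(f"{model.avg_surprisal(command):.2f}  {command}")
```

Commands built from tokens the model has never seen in that order score higher than routine ones. A high score is only a lead, not a verdict: deciding which unfamiliar commands actually indicate reconnaissance is where the threat hunter's input comes back in.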

Conclusion 

Data scientists help threat-hunting teams work smarter, not harder, and, when well integrated and properly involved, may help reduce the burden on our threat-hunting superheroes’ shoulders. Ideas bouncing back and forth between team members over multiple cycles naturally expand and improve the coverage of the vast array of cybersecurity use cases. This flow of ideas should never stop, as the race to secure an organization is never over and the defenders’ team is never “done”. That is both the challenge and the beauty of this occupation.

Learn more about OpenText Cybersecurity 

Ready to equip your threat hunting team with products, services, and training to protect your most valuable and sensitive information? Check out our cybersecurity portfolio, a modern set of complementary security solutions that offer threat hunters and security analysts 360-degree visibility across endpoints and network traffic to proactively identify, triage, and investigate anomalous and malicious behavior.

Mari Pospelova

Maria Pospelova is a Principal Data Scientist, leading a team of data scientists for Interset, the applied AI division for OpenText Cybersecurity. Maria has been “catching bad guys with math” for almost a decade. With profound expertise in applying data science to the cybersecurity domain, she takes an active role in the development and innovation of Interset’s technology, authoring several patents and research papers in both fields.
