Using AI for real-time attack simulation at scale (part 1)

Snode works with Geekulcha across Africa for cybersecurity skills development.

Recently, I was the keynote speaker at their NEMISA DATATHON. A few students requested a blog entry on how Snode transitions from initial ideas to a prototype.

Seems like a great place to start,... so welcome to the blog! :)

This blog series will illustrate the process from concept, to design, to prototype.

Part 1 - Ideation: the journey from problem statement to initial idea generation.

Part 2 - Design: an approach to research, experimentation and solution design.

Part 3 - Prototype: Using hypothesis testing to develop and evolve a prototype.

Part 1 - Ideation

At Cogitait Colloquium I presented a novel approach to automated, real-time attack simulation modelling, leveraging AI (artificial intelligence) to improve output fidelity.

The idea rapidly evolved into a workable prototype, which is now planned for (beta) release to select Snode Guardian partners. Hence, this project is a great example to walk through.

Problem - too much data with no actionable intelligence

Clients were overwhelmed with vulnerability and threat data, and did not know which tactical responses or remedial activities to prioritise based on the risk to the business.

Snode Guardian's visualisation showing the signal-to-noise ratio for vulnerability

Reversing - take a desired outcome and work backwards

A simple but effective approach to problem solving. We selected attack simulation modelling for risk-based prioritisation of response activities and remedial actions.

Traditionally, there are 3 common types of simulations:

Modelling - a well-known example is threat modelling.
Emulation - such as lab emulation for malware analysis.
Physical - such as a red team assessment or penetration test.

Physical simulation uses real "live" systems and therefore provides the highest fidelity output possible. However, simulating real attacks on live systems can have severe consequences.

Emulation allows us to build a mirror of live environments for testing and evaluation. However, building an exact duplicate lab environment in today's hyper-connected world is either not possible or too expensive.

Simulation modelling requires fewer resources, making it cost-effective and easy to scale. However, as a desktop analyst exercise, it is generally considered the lowest fidelity of the 3 approaches.

The solution must overcome the following challenges and limitations:

How do we increase modelling fidelity (to rival physical simulation) and automate multiple models in real-time at scale?
How do we run millions of simulations in real-time building attack paths using threat, vulnerability and asset data?
How can this be efficiently and effectively designed and implemented to improve the client's security posture?
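To make "millions of simulations" a little more concrete, here is a minimal Monte Carlo sketch. It is not Snode's implementation: the graph, the node names and the success probabilities are all hypothetical, and I am assuming each attack step succeeds or fails independently. Each run walks the graph once and records whether the target was reached.

```python
import random

# Hypothetical attack graph: every edge carries an ASSUMED probability that
# the attacker succeeds in moving from one node to the next.
EDGES = {
    "internet": [("vpn", 0.30), ("webserver", 0.10)],
    "vpn": [("fileserver", 0.50)],
    "webserver": [("fileserver", 0.20), ("database", 0.40)],
    "fileserver": [("database", 0.60)],
    "database": [],
}


def simulate_once(edges, start, target, rng):
    """Run one simulated attack; return True if the target gets compromised."""
    compromised = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        # Each outgoing edge is attempted once per run.
        for neighbour, p_success in edges[node]:
            if neighbour not in compromised and rng.random() < p_success:
                compromised.add(neighbour)
                frontier.append(neighbour)
    return target in compromised


def estimate_compromise(edges, start, target, runs=100_000, seed=0):
    """Estimate the chance of reaching the target as hits / runs."""
    rng = random.Random(seed)
    hits = sum(simulate_once(edges, start, target, rng) for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    print(estimate_compromise(EDGES, "internet", "database"))
```

Every run is independent of the others, which is what makes this style of simulation easy to spread across many workers when scaling up.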

Research question - laser focused with clear objectives

Which vulnerable assets are the most likely to be targeted by active threats to achieve a successful compromise resulting in a significant risk being realised? (shew!)
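One simple way to read that question is as a ranking problem. The sketch below assumes the common "likelihood x impact" view of risk; the field names, scales and example assets are hypothetical and only there to illustrate the idea, not to describe how Snode Guardian scores anything.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    threat_likelihood: float  # ASSUMED 0..1: chance an active threat targets it
    exploitability: float     # ASSUMED 0..1: chance the attempt succeeds
    business_impact: float    # ASSUMED 0..10: cost to the business if it falls

    @property
    def risk(self) -> float:
        # Targeted AND successfully compromised, weighted by impact.
        return self.threat_likelihood * self.exploitability * self.business_impact


def rank_assets(assets):
    """Return the assets sorted from highest to lowest risk."""
    return sorted(assets, key=lambda asset: asset.risk, reverse=True)


if __name__ == "__main__":
    inventory = [
        Asset("intranet-wiki", 0.2, 0.9, 2.0),
        Asset("payment-db", 0.6, 0.5, 9.0),
        Asset("vpn-gateway", 0.8, 0.4, 6.0),
    ]
    for asset in rank_assets(inventory):
        print(asset.name, round(asset.risk, 2))
```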

System design - answer the question, solve the problem

The following modelling techniques can correlate inputs to accurately assess risk:

Attack Trees - assesses security controls based on common attack vectors.
Attack Paths - identifies current vulnerabilities exploitable by threat actors.
Attack Graphing - maps all avenues leading to a successful compromise.
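To show how attack paths fall out of an attack graph, here is a small sketch that enumerates every loop-free route from an entry point to a target with a depth-first search. The graph and its node names are hypothetical; in practice the edges would be derived from threat, vulnerability and asset data.

```python
def attack_paths(graph, source, target, path=None):
    """Yield every simple (loop-free) path from source to target."""
    path = (path or []) + [source]
    if source == target:
        yield path
        return
    for neighbour in graph.get(source, []):
        if neighbour not in path:  # never revisit a node on the current path
            yield from attack_paths(graph, neighbour, target, path)


# Hypothetical graph: "A": ["B"] means an attacker on A can reach B.
GRAPH = {
    "internet": ["vpn", "webserver"],
    "vpn": ["fileserver"],
    "webserver": ["fileserver", "database"],
    "fileserver": ["database"],
}

if __name__ == "__main__":
    for route in attack_paths(GRAPH, "internet", "database"):
        print(" -> ".join(route))
```

Enumerating all paths grows quickly with graph size, so this is only practical for small graphs or as a way to sanity-check a faster approach.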

I'm a strong believer that before you automate you must perfect the process. For systems design, I always start with an analysis of the current manual procedure.

Thereafter I design an automated system using the manual process as the base.

High-level process overview

The attack graphing happens at such speed that it is not feasible to create a real-time visualisation dashboard. However, we used visualisation to log and monitor the design, implementation and effectiveness. The graph below represents the first output that Snode's AI-ASM generated (we used Nessus, Nmap, Graphviz and mlpack).

The first attack graph clearly showing the most likely attack chains
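For readers wondering how a "most likely attack chain" can be picked out of a graph, one textbook option is a shortest-path search over -log(probability), since maximising a product of probabilities is the same as minimising the sum of their negative logs. The sketch below uses Dijkstra's algorithm for that. It is an illustration only, not how Snode's AI-ASM works, and the graph and probabilities (assumed to lie between 0 and 1) are hypothetical.

```python
import heapq
import math

# Hypothetical graph: node -> list of (next node, ASSUMED success probability).
EDGES = {
    "internet": [("vpn", 0.30), ("webserver", 0.10)],
    "vpn": [("fileserver", 0.50)],
    "webserver": [("fileserver", 0.20), ("database", 0.40)],
    "fileserver": [("database", 0.60)],
}


def most_likely_chain(edges, source, target):
    """Return (probability, path) for the likeliest chain, or (0.0, [])."""
    best = {source: 0.0}   # lowest known cost (sum of -log p) per node
    previous = {}          # back-pointers used to rebuild the path
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, p in edges.get(node, []):
            if p <= 0.0:
                continue  # an impossible step can never be part of a chain
            new_cost = cost - math.log(p)
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return 0.0, []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


if __name__ == "__main__":
    probability, chain = most_likely_chain(EDGES, "internet", "database")
    print(round(probability, 3), " -> ".join(chain))
```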

This is an extract taken from the original model diagram, generated with Graphviz:

Extract of the original Graphviz model diagram using AI log data.

Note: Before handing over to engineering, we used Nmap outputs to classify assets (with Bayes) and identify weaknesses (with custom NSE scripts).
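As an illustration of that classification step, here is a sketch that reads open ports from Nmap XML output (as produced with -oX) and feeds them to a tiny naive Bayes classifier. I am assuming a Bernoulli naive Bayes over "is port N open" features with Laplace smoothing; the exact Bayes variant and features we used are not covered here, and the training data, labels and sample XML are made up.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def open_ports(nmap_xml):
    """Map each host's first listed address to its set of open port numbers."""
    hosts = {}
    for host in ET.fromstring(nmap_xml).iter("host"):
        address = host.find("address").get("addr")
        hosts[address] = {
            int(port.get("portid"))
            for port in host.iter("port")
            if port.find("state").get("state") == "open"
        }
    return hosts


class PortNaiveBayes:
    """Bernoulli naive Bayes over open/closed port features."""

    def fit(self, samples):
        # samples: list of (set of open ports, label)
        self.labels = Counter(label for _, label in samples)
        self.ports = sorted({p for ports, _ in samples for p in ports})
        self.open_counts = defaultdict(Counter)
        for ports, label in samples:
            self.open_counts[label].update(ports)
        return self

    def predict(self, ports):
        total = sum(self.labels.values())
        scores = {}
        for label, n in self.labels.items():
            score = math.log(n / total)  # log prior
            for port in self.ports:
                # Laplace-smoothed P(port open | label)
                p_open = (self.open_counts[label][port] + 1) / (n + 2)
                score += math.log(p_open if port in ports else 1 - p_open)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training data and scan result, for illustration only.
TRAINING = [
    ({22, 80, 443}, "web server"),
    ({80, 443}, "web server"),
    ({22, 3306}, "database"),
    ({3306}, "database"),
    ({135, 139, 445}, "windows host"),
    ({139, 445, 3389}, "windows host"),
]

SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="10.0.0.5" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80"><state state="open"/></port>
      <port protocol="tcp" portid="443"><state state="open"/></port>
      <port protocol="tcp" portid="22"><state state="closed"/></port>
    </ports>
  </host>
</nmaprun>"""

if __name__ == "__main__":
    model = PortNaiveBayes().fit(TRAINING)
    for address, ports in open_ports(SAMPLE_XML).items():
        print(address, sorted(ports), model.predict(ports))
```

Real scans carry far more signal than port numbers (service banners, OS fingerprints, NSE script output), so treat the feature choice here as a placeholder.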

The output was interesting, since it got a few things right and a few things wrong. On the wrong side, it incorrectly identified a point-to-point VPN (virtual private network) as a critical weakness, reasoning that it did not enforce MFA (multi-factor authentication) and would therefore be prone to brute-force attacks or unauthorised access using compromised credentials. That said, the rest of the chain was well aligned to what we would consider critical if found during a penetration test or vulnerability assessment.

Looks like we are off to a great start,... subscribe to be notified when Part 2 is out.

Disclaimer - if I got something wrong,...

Tooling References

Tools used to perform the proof-of-concept development, an all-star line-up.

Nessus - the industry standard, in my opinion, but OpenVAS will also work.

Graphviz - the de facto tool of the visualisation trade (I didn't know this either).

Nmap - Neo's port scanner of choice,... nuff said!

mlpack - a contentious choice; it was the first thing the Snode team dropped.

If - like me - you've never used Graphviz, below is the basic DOT syntax used:

digraph {
    // Draw every node as a square.
    node [shape=square];
    // Each "->" is a directed edge; chains can be written on one line.
    attack1 -> attack2 -> attack4;
    attack1 -> attack3 -> attack5;
    attack3 -> attack6 -> attack7;
}
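If the edges live in code rather than in a hand-written file, the same DOT text can be generated with a few lines of Python. This is a minimal sketch; the to_dot helper and the edge list are hypothetical names of my own, not part of Graphviz.

```python
def to_dot(edges, shape="square"):
    """Render (source, target) pairs as Graphviz DOT text."""
    lines = ["digraph {", f"    node [shape={shape}];"]
    for source, target in edges:
        # Quoting the IDs keeps names with spaces or dots valid in DOT.
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edge list matching the example graph.
ATTACK_EDGES = [
    ("attack1", "attack2"), ("attack2", "attack4"),
    ("attack1", "attack3"), ("attack3", "attack5"),
    ("attack3", "attack6"), ("attack6", "attack7"),
]

if __name__ == "__main__":
    print(to_dot(ATTACK_EDGES))
```

Save the output to a file such as graph.dot and render it with `dot -Tpng graph.dot -o graph.png`.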