Threat Hunting with Sysmon and Graphs
In this post we will explain how threat hunting with Sysmon is performed and how we can improve it using a graph database.
If you are not yet familiar with the Sysmon tool, I recommend getting to know it before continuing.
Why graphs?
In a typical web application, log records are written sequentially, one after another, within the same thread of execution. When an error occurs, all we need is the exception it raised. Starting from that exception, we can retrieve all the records generated in the same session in order to locate the error.
Logs of this type have no hierarchical structure: each record is simply created after the previous one. For that reason, systems like Elasticsearch are ideal tools for managing them.
In contrast, Sysmon logs follow a tree pattern. One process starts another, which in turn creates one or more processes, and so on. We can follow the behavior of a process and its descendants with Elasticsearch, but it becomes more complicated with each new hierarchical level.
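To make the tree pattern concrete, here is a minimal Python sketch that links process creation events into a parent/child structure. The field names ProcessGuid, ParentProcessGuid and Image follow Sysmon process creation events (Event ID 1); the sample events, their identifiers and the helper names build_children and render_tree are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical, simplified process creation events. Field names mirror
# Sysmon Event ID 1; the values are invented for this example.
events = [
    {"ProcessGuid": "A", "ParentProcessGuid": None, "Image": "explorer.exe"},
    {"ProcessGuid": "B", "ParentProcessGuid": "A", "Image": "word.exe"},
    {"ProcessGuid": "C", "ParentProcessGuid": "B", "Image": "powershell.exe"},
    {"ProcessGuid": "D", "ParentProcessGuid": "C", "Image": "cmd.exe"},
]


def build_children(events):
    """Map each parent ProcessGuid to the list of its children's ProcessGuids."""
    children = defaultdict(list)
    for event in events:
        parent = event["ParentProcessGuid"]
        if parent is not None:
            children[parent].append(event["ProcessGuid"])
    return children


def render_tree(events, children, root, depth=0):
    """Return the subtree under root as indented lines, one per process."""
    images = {event["ProcessGuid"]: event["Image"] for event in events}
    lines = ["  " * depth + images[root]]
    for child in children.get(root, []):
        lines.extend(render_tree(events, children, child, depth + 1))
    return lines


children = build_children(events)
print("\n".join(render_tree(events, children, "A")))
```

Each event only knows its direct parent, so the full chain appears only once the events are linked together, which is exactly what a flat log search does not do for us.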
This may be easier to understand with an example:
A typical malware infection vector is a Word document that runs a macro and opens a cmd process.
In plain language, we would say:
Give me all the Word processes whose direct child is a cmd process.
In Elasticsearch we would write something like:
ParentImage:word.exe AND Image:cmd.exe
If the query does not return any results, the malicious execution chain may have three or more levels, with intermediate processes between Word and cmd.
Finding it requires one query to get all the process creation events whose parent is “word.exe” and then, for each result, a new query to get all the “cmd.exe” processes started by it, that is, those whose ParentProcessGuid matches the ProcessGuid of the result:
ParentProcessGuid:xxxx AND Image:cmd.exe
If there are more intermediate processes, this becomes increasingly difficult.
Instead, with ArangoDB we will use only one query:
FOR startProcess IN SysmonProcess
FILTER startProcess.Image == "word.exe"
FOR endProcess IN 1..2 OUTBOUND startProcess CreateNewProcess
FILTER endProcess.Image == "cmd.exe"
RETURN {'start' : startProcess,'end' : endProcess}
The result contains all the Word processes that have a cmd.exe child or grandchild.
Query explanation
FOR startProcess IN SysmonProcess
: Search the collection SysmonProcess and name each result as startProcess.
FILTER startProcess.Image == "word.exe"
: The process image must be “word.exe”.
FOR endProcess IN 1..2 OUTBOUND startProcess CreateNewProcess
: This one is a bit tricky. 1..2 means at most 2 hierarchical levels below the startProcess, and OUTBOUND startProcess CreateNewProcess means all outgoing connections of type CreateNewProcess of each startProcess.
FILTER endProcess.Image == "cmd.exe"
: The child or grandchild process must be a “cmd.exe”.
RETURN {'start' : startProcess,'end' : endProcess}
: Create a new object and return both the start and end nodes.
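For readers who do not have ArangoDB at hand, the logic of the query can be approximated in plain Python. This is a rough sketch, not how ArangoDB executes the traversal: IMAGES and EDGES are hypothetical in-memory stand-ins for the SysmonProcess collection and the CreateNewProcess edges, and traverse_outbound and word_to_cmd are names invented for this example. The traversal here is breadth-first, so the order of results may differ from what the database returns.

```python
from collections import deque

# Hypothetical stand-in for the SysmonProcess collection: process id -> image.
IMAGES = {
    "w1": "word.exe", "p1": "powershell.exe", "c1": "cmd.exe",
    "w2": "word.exe", "c2": "cmd.exe", "n1": "notepad.exe",
}
# Hypothetical stand-in for CreateNewProcess edges: parent id -> child ids.
EDGES = {"w1": ["p1"], "p1": ["c1"], "w2": ["c2", "n1"]}


def traverse_outbound(start, min_depth, max_depth):
    """Return nodes reachable from start via outgoing edges,
    at a depth between min_depth and max_depth (inclusive)."""
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if min_depth <= depth <= max_depth:
            found.append(node)
        if depth < max_depth:
            for child in EDGES.get(node, []):
                queue.append((child, depth + 1))
    return found


def word_to_cmd(max_depth=2):
    """Mirror the query: word.exe start nodes with a cmd.exe descendant
    between 1 and max_depth levels below."""
    results = []
    for start, image in IMAGES.items():
        if image != "word.exe":               # FILTER startProcess.Image
            continue
        for end in traverse_outbound(start, 1, max_depth):
            if IMAGES[end] == "cmd.exe":      # FILTER endProcess.Image
                results.append({"start": start, "end": end})
    return results


print(word_to_cmd())
```

Calling word_to_cmd(1) restricts the search to direct children, which corresponds to changing 1..2 to 1..1 in the query; raising the upper bound covers deeper chains without adding any new query.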
Final Result
The script can be found at: https://github.com/SecSamDev/sysmon-arangodb