Malware variants in practice: An approach using graph similarity.
Introduction
This repository supplies additional material for the Malware Similarity paper.
Goals
This work aims to:
- Study malware similarity techniques and their limitations.
- Provide some insights on how we could overcome some challenges.
Authors
This work was developed by Marcus Botacin, under the supervision of Prof. Dr. Paulo Lício de Geus and Prof. Dr. André Ricardo Abed Grégio.
Data Extraction
The functions mentioned here were extracted from dynamic, transparent traces collected with our BranchMonitor solution.
Similarity Issues
We tackled the similarity matching problem from two perspectives: (i) the features used, and (ii) the matching metrics used.
Features
In particular, we are interested in approaches that use function calls as features, as shown below:
LdrGetProcedureAddress -> LdrLoadDll
LdrGetDllHandle -> LdrLoadDll
NtOpenMutant -> ZwMapViewOfSection
NtCreateMutant -> ZwMapViewOfSection
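The edge list above can be loaded into a set of directed edges before any comparison. The snippet below is a minimal sketch, assuming each line encodes one edge as `source -> target`; the helper name `parse_edges` is hypothetical and is not taken from the repository scripts.

```python
def parse_edges(text):
    """Parse lines of the form 'source -> target' into a set of (source, target) tuples."""
    edges = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or "->" not in line:
            continue  # skip blank or malformed lines
        src, dst = (part.strip() for part in line.split("->", 1))
        edges.add((src, dst))
    return edges


trace = """
LdrGetProcedureAddress -> LdrLoadDll
LdrGetDllHandle -> LdrLoadDll
NtOpenMutant -> ZwMapViewOfSection
NtCreateMutant -> ZwMapViewOfSection
"""
edges = parse_edges(trace)
print(len(edges))
```

Storing the edges in a set makes the later similarity computations simple set operations, at the cost of discarding repeated edges.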
Fail cases
This kind of approach has a drawback: a variant can replace a function with a different one that performs the same behavior, as shown in the figures below:
Function-Based 1 | Function-Based 2 |
---|---|
Despite having the same behavior, these samples would have been classified as non-similar by a function-based approach.
Our Proposed Approach
To address this case, we adopted a behavior-based approach, in which functions are represented by their behavior classes. The samples above are then considered similar, as shown below:
Function-Based 1 | Function-Based 2 | Our Approach |
---|---|---|
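The idea can be sketched by rewriting every function name in an edge as its behavior class. The class names and the mapping below are hypothetical and chosen only for illustration; the actual classes are the ones listed in the repository's Classes directory.

```python
# Hypothetical function-to-class table, for illustration only.
BEHAVIOR_CLASS = {
    "NtOpenMutant": "Synchronization",
    "NtCreateMutant": "Synchronization",
    "ZwMapViewOfSection": "Memory",
    "LdrGetProcedureAddress": "Loader",
    "LdrGetDllHandle": "Loader",
    "LdrLoadDll": "Loader",
}


def to_behavior_edges(function_edges, classes=BEHAVIOR_CLASS):
    """Replace each function name in an edge with its behavior class.

    Functions missing from the table are kept unchanged.
    """
    return {(classes.get(src, src), classes.get(dst, dst))
            for src, dst in function_edges}


sample_1 = {("NtOpenMutant", "ZwMapViewOfSection")}
sample_2 = {("NtCreateMutant", "ZwMapViewOfSection")}

print(sample_1 == sample_2)  # the function-based edges differ
print(to_behavior_edges(sample_1) == to_behavior_edges(sample_2))
```

Under this assumed mapping, the two samples have different function-based edges but the same behavior-based edge, which is the effect described above.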
Similarity Metrics
The usual metric for similarity measurement is the following:
In this metric, the score will be minimum (0.0) when the inputs are totally distinct, and maximum (1.0) when the inputs are exactly the same.
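The formula itself is not reproduced in this text. One common metric with exactly this behavior is the Jaccard index over the two edge sets, so the sketch below assumes that choice; it may differ from the formula used in the paper.

```python
def jaccard(edges_a, edges_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two edge sets.

    Two empty sets are treated as identical (1.0) by convention.
    """
    a, b = set(edges_a), set(edges_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


g1 = {("A", "B"), ("B", "C")}
g2 = {("X", "Y"), ("Y", "Z")}

print(jaccard(g1, g1))  # identical inputs
print(jaccard(g1, g2))  # totally distinct inputs
```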
Fail cases
This metric also has a drawback: it underestimates similarity when one sample is embedded inside another, as in the example below:
Original Sample | Embedded Sample |
---|---|
In this example, the similarity score is 50%, even though sample 1 is completely embedded in sample 2. We therefore need a similarity metric that conveys more information about the quality of the match.
Our proposal: Using another metric
We therefore propose adopting the following metric:
In this metric, the similarity is maximum not only when the two samples are equal but also when one is contained in the other, as desired.
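A metric with this property is the overlap coefficient, which divides the shared edges by the size of the smaller edge set. The sketch below assumes that metric for illustration (the formula from the paper is not reproduced in this text) and compares it with a Jaccard-style score on a toy embedded sample.

```python
def overlap_coefficient(edges_a, edges_b):
    """Overlap coefficient |A ∩ B| / min(|A|, |B|) of two edge sets.

    Returns 0.0 if either set is empty (a convention chosen for this sketch).
    """
    a, b = set(edges_a), set(edges_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Toy data: the original sample is fully contained in the embedding sample.
original = {("A", "B"), ("B", "C")}
embedding = original | {("C", "D"), ("D", "E")}

jaccard_score = len(original & embedding) / len(original | embedding)
overlap_score = overlap_coefficient(original, embedding)

print(jaccard_score)  # 0.5
print(overlap_score)  # 1.0
```

Because the denominator is the smaller set, adding unrelated edges to the larger sample does not lower the score. The trade-off is that the metric no longer distinguishes an exact match from containment.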
Repository Organization
The repository is organized as follows:
- Classes: Behavior classes associated with DLL functions.
- Examples: Graphs exemplifying the aforementioned approaches.
  - Behaviors: Behavior-based graphs.
  - Functions: Function-based graphs.
- Code: Python scripts to handle graphs and trace data.
  - Function.to.behavior: Given a function, return its behavior class.
  - Generate.graph: Given an edge list, draw the graph.
  - Graph.Match: Given two edge lists, compare the resulting graphs.
- Data: Data used in our experiments, so you can reproduce them.
  - Functions: Function traces for selected samples.
  - Results: Graph similarity results for selected samples.
- Papers: Written research material.
Examples
The graphs below exemplify the differences between the original approach and ours.
Function-Based | Behavior-Based |
---|---|
Cluster results
An important task enabled by our approach is sample clustering. The figures below show the clustering scores for the following datasets: Mimail, Klez, and a mix of both.
Low thresholds fail to cluster the mixed dataset properly, whereas thresholds above 80% do so. These higher thresholds also yield good clustering results for the same-family datasets.
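The effect of the threshold can be illustrated with a simple grouping rule: two samples join the same cluster whenever their similarity reaches the threshold. The sketch below is an assumption made for illustration, not the clustering procedure used in the experiments. The sample names, the toy edge identifiers, and the Jaccard-style score are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two non-empty sets."""
    return len(a & b) / len(a | b)


def cluster(samples, similarity, threshold):
    """Group samples linked by pairwise similarity >= threshold (single linkage)."""
    parent = {name: name for name in samples}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        if similarity(samples[a], samples[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in samples:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=sorted)


# Toy data: integers stand in for graph edges; two hypothetical families.
samples = {
    "family_a_1": {1, 2, 3, 4, 5, 6},
    "family_a_2": {1, 2, 3, 4, 5, 6, 7},
    "family_b_1": {5, 6, 7, 8, 9, 10},
    "family_b_2": {5, 6, 7, 8, 9, 10, 11},
}

low = cluster(samples, jaccard, 0.1)   # everything merges into one cluster
high = cluster(samples, jaccard, 0.8)  # the two families are separated
print(len(low), len(high))
```

In this toy setup the families share a few edges, so a low threshold links them into a single cluster, while a threshold of 0.8 keeps only the strong same-family links.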
Publication
This work was published at SBSEG 2019.