First prototype. Do not use in production!

Ghidra Patch Diff Correlator Project

This project tries to provide additional Ghidra Version Tracking Correlators suitable for patch diffing.

How do I install it?

In Ghidra, open File -> Install Extensions and click the green + icon in the top right; then select the ghidra_<VERSION>_PatchDiffCorrelator.zip (which you either built from source with the GhidraDev plugin or downloaded pre-built from the releases section; please make sure VERSION matches your Ghidra version!).

Then restart Ghidra.

How to use?

A simple introduction video:

Youtube video introducing the PatchDiffCorrelator Project

Simple workflow:

  1. Run the Automatic Version Tracking Command.
  2. Run a Bulk * Match Correlator with "Only match accepted matches" selected. This produces a similarity score for each of your accepted function matches.

Advanced workflow:

While the Automatic Version Tracking Command finds very good matches by running the included Correlators in their defined order and automagically accepting good matches, it takes more time than running only the Correlators you need to get your matches. Running them manually can be done as follows (the exact steps may vary depending on the binary):

  1. Run the Exact Symbol Name Match Correlator if there are symbols.
  2. Run the Exact Function * Match Correlators.
  3. Accept all matched functions.
  4. Accept suitable Implied Matches.
  5. Run some Reference Correlators.
  6. Accept matches.
  7. Repeat "conventional" matching until the function you are after has been accepted.
  8. Run a Bulk * Match Correlator with "Only match accepted matches" selected. This produces a similarity score for each of your accepted function matches.

Hints

How does it work?

This adds additional Program Correlators to Ghidra. Unlike the Correlators that ship with Ghidra, these are able to produce Matches with a Similarity Score below 1.00. This means they estimate how similar functions are to one another instead of only reporting perfect matches, as the included correlators do.

This similarity indicator is needed to find patches in functions.
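
As a rough illustration of why a score below 1.00 is useful, the sketch below filters a list of accepted matches down to the ones that are not identical. The function names and scores are made up for the example, and the tuple layout is an assumption, not the extension's or Ghidra's data model.

```python
# Hypothetical (source function, destination function, similarity score) rows.
accepted_matches = [
    ("FUN_00401000", "FUN_00401000", 1.00),
    ("FUN_00401230", "FUN_00401250", 0.83),
    ("FUN_00402000", "FUN_00402040", 1.00),
]

# Anything below 1.00 differs somewhere, so list the least similar first.
candidates = sorted(
    (m for m in accepted_matches if m[2] < 1.0),
    key=lambda m: m[2],
)

for src, dst, score in candidates:
    print(f"{src} -> {dst}: {score:.2f}")
```

In the Version Tracking matches table the same effect is achieved by sorting the accepted matches by their similarity score.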

"Bulk" Correlators

Bulk Instructions Match

The Bulk Instructions Match Correlator will make an unordered bulk list of Instructions occurring in a function.

Let's say we have the function:

PUSH       EBP
MOV        EBP,ESP
SUB        ESP,0x8
MOV        ESP,EBP
POP        EBP
RET

Then the Correlator would "bulk" this to the following list of features:

If we now have a function:

PUSH       EBP
MOV        EBP,ESP
SUB        ESP,0x42
MOV        ESP,EBP
POP        EBP
RET

With features:

It would match 5 out of 6 features of the earlier function.

The matching is unordered - hence the notion of "bulk".

So a function like the following (warning: the code doesn't make sense):

SUB        ESP,0x8
POP        EBP
MOV        EBP,ESP
PUSH       EBP
MOV        ESP,EBP
RET

Would still match 6 of 6 with the original function, because of the unordered bulk comparison logic.
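
The unordered comparison described above can be sketched in Python with a multiset. This is only an illustration and not the extension's actual code: the helper name `bulk_match` and the choice to treat each whole instruction string as one feature are assumptions.

```python
from collections import Counter


def bulk_match(src_features, dst_features):
    """Return (matched, total) for an unordered multiset comparison.

    `total` is the number of features in the source function. A feature
    that occurs twice must also occur twice on the other side to count twice.
    """
    src, dst = Counter(src_features), Counter(dst_features)
    matched = sum((src & dst).values())  # multiset intersection
    return matched, sum(src.values())


original = ["PUSH EBP", "MOV EBP,ESP", "SUB ESP,0x8",
            "MOV ESP,EBP", "POP EBP", "RET"]
patched = ["PUSH EBP", "MOV EBP,ESP", "SUB ESP,0x42",
           "MOV ESP,EBP", "POP EBP", "RET"]
shuffled = ["SUB ESP,0x8", "POP EBP", "MOV EBP,ESP",
            "PUSH EBP", "MOV ESP,EBP", "RET"]

print(bulk_match(original, patched))   # (5, 6)
print(bulk_match(original, shuffled))  # (6, 6)
```

A multiset is used here rather than a plain set so that repeated instructions still count individually.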

Bulk Mnemonics Match

The Bulk Mnemonics Match Correlator only adds the instruction mnemonics to the feature bundle for matching.

If you have the function:

PUSH       EBP
MOV        EBP,ESP
SUB        ESP,0x8
MOV        ESP,EBP
POP        EBP
RET

Then the Correlator would "bulk" this to the following list of features:

If we now have a function:

PUSH       EBP
MOV        EBP,ESP
SUB        ESP,0x42
MOV        ESP,EBP
POP        EBP
RET

With features:

It would match 6 of 6 features.

The same remarks on unordered matching as for the Bulk Instructions Match Correlator apply.
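
A minimal Python sketch of the mnemonic-only variant, assuming the mnemonic is simply the first token of each listing line (the helper name `mnemonics` is made up for this example and is not part of the extension):

```python
from collections import Counter


def mnemonics(listing):
    """Keep only the first token (the mnemonic) of each instruction line."""
    return [line.split()[0] for line in listing.splitlines() if line.strip()]


original = """\
PUSH       EBP
MOV        EBP,ESP
SUB        ESP,0x8
MOV        ESP,EBP
POP        EBP
RET
"""
# Same function with a different stack allocation size.
patched = original.replace("0x8", "0x42")

src, dst = Counter(mnemonics(original)), Counter(mnemonics(patched))
matched = sum((src & dst).values())

print(mnemonics(original))  # ['PUSH', 'MOV', 'SUB', 'MOV', 'POP', 'RET']
print(matched, "of", sum(src.values()))  # 6 of 6
```

Because the operands are dropped, the changed constant no longer affects the result.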

Bulk Basic Block Mnemonics Match

The Bulk Basic Block Mnemonics Match Correlator first converts the mnemonics of each basic block into a list. That list is sorted and hashed (so the order of the mnemonics within the basic block doesn't matter). Then these basic block hashes are compared between functions in an unordered bulk comparison.
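
The sort-then-hash idea can be sketched as follows. The use of SHA-256, the helper names, and the two sample functions are assumptions for illustration; the text above does not say which hash the extension uses.

```python
import hashlib
from collections import Counter


def block_hash(block_mnemonics):
    """Hash a basic block's mnemonics; sorting makes their order irrelevant."""
    joined = ",".join(sorted(block_mnemonics))
    return hashlib.sha256(joined.encode()).hexdigest()


def bulk_block_match(src_blocks, dst_blocks):
    """Return (matched, total) over the multisets of basic block hashes."""
    src = Counter(block_hash(b) for b in src_blocks)
    dst = Counter(block_hash(b) for b in dst_blocks)
    return sum((src & dst).values()), sum(src.values())


# Hypothetical functions, each given as a list of basic blocks.
src_func = [["PUSH", "MOV", "CMP", "JZ"], ["MOV", "ADD"], ["POP", "RET"]]
dst_func = [["POP", "RET"], ["MOV", "PUSH", "CMP", "JZ"], ["MOV", "SUB"]]

print(bulk_block_match(src_func, dst_func))  # (2, 3)
```

The first and last source blocks find a counterpart even though the blocks and the mnemonics inside them are reordered; only the block where ADD became SUB fails to match.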

Options

There are several options:

Coloring Correlators

These correlators color address ranges in the Source and Destination Programs that are different.

Abandoned: Coloring Basic Block Mnemonics

State: This is a first work-in-progress prototype to see whether a correlator is technically able to execute setBackgroundColor() on the Source and Destination Programs.

This colors basic blocks that are either new or deleted or have a different Mnemonic "Bulk" (see Bulk Mnemonics Match for the concept of "Bulk").
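
One way to decide which blocks would need coloring is sketched below in plain Python, without any Ghidra API calls. The function name, the address-to-mnemonics dictionaries, and the pairing strategy are assumptions for illustration and may differ from what the prototype does.

```python
from collections import Counter


def blocks_to_color(src_blocks, dst_blocks):
    """Return (src_addrs, dst_addrs) of blocks without a counterpart.

    Each argument maps a block start address to its list of mnemonics.
    A block has a counterpart if the other side still has an unused block
    with the same mnemonic bulk.
    """
    def bulk(block_mnemonics):
        return tuple(sorted(block_mnemonics))

    def unmatched(blocks, others):
        available = Counter(bulk(m) for m in others.values())
        result = []
        for addr, block_mnemonics in sorted(blocks.items()):
            key = bulk(block_mnemonics)
            if available[key] > 0:
                available[key] -= 1  # consume one counterpart
            else:
                result.append(addr)
        return result

    return unmatched(src_blocks, dst_blocks), unmatched(dst_blocks, src_blocks)


# Hypothetical basic blocks of a function before and after a patch.
src = {0x1000: ["PUSH", "MOV", "CMP", "JZ"],
       0x1010: ["MOV", "ADD"],
       0x1020: ["POP", "RET"]}
dst = {0x2000: ["PUSH", "MOV", "CMP", "JZ"],
       0x2010: ["MOV", "SUB"],
       0x2018: ["CALL"],
       0x2020: ["POP", "RET"]}

src_addrs, dst_addrs = blocks_to_color(src, dst)
print([hex(a) for a in src_addrs])  # ['0x1010']
print([hex(a) for a in dst_addrs])  # ['0x2010', '0x2018']
```

The returned addresses are the ones a script would then pass to the coloring call in the respective program.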

Current issues:

This is implemented as a script.

Other Correlators

Scripts

You can grab just the scripts. They are in PatchDiffCorrelator/ghidra_scripts/.

FunctionDiffColorizer.java

  1. Open Source Program.
  2. In Source Program select the Function you want to compare against the Destination Program (and colorize in the Destination Program).
  3. Run FunctionDiffColorizer.java
    1. Select Destination Program.
    2. Select Destination Function.
  4. The changes of the Destination Function in the Destination Program are now colored.

Issues:

How to develop with this?

git clone https://github.com/threatrack/ghidra-patchdiff-correlator
# build `ghidra_<VERSION>_PatchDiffCorrelator.zip` from command line
cd ghidra-patchdiff-correlator/PatchDiffCorrelator/
gradle -PGHIDRA_INSTALL_DIR=$GHIDRA_HOME

Eclipse

I have no idea how Eclipse or Java or any of that works so this may be bullshit - but it works!

Just importing the Gradle project doesn't work (unless you have the exact same Ghidra version (and paths?) that were used when developing this).

The easiest way so far (that I found) to import the code in Eclipse for development is:

  1. You need to have Eclipse setup with the GhidraDev plugin.
  2. git clone https://github.com/threatrack/ghidra-patchdiff-correlator; cd ghidra-patchdiff-correlator; mv PatchDiffCorrelator PatchDiffCorrelator.bak
  3. In Eclipse: GhidraDev -> New -> Ghidra Module Project...; Project name: PatchDiffCorrelator, Project root directory: $GITHUB/threatrack/ghidra-patchdiff-correlator, "NEXT"; Deselect all module templates, "FINISH"
  4. Project -> Import; General -> Filesystem; From directory: $GITHUB/threatrack/ghidra-patchdiff-correlator/PatchDiffCorrelator.bak

TODO

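# Jython, run from Ghidra's Python console: build a binary search string for the first range of the current selection
from ghidra.app.plugin.core.instructionsearch import InstructionSearchApi
from ghidra.app.plugin.core.instructionsearch.model import MaskSettings
# MaskSettings arguments are assumed to be (maskOperands, maskScalars, maskAddresses)
InstructionSearchApi().getBinarySearchString(currentProgram,currentSelection.getFirstRange(),MaskSettings(False,False,True))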