Home

Awesome

EVM CFG

A fast and accurate CFG generator for EVM bytecode using symbolic stack analysis. View extra graph visuals below.

<p align="center"> <img width="60%" src="examples/basic_code.svg"> <img width="25%" src="examples/basic.svg"> </p>

Installation

First, make sure rust is installed:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Then:

git clone https://github.com/plotchy/evm-cfg
cd evm-cfg/
cargo install --path . --locked
evm-cfg <PATH_OR_BYTECODE> -o cfg.dot --open

For working with .dot files, I recommend using the below VSCode extension as it allows powerful graph searching and filtering:

Name: Graphviz Interactive Preview
Id: tintinweb.graphviz-interactive-preview
Description: Graphviz (dot) Interactive Preview
Version: 0.3.5
Publisher: tintinweb
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=tintinweb.graphviz-interactive-preview

For minimal dependencies, you can convert the .dot file to a .svg picture for viewing:

evm-cfg ./examples/weth9.evm -o ./examples/weth9.svg --open

# which is equivalent to:
evm-cfg ./examples/weth9.evm -o ./examples/cfg_weth.dot;
dot -Tsvg ./examples/cfg_weth.dot -o examples/weth9.svg;
open examples/weth9.svg;

To generate bytecode from a given solidity file, you can use the following:

solc <path_to_file> --bin-runtime --no-cbor-metadata --optimize --via-ir

How It Works In Phases:

Disassembly

Using static analysis, we can split the bytecode up into a structure representable by a graph. The following is done before executing any opcodes through a vm:

Form Basic Edges

Jumps and Jumpis with direct push values are connected to their jumpdests. False sides of Jumpis are connected to the next instruction.

Prune the CFG

Any nodes without incoming edges that also don't begin with a JUMPDEST are impossible to enter, and are removed. This is done to reduce clutter

Symbolic Stack & Traversing

Essentially, the only goal left is to connect more edges when jumps are "indirect". Depending on the path the traverser used when coming into the instruction block, it may jump to separate locations. (ie: one traverser may jump to a push it carried over from block A, while another traverser may jump to a push it carried over from block B)

Finding these indirect jumps is difficult. Normally, DFS traversers can prevent revisiting previously seen nodes by using a set, but in our case, revisiting a node may be required to carry over a separate push value that jumps to a new location. When we consider allowing revisiting nodes loops immediately become a problem.

The following methods are used to prevent loops from going infinite:

This method is definitely not bulletproof, and there are cases where it will fail to quit out of loops. For this implementation, I only track stack values to 128 depth (max is 1024), and only up to 10 values at once. Both of these metrics are easily adjusted, but I have not ran into any cases where it was necessary to increase them.

Example Outputs

First, some timings. This tool is fast!

ContractTime to Generate CFG
April CTF0.620 ms
WETH90.778 ms
Fun Reversing Challenge1.64 ms
UniswapV2Router10.85 ms

Here are some examples of the graph visualizations:

WETH9 on mainnet:

<img width="100%" src="examples/weth9.svg">

Fun Reversing Challenge from paradigmctf2022:

<img width="100%" src="examples/fun_reversing_challenge.svg">

April by @brockelmore:

<img width="100%" src="examples/april.svg">

Contributing

Before opening pull requests consider formatting and linting your code according to:

cargo fmt -- --check
cargo +nightly clippy --all --all-features -- -D warnings