



symbexcel is a symbolic deobfuscator for XL4 macros, currently developed by Nicola Ruaro and Fabio Pagani.

Among a number of other things, symbexcel:

This tool draws some concepts from angr, and is based on the excellent XLMMacroDeobfuscator by DissectMalware. Big kudos to him!

Quick Start

  1. Download symbexcel:
    git clone https://github.com/ucsb-seclab/symbexcel && cd symbexcel
  2. Create a virtual environment (recommended but not required):
    mkvirtualenv symbexcel
    workon symbexcel
  3. Install symbexcel and its dependencies:
    pip install -e .
  4. Start the analysis of a malicious XL4 sample:
    python run.py --file /path/to/malicious/excel.xls --iocs


$ python run.py -h
    usage: run.py [-h] -f FILE [-d] [--iocs] [--breakpoints BREAKPOINTS [BREAKPOINTS ...]] [--checkpoint CHECKPOINT] [--restore RESTORE] [-i] [--cfg] [-t TIMEOUT] [--com] [--nocache]

    Required arguments:
      -f FILE, --file FILE  Path of the malicious sample

    Optional arguments:
      -d, --debug           Enable debug output
      --iocs                Print Indicators of Compromise (IOCs)
      --breakpoints BREAKPOINTS [BREAKPOINTS ...]
                            Set a breakpoint at a specific instruction count
      --checkpoint CHECKPOINT
                            Create a checkpoint at a specific instruction count
      --restore RESTORE     Restore a checkpoint
      -i, --interactive     Drop an IPython shell after the execution
      --cfg                 Save the CFG to /tmp/<sample name>.dot
      -t TIMEOUT, --timeout TIMEOUT
                            Timeout value

    COM specific arguments:
      --com                 Use COM server to process a sample
      --nocache             Force the COM server to process the sample
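
To run the analysis over a folder of samples, the command line can be assembled programmatically. The following is a minimal sketch that uses only the flags listed in the help output above; the helper names build_command and build_batch are hypothetical and not part of symbexcel, and the unit of the timeout value is not stated in the help text, so it is passed through unchanged.

```python
from pathlib import Path
from typing import List, Optional


def build_command(sample: Path, timeout: Optional[int] = None,
                  iocs: bool = True) -> List[str]:
    """Build an argv list for run.py (hypothetical helper).

    Only --file, --iocs and --timeout from the help output are used.
    """
    cmd = ["python", "run.py", "--file", str(sample)]
    if iocs:
        cmd.append("--iocs")
    if timeout is not None:
        # The help text does not specify the unit of the timeout value.
        cmd += ["--timeout", str(timeout)]
    return cmd


def build_batch(sample_dir: Path, timeout: Optional[int] = None) -> List[List[str]]:
    """Return one command per .xls file in sample_dir, in sorted order."""
    return [build_command(p, timeout) for p in sorted(sample_dir.glob("*.xls"))]
```

Each returned list can be passed to subprocess.run() from the repository root, one sample at a time.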

symbexcel COM Server

symbexcel can use either xlrd2 or Office VBA to parse and extract the content of Excel 4 macrosheets. The VBA APIs are exposed through a COM server, which Python code can interact with using the pywin32 package.

You can find all the information on how to set up the symbexcel COM server in the symbexcel-server repository. Once the server is up and running:

  1. Set the HOST variable in symbexcel/excel_wrapper/com_config.env to the server IP address.

  2. Add the option --com to the command line of symbexcel.

symbexcel as a Library

You can also use this project as a Python library (import symbexcel) in your own projects. The tests folder contains some good examples. Using this project as a library allows your code to single-step (or n-step) the simulation manager, pass a search function through the find argument of SimulationManager.run(), and so on.

from symbexcel import SimulationManager
from symbexcel.excel_wrapper import parse_excel_doc

excel_doc = parse_excel_doc('tests/bins/test_symbolic.xls')

simgr = SimulationManager(excel_doc)

simgr.run(find=lambda s: '=ALERT' in s.formula)
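
The lambda passed to find above can be generalized into a reusable predicate. The sketch below is illustrative only: formula_contains_any is a hypothetical helper, not part of symbexcel, and it assumes (as the lambda above suggests) that each state exposes a formula attribute that can be treated as a string. Case-insensitive matching is a choice made for this example.

```python
from types import SimpleNamespace
from typing import Callable, Iterable


def formula_contains_any(needles: Iterable[str]) -> Callable[[object], bool]:
    """Return a predicate that is True when a state's formula contains any needle.

    Hypothetical helper; assumes the state has a string-like `formula` attribute.
    """
    upper_needles = [n.upper() for n in needles]

    def predicate(state) -> bool:
        # Fall back to an empty string if the attribute is missing.
        formula = str(getattr(state, "formula", "")).upper()
        return any(n in formula for n in upper_needles)

    return predicate


find_interesting = formula_contains_any(["=ALERT", "=EXEC", "=CALL"])

# Stand-in object used only to demonstrate the predicate without symbexcel.
fake_state = SimpleNamespace(formula='=ALERT("hello")')
print(find_interesting(fake_state))
```

Under the same assumption, the predicate could be passed as simgr.run(find=find_interesting).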



You can use the Dockerfile and docker-compose.yml from this repo to create a Docker container and run the create_clusters script on a set of malware samples. The folders specified in the input and output environment variables will be mounted as /input and /output in the container. You can pass any arguments for the create_clusters script in the args environment variable.

input=/data/xl4_dataset/ output=/data/symbexcel/docker_clustering args="--input /input --output /output --jobs 96 --timeout 1200 --debug --logfile" docker-compose up &> /data/symbexcel/docker_clustering_log &
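
The same three environment variables can also be prepared from Python before invoking docker-compose. This is only a sketch: clustering_env is a hypothetical helper, and it assumes the args value is consumed as a single shell-style string, as in the command above.

```python
import os
import shlex
from typing import Dict, List


def clustering_env(input_dir: str, output_dir: str,
                   script_args: List[str]) -> Dict[str, str]:
    """Return a copy of the current environment with input, output and args set.

    Hypothetical helper; the variable names mirror the command line above.
    """
    env = dict(os.environ)
    env["input"] = input_dir    # mounted as /input in the container
    env["output"] = output_dir  # mounted as /output in the container
    # Join the script arguments into one shell-quoted string.
    env["args"] = shlex.join(script_args)
    return env


env = clustering_env(
    "/data/xl4_dataset/",
    "/data/symbexcel/docker_clustering",
    ["--input", "/input", "--output", "/output", "--jobs", "96", "--timeout", "1200"],
)
```

The resulting mapping can then be handed to subprocess.run(["docker-compose", "up"], env=env).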


After installing symbexcel, you can run all tests with cd tests && pytest. Alternatively, you can manually execute any single test, e.g. cd tests && python test_file_formats.py. Creating new tests should be straightforward if you follow the existing test routines.


There's a repository from Lastline at https://github.com/Lastline-Inc/xl4samples with some public malicious samples. Download and run them at your own risk!