Dorothy2

A malware/botnet analysis framework written in Ruby.

For a perfect view of this document (images and links), open it through the project's code repository.

For any issue, use our Redmine

A wiki page for dorothy2 is under construction. Please take a look at it.

##Introduction

Dorothy2 is a framework for analysing suspicious binaries. Its main strengths are a very flexible modular environment and an interactive investigation framework with particular attention to network analysis. Additionally, it can recognise newly spawned processes by comparing them with a previously created baseline. Static binary analysis and improved system behaviour analysis will be introduced in upcoming versions. Dorothy2 analyses binaries using pre-configured analysis profiles. An analysis profile is composed of the following elements:

A certain sandbox OS type
A certain sandbox OS version
A certain sandbox OS language
A fixed analysis timeout (how long to wait before reverting the VM)
The number of screenshots requested (and the delay between them)
A list of the supported extensions, and how the guest OS should execute them

The use of profiles lets the researcher analyse a set of binaries in different environments. It is well known that some malware is configured to run only in specific environments. A CSIRT might use profiles to test suspicious binaries only against an environment that reflects that of its customers. Sources can also be configured to be automatically analysed by certain profiles (e.g. use Profile_Windows_30sc for all the binaries retrieved by Kippo_source).
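
To make the profile elements concrete, here is a minimal Ruby sketch of how a profile and a source-to-profile mapping could be modelled. It is not Dorothy's actual data model: the `AnalysisProfile` struct, its field names, the timeout and screenshot values, and the `profile_for` and `supports?` helpers are all assumptions made for illustration.

```ruby
# Hypothetical model of an analysis profile; field names are illustrative only.
AnalysisProfile = Struct.new(
  :os_type, :os_version, :os_lang,
  :timeout, :screenshots, :screenshot_delay, :extensions
)

# Example profile mirroring the elements listed above (values are made up).
PROFILES = {
  "Profile_Windows_30sc" => AnalysisProfile.new(
    "Windows", "XP", "en-US",
    120, # seconds to wait before reverting the VM
    30,  # number of screenshots requested
    4,   # seconds between screenshots
    { "exe" => "cmd.exe /C", "bat" => "cmd.exe /C" } # how the guest runs each extension
  )
}

# Hypothetical mapping from a binary source to the profile used to analyse it.
SOURCE_PROFILES = { "Kippo_source" => "Profile_Windows_30sc" }

# Return the profile configured for a source, or nil when none is configured.
def profile_for(source)
  PROFILES[SOURCE_PROFILES[source]]
end

# A profile can only run binaries whose extension it knows how to execute.
def supports?(profile, filename)
  ext = File.extname(filename).delete(".").downcase
  profile.extensions.key?(ext)
end
```

With this layout, routing all binaries from one source to one environment is a single lookup, and unsupported file types can be rejected before a sandbox is started.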

Dorothy2 is a continuation of my Bachelor's degree final project (Dorothy: inside the Storm), which I presented in February 2009. More information about the whole project can be found on the Italian Honeyproject website.

The framework is mainly composed of five modules that can also be executed separately. The following picture gives an overview of the current modules and how they are connected to each other.

The Binary Fetcher Module (BFM)

In charge of retrieving the binaries from the configured sources. Currently a “binary source” can be a system folder, a mailbox, or a host reachable over SSH. Once the binaries have been retrieved, the BFM populates the analysis queue.
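
As a rough illustration of the "system folder" case, the following Ruby sketch walks a folder and builds queue entries. It is an assumption-laden sketch, not the BFM's code: the `QueueEntry` struct, the `fetch_from_folder` helper and the hash-based de-duplication are hypothetical, and the real queue is stored in Dorothive rather than in memory.

```ruby
require "digest"
require "find"

# Hypothetical queue entry; Dorothy's real queue lives in Dorothive.
QueueEntry = Struct.new(:path, :sha256, :source)

# Walk a folder source and return one queue entry per regular file,
# skipping files whose SHA-256 has already been seen (an assumed policy).
def fetch_from_folder(folder, source_name, seen = {})
  queue = []
  Find.find(folder) do |path|
    next unless File.file?(path)
    sha = Digest::SHA256.file(path).hexdigest
    next if seen[sha]
    seen[sha] = true
    queue << QueueEntry.new(path, sha, source_name)
  end
  queue
end
```

Keying on the content hash rather than the file name means the same sample dropped twice under different names is only queued once.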

The Dorothy analysis engine

In charge of processing the queue by executing the scheduled binaries in a sandbox, and then storing the generated network traffic and screenshots in the analysis folder. It also populates Dorothive with basic information about the file, and CouchDB with the network pcaps.

The (network) Data Extraction Module (old dparser)

In charge of dissecting the pcap files and storing the most relevant information (flow data, GeoIP info, etc.) in Dorothive. In addition, it extracts all the files downloaded by the sandbox through HTTP/HTTPS and stores them in the binary file's analysis folder.
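
The notion of "flow data" can be sketched in plain Ruby as follows. This is only an illustration of grouping packets into flows, assuming packets have already been dissected into simple records; the `Packet` struct and `extract_flows` helper are hypothetical and do not reflect how doroParser or pcapr actually work.

```ruby
# Hypothetical packet record, as it might come out of a pcap dissector.
Packet = Struct.new(:src, :sport, :dst, :dport, :proto, :bytes)

# Group packets into bidirectional flows keyed by protocol and the two
# endpoints (sorted, so both directions share one key), counting the
# packets and bytes seen in each flow.
def extract_flows(packets)
  flows = Hash.new { |hash, key| hash[key] = { packets: 0, bytes: 0 } }
  packets.each do |pkt|
    endpoints = [[pkt.src, pkt.sport], [pkt.dst, pkt.dport]].sort
    flow = flows[[pkt.proto, *endpoints]]
    flow[:packets] += 1
    flow[:bytes] += pkt.bytes
  end
  flows
end
```

Each resulting key and counter pair corresponds to one row that could then be stored and enriched (for example with GeoIP data for the remote endpoint).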

The (dummy) Webgui

A dummy Sinatra application which gives an interactive overview of all the acquired data. WARNING: this module is intended to be executed in a controlled environment. The author strongly discourages exposing it on the Internet.

The Java Dorothy Drone (Mainly coded by Patrizia Martemucci and Domenico Chiarito, but not part of this gem and not publicly available.)

Our botnet infiltration module; refer to this ppt presentation for an overview.

The first four modules are publicly released under the GPL 3 license as a tribute to the Honeynet Project Alliance. All the information generated by the framework (i.e. binary info, timestamps, dissected network analysis) is stored in a Postgres DB (Dorothive) so that it can be used for further analysis. A NoSQL database (CouchDB) is also used to mass store all the traffic dumps thanks to the pcapr/xtractr technology.

I started to code this project in late 2009 while learning Ruby at the same time. Since then, I've been changing and improving it as my Ruby coding skills improved. Because of that, you may find some parts of the code not-really-tidy :)

##Requirements

WARNING: The current version of Dorothy only uses VMWare ESX5 as its Virtual Sandbox Module (VSM). The free version of ESXi is not supported due to its limitations in using the vSphere 5 API. However, the overall framework could easily be customised to use another virtualization engine. Dorothy2 is very modular, and any customisation or modification is very welcome.

Dorothy needs the following software (not necessarily on the same host) in order to run:

VMWare ESX >= 5.0 (tip: if you download ESXi, you can evaluate ESX for 30 days)
Ruby 1.9.3
Postgres >= 9.0
At least one Windows virtual machine
One unix-like machine dedicated to the Network Analysis Engine (NAM) (tcpdump/ssh needed)
pcapr-local (only used by doroParser)
MaxMind libraries (only used by doroParser)

Regarding the Operating System

Dorothy has been designed to run on any *nix system. So far it has been successfully tested on OSX and Linux.
The virtual machines used as sandboxes are meant to be Windows based (successfully tested on XP).
Only pcapr-local strictly requires Linux. If you want to use a Mac for executing this gem (like I do), install pcapr-local on the NAM (as this guide suggests).

Installation

It is recommended to follow this step-by-step process:

Set your ESX environment

Sample setup

Install the required software
Install Dorothy and libmagic libraries
Start Dorothy, and configure it
Use Dorothy

1. Set your ESX environment

Basic configuration (ssh)

From vSphere:

     Configuration->Security Profile->Services->Properties->SSH->Options->Start and Stop with host->Start->OK

Configure two separate virtual networks, one dedicated exclusively to the SandBoxes (See Sample Setups)
Configure the Windows VMs used for sandboxing

Disable Windows firewall (preferred)
VMWare Tools must be installed in the Windows guest system.
Configure a static IP
After configuring everything on the Guest OS, create a snapshot of the sandbox VM from vSphere console. Dorothy will use it when reverting the VM after a binary execution.

From vSphere, create a unix VM dedicated to the NAM

Install tcpdump and sudo
```
  #apt-get install tcpdump sudo
```
Create a dedicated user for dorothy (e.g. "dorothy")
```
  #useradd dorothy
```
Create a directory inside the dorothy user's home to store the network dumps
```
  #su dorothy
  $mkdir /home/dorothy/pcaps
```

Grant the dorothy user permission to execute/kill tcpdump by editing the sudoers file:

  #visudo
  add the following line:
  dorothy  ALL = NOPASSWD: /usr/sbin/tcpdump, /bin/kill, /usr/bin/killall
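
The sudoers entry lets the dorothy user start and stop tcpdump without a password prompt. As a hedged sketch of the kind of command lines this permits, the Ruby helpers below build a capture command and a matching kill command. The helper names, the interface name and the per-sandbox `host` filter are assumptions for illustration, not Dorothy's own code; `-i`, `-w` and the `host` filter are standard tcpdump options.

```ruby
require "shellwords"

# Build the command that starts a capture on the NAM, writing to pcap_path
# and keeping only traffic to or from the given sandbox IP (assumed policy).
def start_capture_cmd(iface, sandbox_ip, pcap_path)
  ["sudo", "tcpdump", "-i", iface, "-w", pcap_path, "host", sandbox_ip].shelljoin
end

# Build the command that stops a capture, given the tcpdump process id.
def stop_capture_cmd(pid)
  ["sudo", "kill", Integer(pid).to_s].shelljoin
end

puts start_capture_cmd("eth0", "10.0.0.5", "/home/dorothy/pcaps/sample.pcap")
```

Building the command from an argument array and joining it with `shelljoin` keeps file names containing spaces or shell metacharacters from being misinterpreted when the string is sent over SSH.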

If you want to install pcapr on this machine (required if you want to use Dorothy from a Mac OS X machine), also install these packages (refer to this blog post for a detailed howto). However, if you are installing Dorothy on a Linux machine, I recommend installing pcapr on the same machine where the Dorothy gem is installed.
```
  #apt-get install ruby1.9.3 rubygems  tshark zip couchdb
```
Start the couchdb server
```
  #/etc/init.d/couchdb start
```
Install pcapr-local
```
  #gem install pcapr-local
```
Start pcapr-local using dorothy's system account and configure it. When prompted, enter the path of the folder used to store the network dumps
```
  $startpcapr
  ....
  Which directory would you like to scan for indexable pcaps? [/root/pcapr.Local/pcaps]
  /home/dorothy/pcaps
```

In addition, remember to allow pcapr to run on all the interfaces

    What IP address should pcapr.Local run on? Use 0.0.0.0 to listen on all interfaces [127.0.0.1]
    0.0.0.0

If everything went fine, you should be able to browse to
```
  http://{ip-used-by-NAM}:8000
```
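
The same check can be scripted. The following Ruby sketch, using only the standard library, builds the pcapr-local URL and reports whether anything answers on it. The helper names `pcapr_uri` and `pcapr_up?` are hypothetical, and the IP address in the usage line is a placeholder for your NAM address.

```ruby
require "net/http"
require "uri"

# Build the pcapr-local URL for a NAM host; 8000 is the port shown above.
def pcapr_uri(nam_host, port = 8000)
  URI::HTTP.build(host: nam_host, port: port, path: "/")
end

# Return true when the web interface answers with any HTTP response,
# false when the connection is refused, unresolved, or times out.
def pcapr_up?(nam_host, port = 8000, timeout = 3)
  uri = pcapr_uri(nam_host, port)
  Net::HTTP.start(uri.host, uri.port, open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(uri.path)
  end
  true
rescue SystemCallError, SocketError, Timeout::Error
  false
end

puts pcapr_uri("192.168.1.10") # placeholder address
```

A `false` result from `pcapr_up?` usually means pcapr-local is either not running or still bound to 127.0.0.1 only.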

5. From vSphere, configure the NIC on the virtual machine that will be used for network sniffing (NAM). The vSwitch where the vNIC resides must allow promiscuous mode. To enable it from vSphere:

   >Configuration->Networking->Properties on the virtualSwitch used for the analysis->Double click on the virtual network used for the analysis->Security->Tick "Promiscuous Mode", then select "Accept" from the list menu.

WARNING: If you are virtualizing ESX from a Linux host machine, remember to give the right privileges to the network interface used by VM Player / Workstation in order to allow promiscuous mode:

   > chmod a+rw /dev/vmnet0

* Sample Setups

1. Basic setup

In the following example, the Dorothy gem is installed on the same host where Dorothive (the DB) resides. This setup is strongly recommended.

>![dorothy.basicsetup](http://www.honeynet.it/wp-content/uploads/Dorothy-Basic.jpg)

2. Advanced setup

This setup is recommended if Dorothy is going to be installed in a corporate environment. By leveraging a private VPN, all the sandbox traffic exits the corporate network with an external IP address.

2. Install the required software

Install postgres
```
 $sudo apt-get install postgresql-9.1
```

    http://www.postgresql.org/download/

2. Configure a dedicated postgres user for Dorothy (or use the default postgres user instead, up to you :)

Note: If you want to use Postgres "as is" and configure Dorothy to use the default "postgres" user, at least set a password for this user (by default it has none).

Install the following packages

 $sudo apt-get install ruby1.9.3 rubygems postgresql-server-dev-9.1 libxml2-dev  libxslt1-dev libmagic-dev

For OSX users: all the above software is available through MacPorts. A tip for libmagic: use brew instead:

    $ brew install libmagic

    $ brew link libmagic

In case you want to install pcapr here, do this as well:

    $sudo apt-get install tshark zip couchdb

3. Install Dorothy gem

*Install Dorothy gem

    $ sudo gem install dorothy2

4. Start Dorothy, and configure it!

Install MaxMind libraries
- GeoLiteCity
- GeoLite ASN
- Copy GeoLiteCity.dat and GeoIPASNum.dat into Dorothy's etc/geo/ folder
Start Dorothy
```
 $ dorothy_start -v
```

The following message should appear

    [WARNING] It seems that the Dorothy configuration file is not present,
    please answer to the following question in order to create it now.

2. Follow the instructions to configure:
    * The environment variables (db, esx server, etc)
    * The Dorothy sources (where to get new binaries)
    * The ESX Virtual machines used for the analysis

The first time you execute Dorothy, it will ask you for this information in order to create the required configuration files in the etc/ folder. However, you are free to modify/create such files directly; example configuration files can be found there too. Finally, check out the file extensions.yml within the etc/ folder: it tells Dorothy's sandboxes how to process the binaries to analyse.
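
To give an idea of what an extension-to-command mapping can look like, here is a small Ruby sketch that loads such a mapping from YAML and builds the guest command line for a binary. The YAML keys (`prog_path`, `prog_args`), the example values and the `guest_command` helper are assumptions for illustration; check the example files shipped in etc/ for the real schema of extensions.yml.

```ruby
require "yaml"

# Hypothetical excerpt in the spirit of extensions.yml; the key names are
# assumptions, not the file's documented schema.
EXAMPLE_EXTENSIONS = <<~'YAML'
  exe:
    prog_path: 'C:\windows\system32\cmd.exe'
    prog_args: '/C'
  bat:
    prog_path: 'C:\windows\system32\cmd.exe'
    prog_args: '/C'
YAML

# Return the guest command line for a binary, or nil when its extension
# has no rule in the mapping.
def guest_command(extensions, filename)
  ext = File.extname(filename).delete(".").downcase
  rule = extensions[ext]
  return nil unless rule
  [rule["prog_path"], rule["prog_args"], filename].compact.join(" ")
end

extensions = YAML.safe_load(EXAMPLE_EXTENSIONS)
puts guest_command(extensions, "sample.exe")
```

Keeping this mapping in a YAML file means support for a new file type can be added without touching the analysis code.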

###5. Use Dorothy

Copy a .exe or .bat file into $yourdorothyhome/opt/bins/manual/
Execute dorothy with the malwarefolder source type (if you left the default one)

$ dorothy_start -v -s malwarefolder

Usage

Dorothy usage:

Usage:
dorothy2 [options]
where [options] are:
            --Version, -V:   Print the current version.
            --verbose, -v:   Enable verbose mode
           --infoflow, -i:   Print the analysis flow
       --baseline, -b <s>:   Create a new process baseline
         --source, -s <s>:   Choose a source (from the ones defined in etc/sources.yml)
       --CreateSource, -C:   Create new source file
         --daemon, -d <s>:   (start|stop) Execute/kill the selected module (-W, -B, -A) in background. If no modules are specified, it will exec/kill all of them.
              --debug, -e:   Add extensive log trails
             --manual, -m:   Start everything, copy the file, and wait for me.
      --SandboxUpdate, -S:   Update Dorothive with the new Sandbox file
  --DorothiveInit, -D <s>:   (RE)Install the Dorothy Database (Dorothive)
              --queue, -q:   Show the analysis queue
           --Analyser, -A:   Execute only the Analyser Module (will analyse only the current queue)
                --BFM, -B:   Execute only the Binary Fetcher Module (BFM)
                --DEM, -E:   Execute only the network Data Extraction Module (DEM) aka doroParser
             --WebGUI, -W:   Execute the WebGUI Module (WGUI)
               --help, -h:   Show this message
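
As an aside on the `--baseline` option: the idea of spotting newly spawned processes by comparing a process list against a baseline can be sketched in a few lines of Ruby. This only illustrates the comparison, assuming processes are identified by name; the `new_processes` helper is hypothetical and is not Dorothy's implementation.

```ruby
require "set"

# Return the processes seen after execution that are absent from the
# baseline taken on the clean sandbox, without duplicates.
def new_processes(baseline, current)
  known = baseline.to_set
  current.reject { |name| known.include?(name) }.uniq
end

baseline = %w[System smss.exe explorer.exe]
after_run = %w[System smss.exe explorer.exe dropper.exe]
puts new_processes(baseline, after_run).inspect
```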

Example

$dorothy2 -v -d start
This will execute all the modules in background

The first time dorothy2 is executed, it guides the user through configuring the analysis environment. More specifically, the user goes through the following configuration steps:

Configuring the general env variables ($home/.dorothy.yml)
Configuring the BFM sources ($dorothyhome/etc/sources.yml)
Configuring the sandboxes ($dorothyhome/etc/sandboxes.yml)
Configuring the analysis profiles (auto-filled) ($dorothyhome/etc/profiles.yml)

Once the configuration steps are complete, the user can edit the configuration files at any time.
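
When troubleshooting a setup, it can help to see which of the four files listed above already exist. The Ruby sketch below does that using the paths given in the list; the helper names and the way the two base directories are passed in are assumptions for illustration.

```ruby
# Paths of the four configuration files listed above, given the user's
# home directory and the Dorothy home directory.
def config_files(home, dorothy_home)
  [
    File.join(home, ".dorothy.yml"),
    File.join(dorothy_home, "etc", "sources.yml"),
    File.join(dorothy_home, "etc", "sandboxes.yml"),
    File.join(dorothy_home, "etc", "profiles.yml")
  ]
end

# Return the configuration files that do not exist yet.
def missing_config(home, dorothy_home)
  config_files(home, dorothy_home).reject { |path| File.file?(path) }
end
```

For example, `missing_config(Dir.home, "/opt/dorothy")` would list the files still to be created, where `/opt/dorothy` is a placeholder for your own Dorothy home.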

###6. Debugging problems

I do recognise that setting up Dorothy is not the easiest task in the world. Since the whole framework is a combination of several third-party components, it is quite likely that one of them will fail during the process. Below are some tips on how to find the root cause of a crash.

Set the verbose flag (-v) while executing dorothy, or the --debug flag for additional debugging trails.

$dorothy_start -v --debug -s malwarefolder

If any error occurs, go to our Redmine and raise a bug ticket!
Write to dorothy2 at googlegroups.com

Acknowledgements

Thanks to all the people who have contributed to getting the Dorothy2 project up and running:

Marco C. (Research)
Davide C. (Dorothive)
Federico S. - Calogero L. (Infrastructure)
Domenico C. - Patrizia P. (Dorothive/JDrone)
All the graduating students from UniMI who have contributed.
Sabrina P. (our students "headhunter" :)

Contributing

Fork it
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request

Every contribution is more than welcome! For any help, please don't hesitate to contact us at info at honeynet.it or through our ML: dorothy2 at googlegroups.com

License

Dorothy is copyrighted by Marco Riccardi and is licensed under the GNU General Public License version 3, which follows.

                GNU GENERAL PUBLIC LICENSE
                   Version 3, 29 June 2007