YARA for IDA

An unofficial YARA plugin for IDA Pro, along with an unparalleled crypto/hash/compression rule set based on
Luigi Auriemma's signsrch signatures.
It also serves as a general upgraded replacement for my deprecated IDA Signsrch plugin.

Installation

Copy yara4ida.dll, yara4ida64.dll and the yara4ida_rules folder to your IDA plugins directory.

The default IDA hotkey is "Ctrl-Y", but it can be changed to another via your IDA "plugins.cfg" config file.

Requires IDA Pro version 8'ish.

Using

Invoke the plugin via the hotkey or from the IDA Edit/Plugins menu -> "Yara4Ida".

[Screenshot: plugin dialog]

Options

1) Place comments: Automatically place match comments.
Example of placed "#YARA" comments:
[Screenshot: comments example]

2) Single threaded: Force single-threaded scanning. Otherwise, scanning runs in parallel with one thread per CPU core.
3) Verbose messages: Show additional operational and development messages in IDA's output window.

Buttons

[LOAD ALT RULES]: Click to load a rules file other than the default ("signsrch_le.yar", the little-endian signsrch-based rule set).

[CONTINUE]: Press to start scanning.

After the scanning has completed, the rule matches are displayed in an IDA chooser window.
Example results list:
[Screenshot: scan results]

Columns

Address: Virtual address where the rule match is located.
Description: The rule "description" field if the rule has one.
Tags: The rule's tag(s), if it has any.
File: The file where the rule was loaded from.

Motivation

Last year a user reported a problem with my IDA signsrch plugin (now deprecated), which set me off on a new path of research. Plus, being interested in all things binary signatures, pattern matching, etc., I had been meaning to play with YARA for a while.
At first I just planned to fix and upgrade my old Signsrch plugin, but then some ongoing design considerations led me to search for other possible signature/rule sets and/or other search algorithms.
YARA seemed to fit the bill for a few reasons:

Building

Built using Visual Studio 2022 on Windows 10.
Dependencies:

As set up in the project file, the build looks for an environment variable _IDADIR, under which it expects to find the "idasdk/include" and "idasdk/lib" folders where the IDA SDK is located.
It does not use IDADIR, since IDA looks for that variable itself and it can cause a conflict if you try to use more than one installed IDA version.

Design Notes

There are some existing IDA Python projects using yara-python, like findcrypt-yara and findyara-ida, and since the yara-python module is binary they are pretty quick. Because of this I almost stopped there, since it looked like one of these solutions would fit the bill. But since I wanted to see if I could push the performance envelope further, had to dig into libyara for additional display data anyhow, and needed to add a custom module, I went the full binary C/C++ route. With C/C++, a single thread only yields a small performance gain. But after I added parallel scanning (using the Windows thread pool API), I got speed gains of around 30% with complex rules. Currently, since the default Yara4Ida signsrch-based rule set is all binary signature types, this parallelism only squeezes out about an extra 10%, since YARA's efficient Aho-Corasick algorithm pretty much saturates system memory bandwidth with just a single thread already. For more complex rules (with multiple rule parts, using regex, etc.), the extra core compute comes into play.
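To illustrate the general idea of splitting a scan across workers, here is a minimal Python sketch. It is not the plugin's code: the plugin uses libyara and the Windows thread pool API from C/C++, while this stand-in uses a plain byte search and Python's thread pool. The names scan_chunk and parallel_scan are hypothetical, and patterns are assumed to be non-empty. Each worker owns the match start offsets in its own range and reads a little past the end of it, so a match that straddles a chunk boundary is still found exactly once.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(data, patterns, start, end):
    """Return (offset, name) for every pattern match that STARTS in [start, end)."""
    hits = []
    for name, pat in patterns.items():
        # Read up to len(pat) - 1 bytes past the chunk so that a match
        # beginning near the end of this chunk is not cut off.
        limit = min(len(data), end + len(pat) - 1)
        pos = data.find(pat, start, limit)
        while pos != -1:
            hits.append((pos, name))
            pos = data.find(pat, pos + 1, limit)
    return hits


def parallel_scan(data, patterns, workers=None):
    """Split data into one range per worker and scan the ranges concurrently."""
    workers = workers or os.cpu_count() or 1
    chunk = max(1, -(-len(data) // workers))  # ceiling division
    ranges = [(s, min(s + chunk, len(data))) for s in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda r: scan_chunk(data, patterns, *r), ranges))
    return sorted(hit for part in parts for hit in part)
```

Note that pure Python threads share the interpreter lock, so this sketch shows only the partitioning scheme and will not reproduce the speed gains described above.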

In switching to YARA I first planned on using existing open source rules, like the crypto one mentioned above and some of the others from Yara-Rules.
At first, it looked like the crypto rule set had good coverage and fitness, but on further examination I found that it gave too many false positives and not nearly as many matches as the signsrch signatures, and that a lot of the rules are complex (probably unnecessarily), using regex, etc., which makes the libyara scanning much slower.
I ended up circling back to the awesome signsrch again, making a tool to convert the signatures over to YARA rules. While doing this, I filtered out many signatures, including most of the audio, video codec, and game-specific ones, lowering the total signature count to about a thousand.
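As a rough illustration of what such a conversion involves, the sketch below turns one raw byte signature into the text of a simple YARA rule with a single hex string. It is only a sketch under assumptions: the actual converter tool and the signsrch input format are not shown here, and the function name signature_to_yara, the default tag, and the string name $sig are hypothetical.

```python
import re


def signature_to_yara(title, data, tag="crypto"):
    """Build the text of a YARA rule that matches the raw bytes in data."""
    # YARA rule identifiers may only contain letters, digits and underscores,
    # and must not start with a digit.
    ident = re.sub(r"[^A-Za-z0-9_]", "_", title)
    if not ident or ident[0].isdigit():
        ident = "_" + ident
    # Escape backslashes and quotes for the meta string.
    desc = title.replace("\\", "\\\\").replace('"', '\\"')
    hex_bytes = " ".join(f"{b:02X}" for b in data)
    return (
        f"rule {ident} : {tag}\n"
        "{\n"
        "    meta:\n"
        f'        description = "{desc}"\n'
        "    strings:\n"
        f"        $sig = {{ {hex_bytes} }}\n"
        "    condition:\n"
        "        $sig\n"
        "}\n"
    )
```

For example, signature_to_yara("MD5 init (LE)", bytes.fromhex("0123456789abcdef")) yields a rule named MD5_init__LE_ whose hex string holds the first two MD5 initialization constants in little-endian byte order.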

To handle the signsrch "AND" signature type, I created a custom YARA module named "area", since the needed scan behavior couldn't be constructed from YARA rules alone. For this type of search, it's a match if a series of either 32-bit or 64-bit values are all found within the same memory range (the range is computed algorithmically, but is around plus or minus 3000 bytes); perfect for matching certain types of signature patterns.
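The matching idea can be sketched in a few lines of Python. This is not the "area" module's implementation: the real range is computed algorithmically, while here a fixed window parameter (default 3000 bytes) stands in for it. The function name area_match and the little-endian packing are likewise assumptions for illustration. The sketch collects every occurrence of every value, then slides a window over the sorted hits until one window contains all of the values.

```python
import struct
from collections import Counter


def area_match(data, values, bits=32, window=3000):
    """Return the lowest hit offset of the first window of at most `window`
    bytes that contains every value, or None if there is no such window."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    fmt = "<I" if bits == 32 else "<Q"  # little-endian, an assumption

    # Collect (offset, value_index) for every occurrence of every value.
    hits = []
    for idx, value in enumerate(values):
        needle = struct.pack(fmt, value)
        pos = data.find(needle)
        while pos != -1:
            hits.append((pos, idx))
            pos = data.find(needle, pos + 1)
    hits.sort()

    # Slide a window over the sorted hits, tracking which values it covers.
    seen = Counter()
    left = 0
    for pos, idx in hits:
        seen[idx] += 1
        while pos - hits[left][0] > window:
            old = hits[left][1]
            seen[old] -= 1
            if seen[old] == 0:
                del seen[old]
            left += 1
        if len(seen) == len(values):
            return hits[left][0]
    return None
```

Unlike a plain hex string, this does not care about the order of the values or the exact gaps between them, which suits constants that a compiler may scatter through a function or table.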

Performance-wise, I found simple binary type signatures to be the best. The Yara4Ida binary signature set (using 8x 5 GHz cores) scans the default ~1000 rules in a large IDA DB in about 1.6 seconds, while it takes 22.5 seconds to scan just the 116 complex "Yara-Rules" crypto ones (14x faster even at an almost 9:1 count ratio!).
See YARA Performance Guidelines for some YARA rule performance tips.
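The ratios quoted above can be checked with a few lines of arithmetic. The inputs are the figures from the text; the rule count of 1000 is approximate, so the results are rough. The per-rule comparison at the end is derived here and is not a separately measured number.

```python
# Figures quoted in the text (the ~1000 rule count is approximate).
binary_rules, binary_seconds = 1000, 1.6
complex_rules, complex_seconds = 116, 22.5

speedup = complex_seconds / binary_seconds   # about 14x overall
count_ratio = binary_rules / complex_rules   # about 8.6 : 1

# Average scan time per rule, in milliseconds.
binary_ms = binary_seconds / binary_rules * 1000
complex_ms = complex_seconds / complex_rules * 1000
per_rule_ratio = complex_ms / binary_ms      # roughly 120x per rule

print(f"{speedup:.1f}x overall, {count_ratio:.1f}:1 rules, "
      f"{per_rule_ratio:.0f}x per rule")
```

In other words, on a per-rule basis the gap between the simple binary signatures and the complex rules is much larger than the overall 14x suggests.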

Finally, I removed the default "pe", "elf", and most of the other default libyara modules since, as they are, they are unusable from an IDA DB space. Maybe with some work and modification of the modules, it would be possible to make the currently loaded IDA DB emulate at least some of the executable format header types.

Credits

Luigi Auriemma for his unparalleled DB of signatures from his signsrch tool.
Victor M. Alvarez and contributors, for the world-class YARA: The pattern matching swiss knife.
Hex-Rays for IDA Pro, the state-of-the-art binary code analysis tool.

Licenses

Plugin code released under MIT ©2022 By Kevin Weatherman.
Signsrch signature set: ©2013 By Luigi Auriemma, under GPL 2.0 license.
Libyara: ©2007-2016, The YARA Authors, under BSD 3-Clause license.
(See "LICENSE.txt" for more details)