JQF + Zest: Semantic Fuzzing for Java
JQF is a feedback-directed fuzz testing platform for Java (think: AFL/LibFuzzer but for JVM bytecode). JQF uses the abstraction of property-based testing, which makes it convenient to write fuzz drivers as parametric JUnit test methods. JQF is built on top of junit-quickcheck and enables running junit-quickcheck style parameterized unit tests with the power of coverage-guided fuzzing algorithms such as Zest.
Zest is an algorithm that biases coverage-guided fuzzing towards producing semantically valid inputs; that is, inputs that satisfy structural and semantic properties while maximizing code coverage. Zest's goal is to find deep semantic bugs that cannot be found by conventional fuzzing tools, which mostly stress error-handling logic only. By default, JQF runs Zest via the simple command: mvn jqf:fuzz.
JQF is a modular framework, supporting the following pluggable fuzzing front-ends called guidances:
- Binary fuzzing with AFL (tutorial)
- Semantic fuzzing with Zest [ISSTA'19 paper] (tutorial 1) (tutorial 2)
- Complexity fuzzing with PerfFuzz [ISSTA'18 paper]
- Reinforcement learning with RLCheck (based on a fork of JQF) [ICSE'20 paper]
- Mutation-analysis-guided fuzzing with Mu2 [ISSTA'23 paper]
JQF has been successful in discovering a number of bugs in widely used open-source software such as OpenJDK, Apache Maven and the Google Closure Compiler.
Zest Research Paper
To reference Zest in your research, we request you to cite our ISSTA'19 paper:
Rohan Padhye, Caroline Lemieux, Koushik Sen, Mike Papadakis, and Yves Le Traon. 2019. Semantic Fuzzing with Zest. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA '19), July 15-19, 2019, Beijing, China. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3293882.3330576
JQF Tool Paper
If you are using the JQF framework to build new fuzzers, we request you to cite our ISSTA'19 tool paper as follows:
Rohan Padhye, Caroline Lemieux, and Koushik Sen. 2019. JQF: Coverage-Guided Property-Based Testing in Java. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA '19), July 15-19, 2019, Beijing, China. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3293882.3339002
Overview
What is structure-aware fuzzing?
Binary fuzzing tools like AFL and libFuzzer treat the input as a sequence of bytes. If the test program expects highly structured inputs, such as XML documents or JavaScript programs, then mutating byte-arrays often results in syntactically invalid inputs; the core of the test program remains untested.
Structure-aware fuzzing tools leverage domain-specific knowledge of the input format to produce inputs that are syntactically valid by construction. There are some nice articles on structure-aware fuzzing of C++ and Rust programs using libFuzzer.
What is generator-based fuzzing (QuickCheck)?
Structure-aware fuzzing tools need a way to understand the input structure. Some other tools use declarative specifications of the input format such as context-free grammars or protocol buffers. JQF uses QuickCheck's imperative approach for specifying the space of inputs: arbitrary generator programs whose job is to generate a single random input.
A Generator<T> provides a method for producing random instances of type T. For example, a generator for type Calendar returns randomly-generated Calendar objects. One can easily write generators for more complex types, such as XML documents, JavaScript programs, JVM class files, SQL queries, HTTP requests, and many more -- this is generator-based fuzzing. However, simply sampling random inputs of type T is not usually very effective, since the generator does not know if the inputs that it produces are any good.
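As a rough illustration of the generator idea, the sketch below produces random but always well-formed Calendar objects. To stay runnable with the JDK alone, it uses a hypothetical stand-in interface named Gen<T> and java.util.Random; these are assumptions made for this example and are not junit-quickcheck's actual Generator class or its source of randomness.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Random;

/** Hypothetical stand-in for a QuickCheck-style generator (not the junit-quickcheck API). */
interface Gen<T> {
    /** Produce one random instance of T, drawing all choices from the given source. */
    T generate(Random random);
}

/** Generates calendar dates that are valid by construction. */
class CalendarGen implements Gen<Calendar> {
    @Override
    public Calendar generate(Random random) {
        int year = 1900 + random.nextInt(200); // 1900..2099, an arbitrary range for this sketch
        int month = random.nextInt(12);        // Calendar months are 0-based: 0..11
        Calendar cal = new GregorianCalendar(year, month, 1);
        // Pick a day that actually exists in the chosen month (handles leap years too)
        int maxDay = cal.getActualMaximum(Calendar.DAY_OF_MONTH);
        cal.set(Calendar.DAY_OF_MONTH, 1 + random.nextInt(maxDay));
        return cal;
    }
}

class GeneratorDemo {
    public static void main(String[] args) {
        Calendar c = new CalendarGen().generate(new Random());
        System.out.println(c.getTime());
    }
}
```

Because every choice is drawn from the random source, each output is a legal date, but nothing steers the generator towards dates that exercise interesting code paths.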
What is semantic fuzzing (Zest)?
JQF supports the Zest algorithm, which uses code-coverage and input-validity feedback to bias a QuickCheck-style generator towards generating structured inputs that can reveal deep semantic bugs. JQF extracts code coverage using bytecode instrumentation, and input validity using JUnit's Assume API. An input is valid if no assumptions are violated.
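The feedback idea can be sketched with a toy loop. This is a heavily simplified illustration and not JQF's or Zest's implementation: the class ToyFeedbackFuzzer, its methods, the hand-written branch IDs, and the validity rule are all hypothetical, and coverage is recorded manually rather than through bytecode instrumentation. The loop mutates the raw bytes that drive a generator and keeps a mutated input only if it is valid and reaches a branch not seen before.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Toy coverage-and-validity feedback loop; every name here is hypothetical. */
class ToyFeedbackFuzzer {
    /** Outcome of one run: branches hit, and whether all assumptions held. */
    static final class Result {
        final Set<Integer> branches = new HashSet<>();
        boolean valid = true;
    }

    final List<byte[]> corpus = new ArrayList<>(); // saved parameter sequences
    final Set<Integer> seen = new HashSet<>();     // branches covered by valid inputs

    /** Generator: decodes raw parameter bytes into a structured input (digits 0..9). */
    static int[] generate(byte[] params) {
        int[] digits = new int[params.length];
        for (int i = 0; i < params.length; i++) {
            digits[i] = (params[i] & 0xFF) % 10;
        }
        return digits;
    }

    /** Program under test, with manual "instrumentation" and an assumption check. */
    static Result run(int[] input) {
        Result r = new Result();
        r.branches.add(0);
        for (int d : input) {
            if (d >= 8) { r.valid = false; return r; } // plays the role of a failed assumption
        }
        r.branches.add(1);
        if (input[0] == 3) {
            r.branches.add(2);
            if (input[1] == 5) {
                r.branches.add(3);
                if (input[2] == 7) r.branches.add(4); // deepest branch
            }
        }
        return r;
    }

    /** Mutate saved inputs; keep a child only if it is valid and adds coverage. */
    void fuzz(long seed, int trials) {
        Random rnd = new Random(seed);
        byte[] initial = new byte[4]; // all zeros decodes to a valid input
        corpus.add(initial);
        seen.addAll(run(generate(initial)).branches);
        for (int t = 0; t < trials; t++) {
            byte[] child = corpus.get(rnd.nextInt(corpus.size())).clone();
            child[rnd.nextInt(child.length)] = (byte) rnd.nextInt(256);
            Result r = run(generate(child));
            if (r.valid && !seen.containsAll(r.branches)) {
                seen.addAll(r.branches);
                corpus.add(child);
            }
        }
    }

    public static void main(String[] args) {
        ToyFeedbackFuzzer f = new ToyFeedbackFuzzer();
        f.fuzz(1L, 5000);
        System.out.println("branches covered: " + f.seen.size() + ", saved inputs: " + f.corpus.size());
    }
}
```

Saving partial progress is what lets the loop climb towards the nested branches one condition at a time, whereas independent random sampling would have to satisfy all three conditions in a single draw.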
Example
Here is a JUnit-Quickcheck test for checking a property of the PatriciaTrie class from Apache Commons Collections. The property tests that if a PatriciaTrie is initialized with an input JDK Map, and if the input map already contains a key, then that key should also exist in the newly constructed PatriciaTrie.
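import static org.junit.Assert.assertTrue;
import static org.junit.Assume.assumeTrue;
import java.util.Map;
import org.apache.commons.collections4.Trie;
import org.apache.commons.collections4.trie.PatriciaTrie;
import org.junit.runner.RunWith;
import edu.berkeley.cs.jqf.fuzz.Fuzz;
import edu.berkeley.cs.jqf.fuzz.JQF;

@RunWith(JQF.class)
public class PatriciaTrieTest {
    @Fuzz /* The args to this method will be generated automatically by JQF */
    public void testMap2Trie(Map<String, Integer> map, String key) {
        // Key should exist in map
        assumeTrue(map.containsKey(key)); // the test is invalid if this predicate is not true
        // Create new trie with input `map`
        Trie<String, Integer> trie = new PatriciaTrie<>(map);
        // The key should exist in the trie as well
        assertTrue(trie.containsKey(key)); // fails when map = {"x": 1, "x\0": 2} and key = "x"
    }
}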
Running mvn jqf:fuzz causes JQF to invoke the testMap2Trie() method repeatedly with automatically generated values for map and key. After about 5 seconds on average (~5,000 inputs), JQF will report an assertion violation. It finds a bug in the implementation of PatriciaTrie that is unresolved as of v4.4. Random sampling of map and key values is unlikely to find the failing test case, which is a very special corner case (see the comments next to the assertion in the code above). JQF finds this violation easily using a coverage-guided algorithm called Zest. To run this example as a standalone Maven project, check out the jqf-zest-example repository.
In the above example, the generators for Map and String were synthesized automatically by junit-quickcheck. It is also possible to specify generators for structured inputs manually. See the tutorials below.
Documentation
- The JQF Maven Plugin documentation shows how to run mvn jqf:fuzz and mvn jqf:repro.
- Writing a JQF Test demonstrates the creation of a JUnit-based parameterized test method for JQF.
- The Guidance interface docs show how JQF works internally, which is useful for researchers wishing to build custom guidance algorithms on top of JQF.
- API docs are published at every major release, which is again useful for researchers wishing to extend JQF.
Tutorials
- Zest 101: A basic tutorial for fuzzing a standalone toy program using command-line scripts. Walks through the process of writing a test driver and structured input generator for Calendar objects.
- Fuzzing a compiler with Zest: A tutorial for fuzzing a non-trivial program -- the Google Closure Compiler -- using a generator for JavaScript programs. This tutorial makes use of the JQF Maven plugin.
- Fuzzing with AFL: A tutorial for fuzzing a Java program that parses binary data, such as PNG image files, using the AFL binary fuzzing engine.
- Fuzzing with ZestCLI: A tutorial for fuzzing a Java program with ZestCLI.
Continuous Fuzzing
GitLab supports running JQF in CI/CD (tutorial), though they have recently rolled out their own custom Java fuzzer for this purpose.
Research and Tools based on JQF
- Zest * [ISSTA'19 paper] - Semantic Fuzzing
- BigFuzz * [ASE'20 paper] - Spark Fuzzing
- MoFuzz [ASE'20 paper] - Model-driven software
- RLCheck * [ICSE'20 paper] - Reinforcement learning
- Bonsai * [ICSE'21 paper] - Concise test generation
- Confetti [ICSE'22 paper] - Concolic / taint tracking with global hinting
- BeDivFuzz [ICSE'22 paper] - Behavioral diversity
- ODDFuzz [IEEE S&P'23 paper] - Deserialization vulnerabilities
- GCMiner [ICSE'23 paper] - Gadget chain mining
- Intender [USENIX Security'23 paper] - Intent-based networking
- Mu2 * [ISSTA'23 paper] - Mutation testing as guidance
- TOAST [JCST'22 paper] - Testing dynamic software updates
- Poracle [TOSEM'23 paper] - Patch testing using differential fuzzing
- SPIDER * [arxiv preprint] - Stateful performance issues in SDN
- FuzzDiff [Dissertation] - Dynamic program equivalence checking
* = Involves at least one of the original JQF authors.
Contact the developers
If you've found a bug in JQF or are having trouble getting JQF to work, please open an issue on the issue tracker. You can also use this platform to post feature requests.
If it's some sort of fuzzing emergency you can always send an email to the main developer: Rohan Padhye.
Trophies
If you find bugs with JQF and you are comfortable with sharing, we would be happy to add them to this list. Please send a PR for README.md with a link to the bug/CVE you found.
- google/closure-compiler#2842: IllegalStateException in VarCheck: Unexpected variable
- google/closure-compiler#2843: NullPointerException when using Arrow Functions in dead code
- google/closure-compiler#3173: Algorithmic complexity / performance issue on fuzzed input
- google/closure-compiler#3220: ExpressionDecomposer throws IllegalStateException: Object method calls can not be decomposed
- JDK-8190332: PngReader throws NegativeArraySizeException when width is too large
- JDK-8190511: PngReader throws OutOfMemoryError for very small malformed PNGs
- JDK-8190512: PngReader throws undocumented IllegalArgumentException: "Empty Region" instead of IOException for malformed images with negative dimensions
- JDK-8190997: PngReader throws NullPointerException when PLTE section is missing
- JDK-8191023: PngReader throws NegativeArraySizeException in parse_tEXt_chunk when keyword length exceeds chunk size
- JDK-8191076: PngReader throws NegativeArraySizeException in parse_zTXt_chunk when keyword length exceeds chunk size
- JDK-8191109: PngReader throws NegativeArraySizeException in parse_iCCP_chunk when keyword length exceeds chunk size
- JDK-8191174: PngReader throws undocumented IllegalArgumentException with message "Pixel stride times width must be <= scanline stride"
- JDK-8191073: JpegImageReader throws IndexOutOfBoundsException when reading malformed header
- JDK-8193444: SimpleDateFormat throws ArrayIndexOutOfBoundsException when format contains long sequences of unicode characters
- JDK-8193877: DateTimeFormatterBuilder throws ClassCastException when using padding
- mozilla/rhino#405: FAILED ASSERTION due to malformed destructuring syntax
- mozilla/rhino#406: ClassCastException when compiling malformed destructuring expression
- mozilla/rhino#407: java.lang.VerifyError in bytecode produced by CodeGen
- mozilla/rhino#409: ArrayIndexOutOfBoundsException when parsing '<!-'
- mozilla/rhino#410: NullPointerException in BodyCodeGen
- COLLECTIONS-714: PatriciaTrie ignores trailing null characters in keys
- COMPRESS-424: BZip2CompressorInputStream throws ArrayIndexOutOfBoundsException(s) when decompressing malformed input
- LANG-1385: StringIndexOutOfBoundsException in NumberUtils.createNumber
- CVE-2018-11771: Infinite Loop in Commons-Compress ZipArchiveInputStream (found by Tobias Ospelt)
- MNG-6375 / plexus-utils#34: NullPointerException when pom.xml has incomplete XML tag
- MNG-6374 / plexus-utils#35: ModelBuilder hangs with malformed pom.xml
- MNG-6577 / plexus-utils#57: Uncaught IllegalArgumentException when parsing unicode entity ref
- Bug 62655: Augment task: IllegalStateException when "id" attribute is missing
- BCEL-303: AssertionViolatedException in Pass 3A Verification of invoke instructions
- BCEL-307: ClassFormatException thrown in Pass 3A verification
- BCEL-308: NullPointerException in Verifier Pass 3A
- BCEL-309: NegativeArraySizeException when Code attribute length is negative
- BCEL-310: ArrayIndexOutOfBounds in Verifier Pass 3A
- BCEL-311: ClassCastException in Verifier Pass 2
- BCEL-312: AssertionViolation: INTERNAL ERROR Please adapt StringRepresentation to deal with ConstantPackage in Verifier Pass 2
- BCEL-313: ClassFormatException: Invalid signature: Ljava/lang/String)V in Verifier Pass 3A
- CVE-2018-8036: Infinite Loop leading to OOM in PDFBox's AFMParser (found by Tobias Ospelt)
- PDFBOX-4333: ClassCastException when loading PDF (found by Robin Schimpf)
- PDFBOX-4338: ArrayIndexOutOfBoundsException in COSParser (found by Robin Schimpf)
- PDFBOX-4339: NullPointerException in COSParser (found by Robin Schimpf)
- CVE-2018-8017: Infinite Loop in IptcAnpaParser
- CVE-2018-12418: Infinite Loop in junrar (found by Tobias Ospelt)
- CVE-2019-17359: Attempt to trigger a large allocation leads to OOM in Bouncycastle ASN.1 parser (found by Tobias Ospelt)