Defects4J -- version 3.0.1

Defects4J is a collection of reproducible bugs and a supporting infrastructure with the goal of advancing software engineering research.

Contents of Defects4J

The projects

Defects4J contains 854 bugs (plus 10 deprecated bugs) from the following open-source projects:

Identifier       Project name            Number of active bugs  Active bug ids                Deprecated bug ids (*)
Chart            jfreechart              26                     1-26                          None
Cli              commons-cli             39                     1-5,7-40                      6
Closure          closure-compiler        174                    1-62,64-92,94-176             63,93
Codec            commons-codec           18                     1-18                          None
Collections      commons-collections     28                     1-28                          None
Compress         commons-compress        47                     1-47                          None
Csv              commons-csv             16                     1-16                          None
Gson             gson                    18                     1-18                          None
JacksonCore      jackson-core            26                     1-26                          None
JacksonDatabind  jackson-databind        110                    1-64,66-88,90-112             65,89
JacksonXml       jackson-dataformat-xml  6                      1-6                           None
Jsoup            jsoup                   93                     1-93                          None
JxPath           commons-jxpath          22                     1-22                          None
Lang             commons-lang            61                     1,3-17,19-24,26-47,49-65      2,18,25,48
Math             commons-math            106                    1-106                         None
Mockito          mockito                 38                     1-38                          None
Time             joda-time               26                     1-20,22-27                    21

(*) Due to behavioral changes introduced in newer Java versions, some bugs are no longer reproducible. Hence, Defects4J distinguishes between active and deprecated bugs.

We do not re-enumerate active bugs because publications using Defects4J artifacts usually refer to bugs by their specific bug id.

The bugs

Each bug has the following properties:

The (b)uggy and (f)ixed program revisions are labelled with <id>b and <id>f, respectively (<id> is an integer).

Reproducibility

Java version

All bugs have been reproduced and triggering tests verified, using Java 11 (see the CI configuration for specifics). Using a different version of Java might result in unexpected failing tests and/or non-reproducible bugs.
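
Before running any bug, it can help to confirm which Java version is on the PATH. The following is a minimal sketch, not part of Defects4J: the function names java_major_version and check_java_for_defects4j are hypothetical, and the parsing assumes the usual `java -version` output that contains `version "<number>"`.

```shell
# Extract the major version from a Java version string.
# Handles the legacy "1.8.0_392" scheme and the modern "11.0.20" scheme.
java_major_version() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;            # legacy scheme: 1.8.0_392 -> 8.0_392
  esac
  printf '%s\n' "${v%%[!0-9]*}"    # keep only the leading digits
}

# Report whether the `java` on the PATH is the version the bugs were verified with.
check_java_for_defects4j() {
  local raw major
  raw=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
  major=$(java_major_version "$raw")
  if [ "$major" = "11" ]; then
    echo "Java $raw: OK"
  else
    echo "Java ${raw:-not found}: Defects4J bugs were verified with Java 11" >&2
    return 1
  fi
}
```

Calling check_java_for_defects4j at the top of an experiment script makes a version mismatch visible before any tests are run.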

Timezone

Defects4J generates and executes tests in the timezone America/Los_Angeles.

If you are using the bugs outside of the Defects4J framework, set the TZ environment variable to America/Los_Angeles and export it. Not setting this option results in unexpected failing tests!
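
As a minimal sketch of the instruction above, the timezone can be pinned for the current shell and all of its child processes as follows. The date call is only a sanity check, and its output depends on the timezone data installed on your system.

```shell
# Pin the timezone for this shell and every child process
# (for example, a JVM that runs the tests).
export TZ=America/Los_Angeles

# Sanity check: print the current time with zone abbreviation and UTC offset.
date '+%Y-%m-%d %H:%M:%S %Z (%z)'
```

To set the variable for a single command only, prefix that command with TZ=America/Los_Angeles instead of exporting it.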

Broken and Flaky Tests

Defects4J excludes broken tests (tests that reliably fail on the fixed and the buggy version) and flaky tests (tests that intermittently fail on the fixed or buggy version).

If you are using the bugs outside of the Defects4J framework, make sure to verify expected test behavior for your environment.

Metadata

We recommend using Defects4J through the provided command-line interface. All reproducibility tests within Defects4J rely on this interface. If you are using the bugs outside of the Defects4J framework, make sure to use the Defects4J export command to obtain relevant metadata such as source/test directories, classpath entries, and sets of tests.

Setting up Defects4J

Requirements

Defects4J version 2.x required Java 1.8.

Defects4J version 1.x and 0.x required Java 1.7.

Perl dependencies

All required Perl modules are listed in cpanfile. On many Unix platforms, these required Perl modules are installed by default. The setup instructions immediately below install them if necessary. If you do not have cpanm installed, use cpan or a cpan wrapper to install the Perl modules listed in cpanfile.

Steps to set up Defects4J

  1. Clone Defects4J:

    • git clone https://github.com/rjust/defects4j
  2. Initialize Defects4J (download the project repositories and external libraries, which are not included in the git repository for size purposes and to avoid redundancies):

    • cd defects4j
    • cpanm --installdeps .
    • ./init.sh
  3. Add Defects4J's executables to your PATH:

    • export PATH=$PATH:"path2defects4j"/framework/bin ("path2defects4j" points to the directory to which you cloned Defects4J; it looks like "/user/yourComputerUserName/desktop/defects4j".)
  4. Check installation:

    • defects4j info -p Lang

On some platforms such as Windows, you might need to use perl "fullpath"\defects4j where these instructions say to use defects4j.

Using Defects4J

Example commands

  1. Get information for a specific project (commons lang):

    • defects4j info -p Lang
  2. Get information for a specific bug (commons lang, bug 1):

    • defects4j info -p Lang -b 1
  3. Check out a buggy source code version (commons lang, bug 1, buggy version):

    • defects4j checkout -p Lang -v 1b -w /tmp/lang_1_buggy
  4. Change to the working directory, compile sources and tests, and run tests:

    • cd /tmp/lang_1_buggy
    • defects4j compile
    • defects4j test
  5. Some Defects4J commands take the project id as a command-line argument (possibly along with other arguments). Examples include info, checkout, and query. Note that info and query report information that is derived from the Defects4J metadata and do not require access to project files that are in the project's VCS.

    Other commands require a working directory, either set explicitly (-w command-line argument) or implicitly (executed from within a working directory). Examples include any command that requires access to files under version control, including source code and build files. This includes all commands that build or test the code (compile, test, coverage, mutation) and commands that return version-specific information (export).

  6. The scripts in framework/test/ are examples of how to use Defects4J, which you might find useful as inspiration when you are writing your own scripts that use Defects4J.
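
The example commands above can be combined into a small script. The following is a sketch under stated assumptions: the helper names d4j_build_and_test and d4j_compare_versions and the working-directory naming are hypothetical, and the helpers use the explicit -w argument instead of changing into the working directory.

```shell
# Hypothetical helper: check out one version (e.g. 1b or 1f), build it, run its tests.
d4j_build_and_test() {
  local pid="$1" vid="$2" workdir="$3"
  defects4j checkout -p "$pid" -v "$vid" -w "$workdir" || return 1
  # compile and test take the working directory via -w, so no `cd` is needed
  defects4j compile -w "$workdir" || return 1
  defects4j test -w "$workdir"
}

# Hypothetical helper: run the same workflow for the buggy and the fixed
# version of one bug, using working directories such as <base>/lang_1_buggy.
d4j_compare_versions() {
  local pid="$1" bid="$2" base="${3:-/tmp}"
  local lower
  lower=$(printf '%s' "$pid" | tr '[:upper:]' '[:lower:]')
  d4j_build_and_test "$pid" "${bid}b" "$base/${lower}_${bid}_buggy" || return 1
  d4j_build_and_test "$pid" "${bid}f" "$base/${lower}_${bid}_fixed"
}
```

For example, d4j_compare_versions Lang 1 /tmp would populate /tmp/lang_1_buggy and /tmp/lang_1_fixed.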

Command-line interface: defects4j command

Use framework/bin/defects4j to execute any of the following commands:

Command       Description
info          View configuration of a specific project or summary of a specific bug
env           Print the environment of defects4j executions
checkout      Check out a buggy or a fixed project version
compile       Compile sources and developer-written tests of a buggy or a fixed project version
test          Run a single test method or a test suite on a buggy or a fixed project version
mutation      Run mutation analysis on a buggy or a fixed project version
coverage      Run code coverage analysis on a buggy or a fixed project version
monitor.test  Monitor the class loader during the execution of a single test or a test suite
bids          Print the list of active or deprecated bug IDs for a specific project
pids          Print a list of available project IDs
export        Export version-specific properties such as classpaths, directories, or lists of tests
query         Query the metadata to generate a CSV file of requested information for a specific project
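
As an illustration of how the pids and bids commands compose, the sketch below prints the number of active bugs per project. The function name d4j_count_active_bugs is hypothetical, and the code assumes that both commands print one id per line.

```shell
# Hypothetical helper: print "<project id> <number of active bugs>" for every project.
d4j_count_active_bugs() {
  local pid count
  for pid in $(defects4j pids); do
    # Assumption: bids prints one bug id per line; count the non-empty lines.
    count=$(defects4j bids -p "$pid" | grep -c .)
    printf '%s %s\n' "$pid" "$count"
  done
}
```

The per-project counts can then be compared against the project table at the top of this document.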

Export version-specific properties

Use defects4j export -p <property_name> [-o output_file] in the working directory to export a version-specific property:

Property          Description
classes.modified  Classes modified by the bug fix
classes.relevant  Classes loaded by the JVM when executing all triggering tests
cp.compile        Classpath to compile and run the project
cp.test           Classpath to compile and run the developer-written tests
dir.src.classes   Source directory of classes (relative to working directory)
dir.bin.classes   Target directory of classes (relative to working directory)
dir.src.tests     Source directory of tests (relative to working directory)
dir.bin.tests     Target directory of test classes (relative to working directory)
tests.all         List of all developer-written test classes
tests.relevant    List of relevant test classes (a test class is relevant if, when executed, the JVM loads at least one of the modified classes)
tests.trigger     List of test methods that trigger (expose) the bug
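
When using the bugs outside of the Defects4J framework, several properties are typically needed at once. The following sketch exports a handful of them into one file per property. The function name d4j_export_all, the selection of properties, and the .txt file naming are illustrative assumptions; passing an absolute output directory avoids any ambiguity about how a relative -o path is resolved.

```shell
# Hypothetical helper: export several properties of a checked-out version,
# writing each one to <outdir>/<property>.txt.
d4j_export_all() {
  local workdir="$1" outdir="$2" prop
  mkdir -p "$outdir" || return 1
  for prop in classes.modified dir.src.classes dir.src.tests cp.test tests.trigger; do
    defects4j export -p "$prop" -w "$workdir" -o "$outdir/$prop.txt" || return 1
  done
}
```

For example, d4j_export_all /tmp/lang_1_buggy /tmp/lang_1_meta collects the metadata of the working directory checked out earlier.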

Export project-specific metadata

Use defects4j query -p <pid> -q <field_list> [-o <output_file>] [-D|-A] to generate a CSV file containing a set of requested metadata for each bug in a specific project.

By default, defects4j query returns a list of active bug IDs for a project. To request specific metadata, provide the -q flag with a comma-separated list of variables from the list below. For example, defects4j query -p Chart -q "report.id,report.url" will provide a list of all active bug IDs, along with the bug report ID and bug report URL for each.

Property               Description
bug.id                 Assigned bug IDs (included in all results)
project.id             Assigned project ID
project.name           Original project name
project.build.file     Location of the Defects4J build file for the project
project.vcs            Version control system used by the project
project.repository     Location of the project repository
project.bugs.csv       Location of the CSV containing information on that bug
revision.id.buggy      Commit hashes for the buggy version of each bug
revision.id.fixed      Commit hashes for the fixed version of each bug
revision.date.buggy    Date of the buggy commit for each bug
revision.date.fixed    Date of the fixed commit for each bug
report.id              Bug report ID from the version tracker for each bug
report.url             Bug report URL from the version tracker for each bug
classes.modified       Classes modified by the bug fix
classes.relevant.src   Source classes loaded by the JVM when executing all triggering tests
classes.relevant.test  Test classes loaded by the JVM when executing all triggering tests
tests.relevant         List of relevant test classes (a test class is relevant if, when executed, the JVM loads at least one of the modified classes)
tests.trigger          List of test methods that trigger (expose) the bug, separated by semicolons (;)
tests.trigger.cause    List of test methods that trigger (expose) the bug, along with the exception thrown. Each list element has the form "methodName --> exceptionClass[: message]", and list elements are separated by semicolons (;)
deprecated.version     (for deprecated bugs only) Version of Defects4J where a bug was deprecated
deprecated.reason      (for deprecated bugs only) Reason for deprecation

By default, defects4j query returns information on active bugs only. The -D flag returns information only on deprecated bugs, while the -A flag returns information on all bugs, both active and deprecated.
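
The flags described above can be wrapped in a small helper. This is a sketch only: the function name d4j_query_csv and its scope argument (active, deprecated, or all) are hypothetical conveniences, while the -p, -q, -o, -D, and -A options are the ones listed in the usage line above.

```shell
# Hypothetical helper: write the requested fields for a project's bugs to a CSV file.
# scope: "active" (default), "deprecated" (-D), or "all" (-A).
d4j_query_csv() {
  local pid="$1" fields="$2" out="$3" scope="${4:-active}"
  case "$scope" in
    active)     defects4j query -p "$pid" -q "$fields" -o "$out" ;;
    deprecated) defects4j query -p "$pid" -q "$fields" -o "$out" -D ;;
    all)        defects4j query -p "$pid" -q "$fields" -o "$out" -A ;;
    *) echo "unknown scope: $scope" >&2; return 2 ;;
  esac
}
```

For example, d4j_query_csv Lang "deprecated.version,deprecated.reason" /tmp/lang_deprecated.csv deprecated would list why each deprecated Lang bug was retired.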

To determine the methods that are changed between the buggy and fixed version of the code:

  1. Add this line to your user-level git attributes file: *.java diff=java

  2. Run git diff --no-index, for example git diff --no-index /tmp/lang_1_buggy /tmp/lang_1_fixed. In the output, every line starting with "@" gives the method name of a changed method.
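
The two steps above can be scripted. The sketch below is an illustration with hypothetical function names (enable_java_diff, changed_methods). It assumes that the user-level attributes file is at Git's default location, which applies only when core.attributesFile is not set. It also assumes that the text after the second "@@" of a hunk header is the enclosing declaration reported by the java diff driver.

```shell
# Register the built-in "java" diff driver for *.java files in the
# user-level git attributes file (default location; idempotent).
enable_java_diff() {
  local attrs="${XDG_CONFIG_HOME:-$HOME/.config}/git/attributes"
  mkdir -p "$(dirname "$attrs")" || return 1
  grep -qxF '*.java diff=java' "$attrs" 2>/dev/null || echo '*.java diff=java' >> "$attrs"
}

# Read a unified diff on stdin and print the distinct hunk-header contexts,
# i.e. the text that follows the second "@@" on each hunk line.
# Hunk headers without any context text are skipped.
changed_methods() {
  sed -n 's/^@@ [^@]* @@ \(..*\)$/\1/p' | sort -u
}
```

A possible invocation is git diff --no-index /tmp/lang_1_buggy /tmp/lang_1_fixed | changed_methods. Note that git diff --no-index exits with status 1 when the directories differ, which matters in scripts that abort on errors.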

Test execution framework

The test execution framework for generated test suites (framework/bin) provides the following scripts:

Script             Description
defects4j          Main script, described above
gen_tests          Generate test suites using EvoSuite or Randoop
run_bug_detection  Determine the real fault detection rate
run_mutation       Determine the mutation score
run_coverage       Determine code coverage ratios (statement and branch coverage)

Mining and contributing additional bugs to Defects4J

We welcome your contributions to Defects4J! The bug-mining README details the bug-mining process.

Additional resources

Scripts built on Defects4J

Fault localization (FL)

Automated program repair (APR)

Publications

More publications

Implementation details

Documentation for any script or module is available as HTML documentation.

The directory structure of Defects4J is as follows:

defects4j
   |
   |--- project_repos:     The version control repositories of the provided projects (populated during initialization).
   |
   |--- developer:         Resources for Defects4J contributors.
   |
   |--- framework:         Libraries and executables of the core, test execution,
       |                   and bug-mining frameworks.
       |
       |--- bin:           Command line interface to Defects4J.
       |
       |--- bug-mining:    Bug-mining framework.
       |
       |--- core:          The modules of the core framework.
       |
       |--- doc:           Scripts and templates for the html documentation.
       |
       |--- lib:           Libraries used in the core framework.
       |
       |--- projects:      Project-specific resource files.
       |
       |--- test:          Scripts to test the framework.
       |
       |--- util:          Util scripts used by Defects4J.
       

Versioning information

Defects4J uses a semantic versioning scheme (major.minor.patch):

Change                                   major  minor  patch
Addition/Deletion of bugs                X
New/upgraded internal or external tools         X
Fixes and documentation changes                        X
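
When comparing results obtained with two Defects4J releases, the scheme above indicates what kind of change separates them. The following sketch names the highest version component that differs between two major.minor.patch strings. The function name d4j_change_level is hypothetical and not part of Defects4J.

```shell
# Hypothetical helper: print "major", "minor", "patch", or "none", depending on
# the highest component that differs between two major.minor.patch versions.
d4j_change_level() {
  local old="$1" new="$2"
  local o_major="${old%%.*}" n_major="${new%%.*}"
  local o_rest="${old#*.}" n_rest="${new#*.}"
  local o_minor="${o_rest%%.*}" n_minor="${n_rest%%.*}"
  if [ "$o_major" != "$n_major" ]; then
    echo major    # bugs were added or deleted
  elif [ "$o_minor" != "$n_minor" ]; then
    echo minor    # internal or external tools were added or upgraded
  elif [ "$old" != "$new" ]; then
    echo patch    # fixes and documentation changes
  else
    echo none
  fi
}
```

For example, d4j_change_level 2.0.1 3.0.1 prints major, which signals that the set of bugs differs between the two releases.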

License

MIT License, see license.txt for more information.