Statutory Interpretation Data Set

This repository contains the data set created for the following research papers:

Savelka, Jaromir, and Kevin D. Ashley. "Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models." Findings of the Association for Computational Linguistics: EMNLP 2021. 2021.

Jaromir Savelka, Huihui Xu, and Kevin D. Ashley. 2019. Improving Sentence Retrieval from Case Law for Statutory Interpretation. In Seventeenth International Conference on Artificial Intelligence and Law (ICAIL ’19), June 17–21, 2019, Montreal, QC, Canada, Floris Bex (Ed.). ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3322640.3326736

Task

Given a statutory provision, a user's interest in the meaning of a phrase from that provision, and a list of sentences, we would like to rank more highly the sentences that elaborate upon the meaning of the statutory phrase of interest.
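To make the input and output of the task concrete, here is a minimal sketch of a ranking function. It is a naive lexical baseline written for illustration only, not the method from the papers above; the function names, the weighting, and the toy strings in the usage note are all assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_sentences(phrase, provision, sentences):
    """Return the sentences ordered from highest to lowest naive score.

    Hypothetical scoring (an assumption, not the authors' approach):
    each occurrence of a phrase token counts double, and each distinct
    token shared with the provision counts once.
    """
    phrase_tokens = tokenize(phrase)
    provision_vocab = set(tokenize(provision))

    scored = []
    for index, sentence in enumerate(sentences):
        counts = Counter(tokenize(sentence))
        phrase_hits = sum(counts[token] for token in phrase_tokens)
        context_hits = sum(1 for token in counts if token in provision_vocab)
        scored.append((2.0 * phrase_hits + context_hits, index, sentence))

    # Sort by descending score; ties keep their original order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored]
```

For example, calling `rank_sentences("blue widget", "No person shall sell a blue widget without a permit.", candidates)` on a made-up list of candidates would place sentences that repeat the phrase and share vocabulary with the provision first. Lexical overlap alone does not capture whether a sentence actually explains the phrase, which is the harder problem the data set is meant to support.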

Corpus Overview

For this corpus we selected forty-two terms from different provisions of the United States Code.

For each term we collected a set of sentences by extracting every sentence that mentions the term from the court decisions retrieved from the Caselaw Access Project data.

In total the corpus consists of 26,959 sentences.

The sentences are classified into four categories according to their usefulness for the interpretation:

See Annotation guidelines for additional details.

Data Structure

Each zip file contains the data related to one of the forty-two queries. There are four files in total, containing the texts at different levels of granularity. These make it possible to replicate the experiments reported in the paper cited above.
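Because the file names inside the archives are not listed here, a reasonable first step is to inspect each archive before writing a loader. The sketch below uses only the Python standard library; the helper name `list_query_archives` and the assumption that all zip files sit in a single local directory are illustrative, not part of the repository.

```python
import zipfile
from pathlib import Path


def list_query_archives(data_dir):
    """Map each zip file name in data_dir to the sorted names of its member files.

    Assumes (hypothetically) that the per-query archives were downloaded
    into one flat directory; nothing is assumed about their contents.
    """
    inventory = {}
    for archive_path in sorted(Path(data_dir).glob("*.zip")):
        with zipfile.ZipFile(archive_path) as archive:
            inventory[archive_path.name] = sorted(
                info.filename
                for info in archive.infolist()
                if not info.is_dir()
            )
    return inventory
```

Printing the returned dictionary shows which files each query archive holds, which helps when matching them to the granularity levels described above.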

Terms of Use

If you use the data, we kindly ask you to provide the following two attributions:

Savelka, Jaromir, and Kevin D. Ashley. "Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models." Findings of the Association for Computational Linguistics: EMNLP 2021. 2021.

The President and Fellows of Harvard University, Caselaw Access Project, Caselaw Access Project, 2018.