MED: Monotonicity Entailment Dataset (Ver.1.0)

Overview

MED is a new evaluation dataset that covers a wide range of monotonicity reasoning. In contrast to manually or automatically constructed datasets such as HELP, MED combines naturally occurring examples collected by crowdsourcing with carefully designed examples collected from linguistics publications.

Data

The file MED.tsv has 5,382 examples: 4,068 crowd examples and 1,314 paper examples. The file is formatted similarly to the MNLI release, so if your system is trained on MNLI, you may be able to feed this file into it directly. Several columns, such as label_N and X_parses, are empty but retained to preserve the MNLI format. Otherwise, you may need to reformat the data to fit your system's input format.
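If your system does not read the MNLI format directly, the following minimal sketch shows one way to pull premise, hypothesis, and label out of a tab-separated file with a header row. The column names `sentence1`, `sentence2`, and `gold_label` are assumptions based on the MNLI release and should be checked against the header of MED.tsv. The function name `read_pairs` and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io


def read_pairs(handle, premise_col="sentence1", hypothesis_col="sentence2",
               label_col="gold_label"):
    """Return (premise, hypothesis, label) tuples from a tab-separated file.

    The default column names are assumptions borrowed from the MNLI release;
    pass different names if the header of your file differs.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        (row[premise_col], row[hypothesis_col], row[label_col])
        for row in reader
    ]


# Made-up sample in the assumed layout, used only to illustrate the call.
sample = io.StringIO(
    "index\tsentence1\tsentence2\tgold_label\n"
    "0\tEvery dog runs\tEvery small dog runs\tentailment\n"
    "1\tSome dog runs\tSome small dog runs\tneutral\n"
)
pairs = read_pairs(sample)
print(len(pairs), pairs[0][2])
```

To read the real file, open it with `open("MED.tsv", encoding="utf-8", newline="")` and pass the handle to `read_pairs`. Empty columns are simply ignored, since only the three named columns are read.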

The fields in this file are:

Citation

If you use this dataset in any published research, please cite the following:

@inproceedings{yanaka-etal-2019-neural,
    title = "Can Neural Networks Understand Monotonicity Reasoning?",
    author = "Yanaka, Hitomi  and
      Mineshima, Koji  and
      Bekki, Daisuke  and
      Inui, Kentaro  and
      Sekine, Satoshi  and
      Abzianidze, Lasha  and
      Bos, Johan",
    booktitle = "Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP",
    year = "2019",
    pages = "31--40",
    }

Part of the MED dataset comes from the FraCaS test suite (Generalized Quantifier section) and the GLUE diagnostic set.

Contact

For questions and usage issues, please contact hitomi.yanaka@riken.jp.

License

CC BY-SA 4.0

Acknowledgement

This work was conducted in collaboration with RIKEN AIP, Ochanomizu University, and the University of Groningen.

Disclaimer

Part of the MED dataset was collected from the published works cited in our paper, and copyright (where applicable) remains with the original authors or publishers.