Attention Is (not) All You Need for Commonsense Reasoning

News

Description:

Figure: Schematic illustration of MAS.

The recently introduced BERT (Deep Bidirectional Transformers for Language Understanding) [1] model exhibits strong performance on several language understanding benchmarks. In this work, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem (PDP) and Winograd Schema Challenge (WSC). Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well in all cases while outperforming the previously reported state of the art by a margin. While the results suggest that BERT implicitly learns to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora. The sample code provided in this repository lets you replicate the results reported in the paper for PDP and WSC.
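To make "attention-guided" more concrete, the following is a minimal sketch of how attention values could be compared across candidate referents. It is not the code in MAS.py. It assumes that, for each candidate, the attention from the pronoun to that candidate has already been extracted into a layers x heads grid; the scoring rule (keep only the per-head maximum across candidates, then normalize) is stated here as an assumption for illustration, and the function name, candidate names, and numbers are all hypothetical.

```python
from typing import Dict, List

# attention[layer][head]: attention from the pronoun to one candidate (assumed layout)
Matrix = List[List[float]]


def max_attention_scores(attn: Dict[str, Matrix]) -> Dict[str, float]:
    """Score candidates by the attention mass they 'win' across all heads.

    For every (layer, head) position only the candidate with the largest
    attention value keeps its value; all others contribute nothing there.
    The kept values are summed per candidate and normalized to sum to 1.
    """
    if not attn:
        return {}
    names = list(attn)
    n_layers = len(attn[names[0]])
    n_heads = len(attn[names[0]][0])

    kept = {name: 0.0 for name in names}
    for layer in range(n_layers):
        for head in range(n_heads):
            # On ties, max() keeps the first candidate in insertion order.
            winner = max(names, key=lambda n: attn[n][layer][head])
            kept[winner] += attn[winner][layer][head]

    total = sum(kept.values())
    if total == 0.0:
        return {name: 0.0 for name in names}
    return {name: value / total for name, value in kept.items()}


# Toy example with 2 layers and 2 heads (made-up numbers, not BERT output).
toy_attention = {
    "trophy": [[0.5, 0.1], [0.3, 0.2]],
    "suitcase": [[0.2, 0.4], [0.1, 0.1]],
}
toy_scores = max_attention_scores(toy_attention)
print(max(toy_scores, key=toy_scores.get))
```

The candidate with the highest score would be taken as the predicted referent. Real inputs would come from BERT's attention tensors rather than hand-written lists.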

Authors:

Requirements

Download and Installation

  1. Install BertViz by cloning the repository and getting dependencies:
git clone https://github.com/jessevig/bertviz.git
cd bertviz
pip install -r requirements.txt
cd ..
  2. To replicate the results, proceed to step 3. If you only want to run the stand-alone version, you can use MAS.py directly. Its usage is showcased in the Jupyter notebook MAS_Example.ipynb.

  3. Add the BertViz path to the Python path:

  export PYTHONPATH=$PYTHONPATH:/home/ubuntu/bertviz/

Alternatively, you can add the following statement to commonsense.py after the import of sys, e.g.

sys.path.append("/home/ubuntu/bertviz/")
  4. Clone this repository and install dependencies:
git clone https://github.com/SAP/acl2019-commonsense-reasoning
cd acl2019-commonsense-reasoning
pip install -r requirements.txt
  5. Create a 'data' sub-directory and download the files for the PDP and WSC challenges into it:
mkdir data
cd data
wget https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/PDPChallenge2016.xml
wget https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WSCollection.xml
cd ..
  6. Run the scripts from the paper

To replicate the results on WSC:

python commonsense.py --data_dir=~/acl2019-commonsense-reasoning/data/ --bert_mode=bert-base-uncased --do_lower_case --task_name=MNLI

To replicate the results on PDP:

python commonsense.py --data_dir=~/acl2019-commonsense-reasoning/data/ --bert_mode=bert-base-uncased --do_lower_case --task_name=pdp

For more information on the individual functions, please refer to their docstrings.
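Before running the scripts, it can help to confirm that the installation steps above left everything in place. The following is a small, optional sketch and not part of this repository: the helper name check_setup is hypothetical, the paths are the example paths used above, and the check only verifies that the BertViz directory exists and that the two downloaded files are present and parse as XML. It makes no assumption about the contents of those files.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List

# File names as downloaded in the installation steps.
DATA_FILES = ("PDPChallenge2016.xml", "WSCollection.xml")


def check_setup(bertviz_dir: str, data_dir: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []

    bertviz_path = Path(bertviz_dir).expanduser()
    if not bertviz_path.is_dir():
        problems.append(f"BertViz directory not found: {bertviz_path}")

    data_path = Path(data_dir).expanduser()
    for name in DATA_FILES:
        file_path = data_path / name
        if not file_path.is_file():
            problems.append(f"missing data file: {file_path}")
            continue
        try:
            # Only checks well-formedness, not the schema of the file.
            ET.parse(str(file_path))
        except ET.ParseError as err:
            problems.append(f"{file_path} is not well-formed XML: {err}")

    return problems


if __name__ == "__main__":
    # Example paths from this README; adjust them to your machine.
    issues = check_setup("/home/ubuntu/bertviz/",
                         "~/acl2019-commonsense-reasoning/data/")
    for issue in issues:
        print(issue)
    if not issues:
        print("Setup looks complete.")
```

The helper expands a leading ~ itself, so the data directory can be given in the same form as in the commands above.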

Related Work

See our latest work on commonsense reasoning using contrastive self-supervised learning, accepted at ACL'20 (available on arXiv and GitHub).

Known Issues

No issues known

How to obtain support

This project is provided "as-is"; there is no guarantee that reported bugs will be fixed.

Citations

If you use this code in your research, please cite:

@inproceedings{klein-nabi-2019-attention,
    title = "Attention Is (not) All You Need for Commonsense Reasoning",
    author = "Klein, Tassilo  and
      Nabi, Moin",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P19-1477",
    doi = "10.18653/v1/P19-1477",
    pages = "4831--4836",
    abstract = "The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well on all cases while outperforming the previously reported state of the art by a margin. While results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora.",
}

References

License

Copyright (c) 2024 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE.