Negated and Misprimed Probes for Pretrained Language Models
LAMA (LAnguage Model Analysis) is a probe for analyzing the factual knowledge captured by pretrained language models; see the following paper:
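LAMA's core idea is cloze-style probing: a fact is rewritten as a fill-in-the-blank statement and the model's ranked predictions for the blank are compared against the correct object (e.g. precision@1). A minimal sketch of that ranking step, with a hard-coded toy scoring function standing in for a real masked language model (the candidate words and scores below are illustrative assumptions, not values from the paper):

```python
# Toy sketch of LAMA-style cloze probing.
# A real probe scores each candidate with a pretrained masked LM;
# here a hard-coded dictionary stands in for the model.

def score_candidates(cloze, candidates):
    """Return candidates sorted by (toy) model score, best first."""
    # Illustrative scores only -- a real LM would assign a
    # log-probability to each candidate filling the [MASK] slot.
    toy_scores = {"Paris": 0.7, "London": 0.2, "Rome": 0.1}
    return sorted(candidates, key=lambda c: toy_scores.get(c, 0.0), reverse=True)

cloze = "The capital of France is [MASK]."
ranking = score_candidates(cloze, ["London", "Paris", "Rome"])
# precision@1 checks whether the top-ranked prediction is the correct object
print(ranking[0])
```

In the actual probe, the same evaluation is run over thousands of relation templates; the point here is only the rank-then-check pattern.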
@inproceedings{petroni2019language,
    title = "Language Models as Knowledge Bases?",
    author = "F. Petroni and T. Rockt{\"{a}}schel and A. H. Miller and P. Lewis and A. Bakhtin and Y. Wu and S. Riedel",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    year = "2019"
}
@inproceedings{kassner-schutze-2020-negated,
title = "Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly",
author = {Kassner, Nora and
Sch{\"u}tze, Hinrich},
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.698"
}
This repository extends the original LAMA probe with negated and misprimed probes. The negated probes are integrated into Facebook's original repository; the misprimed data can be downloaded here.
1. Download Scripts
git clone https://github.com/facebookresearch/LAMA.git
cd LAMA
pip install -r requirements.txt
2. Download the models
~55 GB on disk
Install the spaCy model:
python3 -m spacy download en
Download the models:
chmod +x download_models.sh
./download_models.sh
The script will create and populate a pre-trained_language_models folder. If you are interested in only a particular model, edit the script accordingly.
3. Negated LAMA
Download the data from Facebook:
wget https://dl.fbaipublicfiles.com/LAMA/negated_data.tar.gz
tar -xzvf negated_data.tar.gz
rm negated_data.tar.gz
Set the flag use_negated_probes in scripts/run_experiments.py.
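The negated probes pair each cloze template with a negated counterpart, so that a model's predictions for the original and the negated statement can be compared. A minimal sketch of that pairing, using a naive copula rewrite (the template string and the "is" → "is not" rule are illustrative assumptions; the released negated_data was built with hand-crafted rewrites, not this function):

```python
# Illustrative sketch: derive a negated probe from a cloze template.
# The real negated_data uses manually curated rewrites, not this rule.

def negate(template):
    """Naively negate a cloze template by rewriting its copula."""
    return template.replace(" is ", " is not ", 1)

original = "[X] is located in [Y]."
negated = negate(original)
print(negated)  # -> "[X] is not located in [Y]."
```

Comparing the model's [MASK] predictions for such a pair reveals whether the model is actually sensitive to the negation.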
4. Misprimed Data
Download the data from this repository:
git clone https://github.com/norakassner/LAMA_primed_negated.git
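Mispriming prepends a misleading word to a cloze statement, as in the paper's title example ("Talk? Birds can [MASK]"), to test whether the model is distracted by the prime. A minimal sketch of constructing such a probe (the exact punctuation of the released misprimed data may differ; this formatting is an assumption):

```python
# Illustrative sketch: build a misprimed probe from a cloze statement.
# Punctuation/formatting is an assumption, not the dataset's exact format.

def misprime(cloze, prime):
    """Prepend a misleading prime word to a cloze statement."""
    return f"{prime}? {cloze}"

probe = misprime("Birds can [MASK].", "Talk")
print(probe)  # -> "Talk? Birds can [MASK]."
```

A robust model should ignore the prime and still predict the correct filler for [MASK]; the paper shows pretrained models are often misled.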