

Evaluating German Transformer Language Models with Syntactic Agreement Tests

Code and data for the paper by Karolina Zaczynska, Nils Feldhus, Robert Schwarzenberg, Aleksandra Gabryszak and Sebastian Möller: https://arxiv.org/abs/2007.03765
The paper originally appeared in the proceedings of the Swiss Text Analytics Conference (SwissText) & Conference on Natural Language Processing (KONVENS) 2020: http://ceur-ws.org/Vol-2624/paper7.pdf
We recommend referring to the more recent arXiv version, which includes minor adjustments.


See the README in the data folder for more information on the data.



Run tests with LMs

Execute python run_probing_experiment.py with the following flags:

Run evaluation on test outputs to produce accuracy scores

Execute python evaluation.py with the following flags:
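
For orientation, an accuracy score on agreement tests can be read as the share of test items on which the model prefers the grammatical form over the ungrammatical one. The following is a minimal sketch of that idea only, not the code in evaluation.py: the function name agreement_accuracy, the input format (pairs of scores) and the example numbers are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def agreement_accuracy(scores: Iterable[Tuple[float, float]]) -> float:
    """Return the fraction of items where the grammatical form scores higher.

    Each item is a hypothetical (grammatical_score, ungrammatical_score) pair,
    e.g. the probabilities a model assigns to the two competing verb forms.
    """
    items = list(scores)
    if not items:
        raise ValueError("no test items given")
    # An item counts as correct only if the grammatical form strictly wins.
    correct = sum(1 for good, bad in items if good > bad)
    return correct / len(items)


# Made-up scores for three test items (illustration only, not real results).
example = [(0.91, 0.02), (0.40, 0.55), (0.63, 0.10)]
print(f"accuracy: {agreement_accuracy(example):.3f}")
```

Ties are counted as errors in this sketch; whether the actual evaluation treats them that way is not stated here.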