


This repository contains the code for the paper "Exploring Strategies for Generalizable Commonsense Reasoning with Pre-trained Models" (EMNLP 2021). See the full paper here.

Note that our finetuning and inference code is adapted from ProtoQA, the prefix-tuning code is adapted from PrefixTuning, the AutoPrompt code is adapted from autoprompt, and the ProtoQA evaluation code is adapted from protoqa-evaluator.


This code has been tested on Python 3.7.9, PyTorch 1.7.0 and Transformers 4.2.1. You can install the required packages with

pip install -r requirements.txt

Then install protoqa-evaluator in editable mode. The command below takes the path to the protoqa-evaluator folder, so run it from the directory that contains that folder:

pip install -e protoqa-evaluator
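
After installation, it can help to confirm that the environment matches the tested versions listed above. The following is a minimal sketch, not part of this repository: the helper names installed_versions and find_mismatches are hypothetical, and it assumes that torch and transformers expose a __version__ attribute.

```python
import importlib
import platform

# Versions this README reports as tested.
TESTED = {"python": "3.7.9", "torch": "1.7.0", "transformers": "4.2.1"}


def installed_versions():
    """Collect the versions in the current environment (None if a package is missing)."""
    found = {"python": platform.python_version()}
    for name in ("torch", "transformers"):
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def find_mismatches(found, tested=TESTED):
    """Return {name: (found_version, tested_version)} for every differing entry."""
    mismatches = {}
    for name, want in tested.items():
        have = found.get(name)
        # torch may report a local build suffix such as "1.7.0+cu110"; ignore it.
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (have, want)
    return mismatches


if __name__ == "__main__":
    for name, (have, want) in find_mismatches(installed_versions()).items():
        print("{}: found {}, tested with {}".format(name, have, want))
```

A mismatch does not necessarily mean the code will fail. It only flags a difference from the configuration the code was tested on.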


To finetune the BART model on the ProtoQA task, cd to src/Finetune and run

bash finetune.sh 

This also runs inference on the ProtoQA dev set. You can find the results in results.json under your specified output directory.
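
The layout of results.json is not described here, so the sketch below makes no assumption about its keys. It only loads the file and reports the type of each top-level entry, which is a quick way to see what the run produced. The function name summarize_results is hypothetical and not part of this repository.

```python
import json
import os


def summarize_results(output_dir):
    """Load results.json from output_dir and describe its top-level structure."""
    path = os.path.join(output_dir, "results.json")
    with open(path, "r", encoding="utf-8") as f:
        results = json.load(f)
    if isinstance(results, dict):
        # Map each top-level key to the type name of its value.
        return {key: type(value).__name__ for key, value in results.items()}
    # For a list or scalar at the root, report only the root type.
    return {"<root>": type(results).__name__}
```

For example, calling print(summarize_results("path/to/output_dir")) with your own output directory in place of the placeholder path prints one entry per top-level key.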

For the GPT2 model, update the --model_type and --model_name_or_path arguments accordingly.


For the CommonGen experiments, the command is also in finetune.sh; uncomment it and run the script. After training, the results are written to eval_generation.txt in the output directory.

To evaluate the generated output, follow the official CommonGen repo to set up a new environment and run the evaluation.


To run prefix-tuning, cd to src/Prefix_tuning and run

bash prefixtune.sh 


To run AutoPrompt, cd to src/Autoprompt and run

bash autoprompt.sh 

Manual Annotation

The 30 selected questions and the 30 newly annotated questions, along with their model predictions, can be found in Data/Manual_annotation.


