PAIE (Prompting Argument Interaction for Event Argument Extraction)
This is the implementation of the paper Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction. ACL'2022.
Overview
In this work we present PAIE: a simple, effective approach for sentence- and document-level event argument extraction that requires few annotated resources. We summarize our contributions as follows.
- We formulate and investigate prompt tuning under extractive settings.
- We extract multiple roles at once using a joint prompt. This not only captures the interaction among different roles but also reduces time complexity significantly.
Preparation
Environment
To run our code, please install all the dependency packages by using the following command:
```bash
pip install -r requirements.txt
```
Data
We conduct experiments on three common datasets: ACE05, RAMS and WIKIEVENTS.
- ACE05: This dataset is not freely available. Obtain access from the LDC and preprocess it following EEQA (EMNLP 2020).
- RAMS / WIKIEVENTS: We provide a script for data processing. Run the following command in the root directory of the repo:

```bash
bash ./data/download_dataset.sh
```
Please make sure your data folder structure matches the layout below.
```
data
├── ace_eeqa
│   ├── train_convert.json
│   ├── dev_convert.json
│   └── test_convert.json
├── RAMS_1.0
│   └── data
│       ├── train.jsonlines
│       ├── dev.jsonlines
│       └── test.jsonlines
├── WikiEvent
│   └── data
│       ├── train.jsonl
│       ├── dev.jsonl
│       └── test.jsonl
├── prompts
│   ├── prompts_ace_full.csv
│   ├── prompts_wikievent_full.csv
│   └── prompts_rams_full.csv
└── dset_meta
    ├── description_ace.csv
    ├── description_rams.csv
    └── description_wikievent.csv
```
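Before training, you can sanity-check the layout with a short script such as the sketch below. It simply verifies that every file from the tree above exists under the data root; it is a convenience helper, not part of the repo.

```python
from pathlib import Path

# Expected files, relative to the data root (matching the tree above)
EXPECTED = [
    "ace_eeqa/train_convert.json",
    "ace_eeqa/dev_convert.json",
    "ace_eeqa/test_convert.json",
    "RAMS_1.0/data/train.jsonlines",
    "RAMS_1.0/data/dev.jsonlines",
    "RAMS_1.0/data/test.jsonlines",
    "WikiEvent/data/train.jsonl",
    "WikiEvent/data/dev.jsonl",
    "WikiEvent/data/test.jsonl",
    "prompts/prompts_ace_full.csv",
    "prompts/prompts_wikievent_full.csv",
    "prompts/prompts_rams_full.csv",
    "dset_meta/description_ace.csv",
    "dset_meta/description_rams.csv",
    "dset_meta/description_wikievent.csv",
]

def missing_files(root="data"):
    """Return the expected data files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  -", rel)
    else:
        print("Data folder structure looks good.")
```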
Run the model
Quick start
You can run PAIE with the following command:

```bash
bash ./scripts/train_{ace|rams|wikievent}.sh
```
Folders will be created automatically to store:
- Subfolder `checkpoint`: model parameters with the best dev-set result
- File `log.txt`: records hyperparameters, the training process, and evaluation results
- Files `best_dev_results.log` / `best_test_related_results.log`: prediction results of the checkpoint on every sample in the dev/test set
You can find the hyperparameter settings in `./scripts/train_[dataset].sh` and `config_parser.py`; most hyperparameters are briefly explained in `config_parser.py`.
The three scripts above train models with BART-base. If you want to train models with BART-large, change `--model_name_or_path` from `facebook/bart-base` to `facebook/bart-large`, or run the following command:

```bash
bash ./scripts/train_{ace|rams|wikievent}_large.sh
```
Experiments with multiple runs
Table 3 of our paper shows how results fluctuate with the random seed and other hyperparameters (mainly the learning rate). You can run experiments multiple times to obtain more stable and reliable results:
```bash
for seed in 13 21 42 88 100
do
    for lr in 1e-5 2e-5 3e-5 5e-5
    do
        bash ./scripts/train_{ace|rams|wikievent}.sh $seed $lr
    done
done
```
Each run takes about 4 hours, so we highly recommend executing these runs in parallel.
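If you have enough GPUs, one way to fan the 20 runs out in parallel is sketched below using Python's standard library. This is illustrative only: the `echo` command is a placeholder for the real `bash ./scripts/train_….sh` invocation, and GPU assignment (e.g. setting `CUDA_VISIBLE_DEVICES` per run) is left out.

```python
import itertools
import subprocess
from concurrent.futures import ThreadPoolExecutor

SEEDS = [13, 21, 42, 88, 100]
LRS = ["1e-5", "2e-5", "3e-5", "5e-5"]

def launch(args):
    """Run one (seed, lr) configuration as a subprocess and return its output."""
    seed, lr = args
    # Placeholder command; replace with e.g.
    # ["bash", "./scripts/train_rams.sh", str(seed), lr]
    cmd = ["echo", f"seed={seed} lr={lr}"]
    return subprocess.run(cmd, capture_output=True, text=True).stdout.strip()

def run_all(max_workers=4):
    """Launch all (seed, lr) combinations, at most `max_workers` at a time."""
    combos = list(itertools.product(SEEDS, LRS))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(launch, combos))
```

`max_workers` caps how many runs execute concurrently, which you would normally set to the number of available GPUs.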
Without-bipartite-loss
You can run PAIE without the bipartite matching loss by deleting the command-line argument `--bipartite`, or by running the following command:

```bash
bash ./scripts/train_{ace|rams|wikievent}_nobipartite.sh
```
Joint-prompt-or-not
Unlike the multi-role prompt strategy in PAIE, you can also prompt arguments using a template containing only one role (single prompt). Try it by changing `--model_type` from `paie` to `base` and setting appropriate values for `--max_span_num`, `--max_dec_seq_length`, and `--th_delta`. Alternatively, you can directly run the following command with the hyperparameters we tuned:

```bash
bash ./scripts/train_{ace|rams|wikievent}_singleprompt.sh
```
Manual-prompt-or-others
Besides the manual prompt, we provide two other joint-prompt choices, as described in Section 3.2 of our paper:
- (Default setting) Manual Prompt: all roles are connected manually with natural language.
- Concatenation Prompt: all role names belonging to one event type are simply concatenated.
- Soft Prompt: following previous work on continuous prompts, different roles are connected with learnable, role-specific pseudo tokens.
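To make the three styles concrete, here is an illustrative rendering for a hypothetical event type with roles `agent`, `victim`, and `place`. The templates and pseudo-token notation below are invented for illustration; the actual templates live under `./data/prompts/`.

```python
def manual_prompt():
    # Hand-written natural-language template connecting all roles (illustrative)
    return "<agent> attacked <victim> at <place>."

def concat_prompt(roles):
    # Concatenation Prompt: simply join all role names of the event type
    return " ".join(f"<{r}>" for r in roles)

def soft_prompt(roles, n_pseudo=2):
    # Soft Prompt: connect roles with learnable, role-specific pseudo tokens
    # (shown here as placeholder strings [P_role_i])
    parts = []
    for r in roles:
        pseudo = " ".join(f"[P_{r}_{i}]" for i in range(n_pseudo))
        parts.append(f"{pseudo} <{r}>")
    return " ".join(parts)

roles = ["agent", "victim", "place"]
```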
Run the following command if you want to try the Concatenation Prompt:

```bash
bash ./scripts/train_{ace|rams|wikievent}_concatprompt.sh
```
Run the following command if you want to try the Soft Prompt:

```bash
bash ./scripts/train_{ace|rams|wikievent}_softprompt.sh
```
Few-shot-setting
PAIE also performs well in low-annotation scenarios. You can try it by setting the hyperparameter `--keep_ratio` to a number between 0 and 1, which controls the resampling rate from the original training examples. Alternatively, you can simply run the script below:

```bash
KEEP_RATIO=0.2 bash ./scripts/train_{ace|rams|wikievent}_fewshot.sh
```
Note that you can adjust the `KEEP_RATIO` value yourself.
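For intuition, resampling at a keep ratio behaves roughly like the sketch below. This is illustrative only and is not the repo's actual sampling code behind `--keep_ratio`.

```python
import random

def subsample(examples, keep_ratio, seed=42):
    """Keep roughly `keep_ratio` of the training examples, sampled without
    replacement; a fixed seed makes the subset reproducible across runs."""
    assert 0.0 < keep_ratio <= 1.0
    rng = random.Random(seed)
    n_keep = max(1, round(len(examples) * keep_ratio))
    return rng.sample(examples, n_keep)
```

For example, `subsample(list(range(1000)), 0.2)` keeps 200 of the 1000 training examples.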
Citation
Please cite our paper if you use PAIE in your work:
```bibtex
@inproceedings{ma-etal-2022-prompt,
    title = "{P}rompt for Extraction? {PAIE}: {P}rompting Argument Interaction for Event Argument Extraction",
    author = "Ma, Yubo and
      Wang, Zehao and
      Cao, Yixin and
      Li, Mukai and
      Chen, Meiqi and
      Wang, Kun and
      Shao, Jing",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.466",
    doi = "10.18653/v1/2022.acl-long.466",
    pages = "6759--6774",
}
```