# Multi-turn Response Selection
Paper: *Multi-turn Response Selection using Dialogue Dependency Relations*, accepted at EMNLP 2020 (link).
## Response Selection
### Download the datasets
Download the discourse-parsed DSTC7 datasets from Google Drive and unzip them into `~/code/data`. The other two datasets used in this paper can be downloaded here (a sketch of the unpacking step follows the list):
- UbuntuV2 or from Google Drive
- DSTC8* or from Google Drive
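For example, assuming hypothetical archive names (the actual Google Drive file names may differ):

```bash
# Archive names below are placeholders -- substitute the actual downloads.
mkdir -p ~/code/data
unzip dstc7_parsed.zip -d ~/code/data
unzip ubuntu_v2.zip -d ~/code/data
unzip dstc8.zip -d ~/code/data
```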
## Getting Started
The code for this paper is based on ParlAI. The original code this repository builds on can be downloaded here.
To build the environment with Anaconda:
```bash
conda create -n Thread-Enc python=3.7 pytorch=1.1
conda activate Thread-Enc
pip install -r requirements.txt
```
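A quick sanity check that the pinned versions resolved correctly (the code targets PyTorch 1.1):

```bash
# Should print 1.1.x if the environment was created as above.
python -c "import torch; print(torch.__version__)"
```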
Run the following commands to set up the code:
```bash
cd ~/code
python setup.py develop
```
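If the development install succeeded, the package should be importable (assuming this repository keeps ParlAI's `parlai` package name):

```bash
# Prints the installed package location as a quick smoke test.
python -c "import parlai; print(parlai.__file__)"
```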
Run this command to fine-tune the Thread-bi encoder (e.g., on DSTC7):
```bash
python3 -u examples/train_model.py \
    --init-model zoo:pretrained_transformers/poly_model_huge_reddit/model \
    --shuffle True --eval-batchsize 4 --batchsize 32 \
    --model transformer/parencoder \
    --warmup_updates 100 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 \
    --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 \
    -vp 3 -veps 0.5 --validation-metric accuracy --validation-metric-mode max \
    --save-after-valid True --log_every_n_secs 20 --candidates batch \
    --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 \
    --variant xlm --reduction_type mean --share-encoders False \
    --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 \
    --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 \
    --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 \
    --learn-embeddings True --share-word-embeddings False --dict-endtoken __start__ \
    -pyt par_dstc7 --fp16 False \
    --par_type basic --par_num 4 --reduction-type mean --parencoder-type codes \
    --model-file ./thread_bi_dstc7
```
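Most of the flags above are standard ParlAI transformer options inherited from the pretrained poly-encoder recipe. The flags that appear specific to this repository are `--model transformer/parencoder`, `--par_type`, `--par_num`, and `--parencoder-type`, which presumably control how the dialogue history is split into dependency-based threads and how the per-thread encodings are aggregated; see the paper for details.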
Run this command to fine-tune the Thread-poly encoder (e.g., on DSTC7):
```bash
python3 -u examples/train_model.py \
    --init-model zoo:pretrained_transformers/poly_model_huge_reddit/model \
    -pyt par_dstc7 --eval-batchsize 4 --batchsize 32 \
    --model transformer/parpolyencoder \
    --warmup_updates 100 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 \
    --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 \
    -vp 3 -veps 0.5 --validation-metric accuracy --validation-metric-mode max \
    --save-after-valid True --log_every_n_secs 20 --candidates batch \
    --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 \
    --variant xlm --reduction_type mean --share-encoders False \
    --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 \
    --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 \
    --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 \
    --learn-embeddings True --share-word-embeddings False --dict-endtoken __start__ \
    --fp16 False \
    --polyencoder-type codes --codes-attention-type basic --poly-n-codes 64 \
    --poly-attention-type basic --polyencoder-attention-keys context \
    --par_type basic --par_num 4 --reduction-type mean --parencoder-type codes \
    --model-file ./thread_poly_dstc7
```
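To score a fine-tuned model on the test split, ParlAI's standard evaluation entry point should work; a minimal, untested sketch that assumes the flags mirror the training setup:

```bash
# Evaluate the fine-tuned Thread-poly model on the DSTC7 test data.
python3 -u examples/eval_model.py \
    --model-file ./thread_poly_dstc7 \
    -pyt par_dstc7 \
    --datatype test \
    --batchsize 4
```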
The comparison of baselines and our models on DSTC7 is shown below:
| Model | hits@1 | hits@10 | hits@50 | MRR |
| --- | --- | --- | --- | --- |
| DAM (Zhou et al., 2018) | 34.7 | 66.3 | - | 35.6 |
| ESIM-18 (Dong and Huang, 2018) | 50.1 | 78.3 | 95.4 | 59.3 |
| ESIM-19 (Chen and Wang, 2019) | 64.5 | 90.2 | 99.4 | 73.5 |
| Bi-Enc (Humeau et al., 2019) | 70.9 | 90.6 | - | 78.1 |
| Poly-Enc (Humeau et al., 2019) | 70.9 | 91.5 | - | 78.0 |
| Cross-Enc (Humeau et al., 2019) | 71.7 | 92.4 | - | 79.0 |
| Thread-bi (ours) | 73.3 | 92.5 | 99.3 | 80.2 |
| Thread-poly (ours) | 73.2 | 93.6 | 99.1 | 80.4 |
## Dependency Parsing
The new dataset we used to train the dependency parsing model is transformed from the dataset proposed in "A large-scale corpus for conversation disentanglement".
The new dataset can be downloaded here. It includes:
- `new_ubuntu_train.json`: Transformed from the original training set; also used as the training set in our paper.
- `new_ubuntu_dev.json`: Transformed from the original development set.
- `new_ubuntu_test.json`: Transformed from the original test set.
- `new_ubuntu_final_test.json`: The merge of `new_ubuntu_dev.json` and `new_ubuntu_test.json`; used as the test set in our paper.
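The exact JSON schema of these files is not documented here; a schema-agnostic way to peek at one (assuming it has been downloaded to the current directory):

```bash
# Prints the top-level container type and its length; adjust the file name as needed.
python -c "import json; d = json.load(open('new_ubuntu_train.json')); print(type(d).__name__, len(d))"
```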
Note that we only use this model to predict whether a dependency relation exists between two utterances; the relation types are ignored. The "type" of each relation in our generated dataset is therefore meaningless.