EVI

This repo contains the code and data of our publication:

EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification

Dataset

This repo contains a challenging multilingual spoken dataset with 5,506 dialogues in English, Polish, and French that can be used for benchmarking and developing knowledge-based enrolment, verification, and identification for spoken dialogue systems. The data include the ASR n-best list transcriptions and can be used to replicate the results in the paper.

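| counts (unique) | | en-GB | pl-PL | fr-FR |
|---|---|---|---|---|
| Knowledge Base | #profiles | 10,000 | 10,000 | 10,000 |
| | #postcodes | 2,000 | 2,000 | 2,000 |
| | #names (first) | 364 | 153 | 216 |
| | #names (last) | 500 | 3,455 | 400 |
| | #names (full) | 9,412 | 9,923 | 9,433 |
| | #DoBs | 8,884 | 8,862 | 8,862 |
| Dialogues | #dialogues | 1,407 | 1,991 | 2,108 |
| | #turns | 12,663 | 17,919 | 18,972 |
| | #speakers | 1,081 | 803 | 521 |
| | #profiles | 886 | 961 | 1,464 |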
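As a quick consistency check on the table, the per-locale dialogue counts add up to the 5,506 dialogues quoted above, and each locale works out to nine turns per dialogue on average. The snippet below only redoes that arithmetic with figures copied from the table; it does not read the dataset files, and the variable names are illustrative.

```python
# Figures copied from the "Dialogues" rows of the table above.
dialogues = {"en-GB": 1407, "pl-PL": 1991, "fr-FR": 2108}
turns = {"en-GB": 12663, "pl-PL": 17919, "fr-FR": 18972}

# Total number of dialogues across the three locales.
total_dialogues = sum(dialogues.values())

# Average number of turns per dialogue, per locale.
avg_turns = {locale: turns[locale] / dialogues[locale] for locale in dialogues}

print(f"total dialogues: {total_dialogues}")
for locale, avg in avg_turns.items():
    print(f"{locale}: {avg:.1f} turns per dialogue")
```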

Raw audio files are available to download here, in case you want to experiment with different ASR systems.

Benchmarks

This repo includes all scripts to replicate the results of experiments in the paper.

Setup

The following scripts assume Python 3.9.

Install all requirements:

pip install -r requirements.txt

Enrolment Experiments

python eval_e.py --locale en_GB --nlu cautious
python eval_e.py --locale en_GB --nlu seeking

Analysis for multi- vs single-turn:

python eval_e.py --locale en_GB --nlu cautious --model 0  # multi
python eval_e.py --locale en_GB --nlu cautious --model 1  # single
python eval_e.py --locale en_GB --nlu cautious --model 2  # single
python eval_e.py --locale en_GB --nlu cautious --model 3  # single
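The enrolment commands above use en_GB only. To repeat them for the other languages in the dataset, a small driver along the following lines may help. This is a sketch under assumptions: the locale codes pl_PL and fr_FR are guessed by analogy with en_GB and the pl-PL and fr-FR columns of the dataset table, and enrolment_commands and run_all are hypothetical helpers that are not part of this repo. With dry_run=True nothing is executed; the commands are only printed.

```python
import shlex
import subprocess
import sys

# Assumption: pl_PL and fr_FR follow the same pattern as the documented en_GB.
LOCALES = ["en_GB", "pl_PL", "fr_FR"]
NLUS = ["cautious", "seeking"]


def enrolment_commands(locales=LOCALES, nlus=NLUS):
    """Build one eval_e.py invocation per (locale, nlu) pair."""
    return [
        [sys.executable, "eval_e.py", "--locale", locale, "--nlu", nlu]
        for locale in locales
        for nlu in nlus
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: prints six commands and runs none of them.
run_all(enrolment_commands())
```

Passing dry_run=False from the repo root would run each evaluation in turn and stop at the first failure, since check=True raises on a non-zero exit code.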

Verification Experiments

python eval_v.py --locale en_GB --nlu cautious --model random
python eval_v.py --locale en_GB --nlu cautious --model exact
python eval_v.py --locale en_GB --nlu cautious --model fuzzy
python eval_v.py --locale en_GB --nlu seeking --model random
python eval_v.py --locale en_GB --nlu seeking --model exact
python eval_v.py --locale en_GB --nlu seeking --model fuzzy

Early termination:

# Same commands as above with --thresh added, e.g. --thresh 0.0;
# replace 0.0 with the threshold for the desired security level.
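The early-termination runs reuse the verification commands with an extra --thresh argument. The sketch below spells those command lines out for a few thresholds. The flags come from the commands listed above; the threshold values other than 0.0 are placeholders rather than values from the paper, and verification_commands is a hypothetical helper, not part of this repo.

```python
from itertools import product

# NLU settings and models as listed in the verification commands above.
NLUS = ["cautious", "seeking"]
MODELS = ["random", "exact", "fuzzy"]


def verification_commands(locale="en_GB", thresholds=(0.0,)):
    """Return eval_v.py command strings for every nlu/model/threshold combination."""
    return [
        f"python eval_v.py --locale {locale} --nlu {nlu} --model {model} --thresh {thresh}"
        for nlu, model, thresh in product(NLUS, MODELS, thresholds)
    ]


# Placeholder thresholds, for illustration only.
for command in verification_commands(thresholds=(0.0, 0.5)):
    print(command)
```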

Identification Experiments

python eval_i.py --locale en_GB --nlu cautious --model none
python eval_i.py --locale en_GB --nlu seeking --model none
python eval_i.py --locale en_GB --nlu cautious --model exact-1
python eval_i.py --locale en_GB --nlu cautious --model fuzzy-1
python eval_i.py --locale en_GB --nlu seeking --model exact-1
python eval_i.py --locale en_GB --nlu seeking --model fuzzy-1
python eval_i.py --locale en_GB --nlu cautious --model exact-0.5
python eval_i.py --locale en_GB --nlu cautious --model fuzzy-0.5
python eval_i.py --locale en_GB --nlu seeking --model exact-0.5
python eval_i.py --locale en_GB --nlu seeking --model fuzzy-0.5
python eval_i.py --locale en_GB --nlu cautious --model oracle
python eval_i.py --locale en_GB --nlu seeking --model oracle

Analysis with KB oracle:

python eval_i.py --locale en_GB --nlu seeking --model none --kbo
python eval_i.py --locale en_GB --nlu seeking --model exact-1 --kbo
python eval_i.py --locale en_GB --nlu seeking --model fuzzy-1 --kbo
python eval_i.py --locale en_GB --nlu seeking --model exact-0.5 --kbo
python eval_i.py --locale en_GB --nlu seeking --model fuzzy-0.5 --kbo
python eval_i.py --locale en_GB --nlu seeking --model oracle --kbo

Citations

When using this dataset in your work, please cite our paper:

@inproceedings{Spithourakis2022evi,
    author      = {Georgios P. Spithourakis and Ivan Vuli\'{c} and Micha\l{} Lis and I\~{n}igo Casanueva and Pawe\l{} Budzianowski},
    title       = {{EVI}: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification},
    year        = {2022},
    note        = {Data available at https://github.com/PolyAI-LDN/evi-paper},
    url         = {https://arxiv.org/abs/2204.13496},
    booktitle   = {Findings of NAACL (publication pending)}
}

License

All code and data shared in this repository are licensed under the terms given in the LICENSE file.


Check out our other task-specific datasets here.