XPersona: Evaluating Multilingual Personalized Chatbot

<img src="plot/pytorch-logo-dark.png" width="10%"> License: MIT

<img align="right" src="plot/HKUST.jpg" width="12%">

This is the source code of the paper:

XPersona: Evaluating Multilingual Personalized Chatbot. [PDF]

The paper was presented at the NLP4ConvAI workshop at EMNLP 2021 (Honorable Mention Paper). The code is written in PyTorch. If you use the source code or datasets included in this toolkit in your work, please cite the following papers:

XPersona

<pre>
@inproceedings{lin2021xpersona,
  title={XPersona: Evaluating Multilingual Personalized Chatbot},
  author={Lin, Zhaojiang and Liu, Zihan and Winata, Genta Indra and Cahyawijaya, Samuel and Madotto, Andrea and Bang, Yejin and Ishii, Etsuko and Fung, Pascale},
  booktitle={Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI},
  pages={102--112},
  year={2021}
}
</pre>

English PersonaChat

<pre>
@article{zhang2018personalizing,
  title={Personalizing Dialogue Agents: I have a dog, do you have pets too?},
  author={Zhang, Saizheng and Dinan, Emily and Urbanek, Jack and Szlam, Arthur and Kiela, Douwe and Weston, Jason},
  journal={arXiv preprint arXiv:1801.07243},
  year={2018}
}
</pre>

Dataset

<p align="center"> <img src="plot/dataset.png" width="80%" /> </p>

The XPersona dataset is an extension of the Persona-Chat dataset. Specifically, we extend ConvAI2 to six other languages: Chinese, French, Indonesian, Italian, Korean, and Japanese.

Baselines

<p align="center"> <img src="plot/baseline.png" width="80%" /> </p>

In this work, we provide baselines trained in multilingual and cross-lingual settings. See the multilingual and crosslingual folders for more details.

Acknowledgement

This repository is implemented using the Hugging Face codebase.