Conversational Word Embedding for Retrieval-based Dialog System
This repository contains resources for the following ACL 2020 paper.
Title: Conversational Word Embedding for Retrieval-based Dialog System
Authors: Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu
Link: https://www.aclweb.org/anthology/2020.acl-main.127/
News
2020/7/16 We have released the code, and the data for the personachat example is in data_for_personachat (password vzoQ). You can clone the code and download the data to try it out.
2020/7/6 We have uploaded the Chinese PR-Embeddings trained on Zhidao (password g3FK) and Weibo (password Yz6H): Zhidao Embedding | Weibo Embedding. The Zhidao Embedding is the one used in the experiments in the paper.
Requirements
gcc4.8.5 (or >=4.8.5)
Python3.6
Keras2.1.2 (or >=2.1)
Tensorflow1.12.0 (or >=1.12)
(We ran the code with gcc4.8.5 + Python3.6 + Keras2.1.2 + Tensorflow1.12.0)
Quick Start
- Prepare the corpus: each line has two columns corresponding to a conversation pair, for example:
Hi , how are you \t I am fine , Thank you !
- Train your PR-Embedding:
sh train.sh $corpus_file $embedding_file
$corpus_file is the corpus file from step 1, and $embedding_file is the output PR-Embedding file.
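The corpus format described in the Quick Start (one conversation pair per line, two columns separated by a tab) can be produced with a short script. The following is only a minimal sketch and is not part of this repository: the function names write_corpus and read_corpus are hypothetical, and it assumes that the texts are already tokenized with spaces, as in the example line above.

```python
def write_corpus(pairs, path):
    """Write (post, reply) pairs to `path`, one tab-separated pair per line.

    Hypothetical helper, not part of this repository. Assumes both texts
    are already tokenized with spaces.
    """
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        for post, reply in pairs:
            # Collapse all whitespace (including stray tabs and newlines)
            # so the only tab on each line is the column separator.
            post = " ".join(post.split())
            reply = " ".join(reply.split())
            if not post or not reply:
                continue  # skip pairs with an empty side
            f.write(post + "\t" + reply + "\n")
            written += 1
    return written


def read_corpus(path):
    """Read the file back and check that every line has exactly two columns."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            columns = line.rstrip("\n").split("\t")
            if len(columns) != 2:
                raise ValueError(
                    "line %d has %d columns, expected 2" % (line_no, len(columns))
                )
            pairs.append((columns[0], columns[1]))
    return pairs


if __name__ == "__main__":
    n = write_corpus(
        [("Hi , how are you", "I am fine , Thank you !")],
        "corpus.txt",
    )
    print("wrote", n, "pairs")
```

The resulting corpus.txt could then be passed as $corpus_file to train.sh. Reading the file back with read_corpus is a cheap way to catch malformed lines before starting training.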
Citation
If you use the data or code in this repository, please cite our paper:
@inproceedings{ma-etal-2020-conversational,
title = "{C}onversational {W}ord {E}mbedding for {R}etrieval-{B}ased {D}ialog {S}ystem",
author = "Ma, Wentao and
Cui, Yiming and
Liu, Ting and
Wang, Dong and
Wang, Shijin and
Hu, Guoping",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.127",
pages = "1375--1380",
}