Profile Consistency Identification for Open-domain Dialogue Agents

<img src="_static/pytorch-logo.png" width="10%"> <img src="https://www.apache.org/img/ASF20thAnniversary.jpg" width="6%">

<img align="right" src="_static/scir.png" width="20%">

This repository contains resources for the EMNLP 2020 main conference paper:

Profile Consistency Identification for Open-domain Dialogue Agents. [Paper]

The code is ready to run, and all resources are available.

Shortcuts

Resources

Introduction

The following example illustrates what understanding profile consistency involves. The table on the left shows the profile, which consists of several key-value pairs. On the right is an open-domain dialogue session with an input message and two different responses:

<p align="center"> <img src="_static/introduction.png" width="50%" /> </p>

Both responses incorporate the location word, Beijing, from the given profile. The first response, marked in green, welcomes others to visit the speaker's city, which indicates that the speaker is currently in Beijing. It is therefore consistent with the given profile. The response marked in red, however, expresses the hope of visiting Beijing someday, which indicates that the speaker has never been to Beijing. This response contradicts the profile.

Humans can easily tell these responses apart, but current machines can hardly do so. This work is intended to address that issue.

KvPI Dataset

1. EXAMPLES

<p align="center"> <img src="_static/kvpi.png" width="85%" /> </p>

2. GUIDEPOST

Here are some explanations for the above example:

| Elements | Explanations |
| --- | --- |
| Profile | Attribute information of the respondent, including three groups of attributes: gender, location, and constellation. |
| Post | Input information in a single-turn dialogue. Notice that the speaker on this side is not profiled. |
| Response | Responses in a single-turn dialogue. It contains attribute related information, but not necessarily related to the response speaker's own attributes. |
| Domain | Attribute field to which the dialogue response belongs. |
| Annotated Attributes | Human-extracted attribute information from the dialogue responses. Different from the given profile under some circumstances. |
| Label | Human annotated labels for consistency relations between Profile and Response, including Irrelevant, Entailed, and Contradicted. For details of the consistency relations please refer to the next section. |
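
To make the elements above concrete, here is a minimal sketch of how one KvPI example could be held in memory. The class name `KvPIExample`, its field names, the `validate` helper, and every value in the sample are hypothetical and chosen for illustration; they are not the repository's actual file format or a record from the dataset.

```python
from typing import Dict, NamedTuple

# The three consistency labels and three attribute groups named in the table.
LABELS = ("Entailed", "Contradicted", "Irrelevant")
PROFILE_KEYS = ("gender", "location", "constellation")


class KvPIExample(NamedTuple):
    """One single-turn example (hypothetical layout, for illustration only)."""
    profile: Dict[str, str]               # key-value attributes of the respondent
    post: str                             # input message; its speaker is not profiled
    response: str                         # reply whose consistency is judged
    domain: str                           # attribute field the response belongs to
    annotated_attributes: Dict[str, str]  # attributes humans extracted from the response
    label: str                            # one of LABELS


def validate(example: KvPIExample) -> KvPIExample:
    """Raise ValueError if the label or a profile key is not one listed above."""
    if example.label not in LABELS:
        raise ValueError("unknown label: %r" % (example.label,))
    unknown = set(example.profile) - set(PROFILE_KEYS)
    if unknown:
        raise ValueError("unknown profile keys: %s" % sorted(unknown))
    return example


# Made-up sample loosely following the Beijing example in the introduction.
example = validate(KvPIExample(
    profile={"gender": "male", "location": "Beijing", "constellation": "Aries"},
    post="I would love to travel somewhere new this year.",
    response="You are welcome to visit us here in Beijing.",
    domain="location",
    annotated_attributes={"location": "Beijing"},
    label="Entailed",
))
print(example.label)
```

`typing.NamedTuple` is used instead of a dataclass so that the sketch also runs on Python 3.6.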

3. DEFINITIONS OF CONSISTENCY RELATIONS

KvBERT

<p align="center"> <img src="_static/kvbert.png" width="80%" /> </p>

How to Run

Requirements

The released code has been tested in the following environment:

Higher cudatoolkit versions may cause unexpected errors. The PyTorch and Python dependencies can be installed in an Anaconda virtual environment. For example:

conda create -n kvpi python=3.6
conda activate kvpi
conda install pytorch=1.3.0 torchvision cudatoolkit=9.2 -c pytorch

Then in your environment install the following dependencies:

pip install scikit-learn
pip install tqdm

scikit-learn (imported as sklearn) is used to compute the F1 score and accuracy. tqdm is a library that provides the progress bar.
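
Before running anything, it can help to confirm that the environment actually sees these packages. The following is a small sketch, not part of the released code; the helper name `missing_modules` is made up, and the module list assumes the import names `torch`, `sklearn`, and `tqdm`.

```python
import importlib.util

# Import names of the dependencies installed above (assumed import names).
REQUIRED_MODULES = ("torch", "sklearn", "tqdm")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

The check only looks the modules up and does not import them, so it stays fast even when PyTorch is installed.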

Usage

First download the following data and put it into the ./ckpt folder:

Also make sure the data folder contains the KvPI_test.txt file, which is already in the repository and is formatted so that the model can read it.
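
The two preconditions above can be checked with a short script before starting inference. This is a sketch under assumptions: the helper `missing_inputs` is hypothetical, and it assumes that `ckpt` and `data/KvPI_test.txt` sit directly under the repository root.

```python
from pathlib import Path


def missing_inputs(repo_root="."):
    """Return a list of problems with the inputs the inference step expects."""
    root = Path(repo_root)
    problems = []

    # The downloaded files are expected inside ./ckpt (assumed relative to repo_root).
    ckpt_dir = root / "ckpt"
    if not ckpt_dir.is_dir() or not any(ckpt_dir.iterdir()):
        problems.append("ckpt/ is missing or empty")

    # The test file ships with the repository in the data folder.
    if not (root / "data" / "KvPI_test.txt").is_file():
        problems.append("data/KvPI_test.txt not found")

    return problems


for problem in missing_inputs("."):
    print("Problem:", problem)
```

An empty result means both inputs are in place; it does not verify that the files in `ckpt` are the right ones.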

Then run the following script:

./inference.sh

Running the script makes predictions on the test data and redirects the output to test_prediction.txt. When prediction finishes, the script calls f1_acc.py to report the final scores. At the end, you should see something like:

              precision    recall  f1-score   support

    Entailed      0.927     0.939     0.933      5116
Contradicted      0.902     0.918     0.910      3041
  Irrelevant      0.920     0.882     0.901      2843

    accuracy                          0.918     11000
   macro avg      0.917     0.913     0.915     11000
weighted avg      0.919     0.918     0.918     11000

0.9184545454545454
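
To clarify how the numbers in such a report relate to each other, here is a minimal pure-Python sketch of the underlying arithmetic. It is not the repository's f1_acc.py, which according to the requirements relies on sklearn; the function names and the four-example toy data are made up for illustration. The report layout above resembles the one produced by scikit-learn's `classification_report`, and the final number is the overall accuracy.

```python
LABELS = ["Entailed", "Contradicted", "Irrelevant"]


def per_class_scores(gold, pred, labels=LABELS):
    """Return {label: (precision, recall, f1, support)} for each label."""
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        predicted = sum(1 for p in pred if p == label)  # times the model chose label
        support = sum(1 for g in gold if g == label)    # gold examples with label
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, support)
    return scores


def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_and_weighted_f1(scores):
    """Macro avg is the plain mean of F1; weighted avg weights F1 by support."""
    total = sum(s[3] for s in scores.values())
    macro = sum(s[2] for s in scores.values()) / len(scores)
    weighted = sum(s[2] * s[3] for s in scores.values()) / total
    return macro, weighted


# Toy data, invented for this sketch (not taken from KvPI).
gold = ["Entailed", "Entailed", "Contradicted", "Irrelevant"]
pred = ["Entailed", "Contradicted", "Contradicted", "Irrelevant"]

scores = per_class_scores(gold, pred)
print(scores)
print(accuracy(gold, pred), macro_and_weighted_f1(scores))
```

The same relations hold in the report above: the support column sums to the 11000 test examples, and the weighted average gives the Entailed class the largest influence because it has the most examples.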

What Can We Do Using KvPI?

Details will be updated later.

MISC