# Point-Then-Operate
This repository contains PyTorch implementations of the ACL 2019 paper "A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer". [paper] | [slides]
<p align="center"> <img src="static/Example.png" height="200" /> <img src="static/PTO.png" height="200" /> </p>

## Overview
- `data/yelp/` and `data/amazon/` are placeholder directories for the Yelp and the Amazon datasets, respectively.
- `PTO-yelp/` and `PTO-amazon/` contain implementations of the proposed method for the Yelp and the Amazon datasets, respectively.
- In the following lines, `{}` denotes *or*, e.g., `{yelp, amazon}` denotes `yelp` or `amazon`.
- All commands should be run in `PTO-{yelp, amazon}/`, instead of the root directory.
- `PTO-{yelp, amazon}/Experiment.py` builds the experiment pipeline, which has undergone incremental modifications and currently includes abstract lines of code (e.g., with heavy use of `getattr()` and `setattr()`). Thus, it is not recommended to rely heavily on the pipeline for your own projects.
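To see why that style is hard to reuse, here is a toy illustration (not code from this repository) of the dynamic-attribute pattern that `Experiment.py` leans on via `getattr()` and `setattr()`:

```python
class Experiment:
    """Toy illustration (not the repository's pipeline) of wiring
    components dynamically with setattr()/getattr()."""

    def __init__(self, module_names):
        # Attributes are created from strings at runtime, so editors and
        # static analysis cannot track them -- the pattern the README
        # warns about.
        for name in module_names:
            setattr(self, name, lambda x, tag=name: f"{tag}({x})")

    def run(self, module_name, x):
        # The component to call is also looked up by string.
        return getattr(self, module_name)(x)

exp = Experiment(["pointer", "operator"])
print(exp.run("pointer", "sentence"))  # pointer(sentence)
```

Because both the attributes and the call sites are resolved from strings, renaming a component silently breaks the pipeline, which is why building on it is discouraged.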
## Dependencies
- Python 3.6.5
- Versions of all dependent libraries are specified in `requirements.txt`. To reproduce the reported results, please make sure that the specified versions are installed. **Update 28/08/2019:** We received a security vulnerability alert from GitHub for the specified version of `nltk`.
- System outputs are evaluated with the Moses BLEU script `multi-bleu.perl`. Download the script and put it into `PTO-{yelp, amazon}/utils/`.
- Run on a single NVIDIA GeForce GTX 1080 Ti.
- CUDA 10.0
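To double-check that the pinned versions are the ones actually installed, a small check along these lines can help on modern Python (3.8+, for `importlib.metadata`); the requirement string below is illustrative only — read the real pins from `requirements.txt`:

```python
from importlib.metadata import version, PackageNotFoundError

def check_pinned(requirements):
    """Compare 'pkg==x.y.z' pins against the installed versions.

    Returns a list of (name, pinned, installed) tuples for every
    mismatch; `installed` is None when the package is absent.
    """
    mismatches = []
    for line in requirements:
        name, _, pinned = line.partition("==")
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches.append((name, pinned, installed))
    return mismatches

# Illustrative pin only -- use the lines from requirements.txt.
print(check_pinned(["definitely-not-installed==1.0.0"]))
# [('definitely-not-installed', '1.0.0', None)]
```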
## Usage

### Test with Pre-Trained Models
- Download preliminary files from Yelp preliminaries and Amazon preliminaries, which include:
    - Pre-trained language models: `{yelp, amazon}_{0, 1}_{forward, backward}.pt`. Place them in `PTO-{yelp, amazon}/LM/saved_models/`.
    - Pre-trained TextCNN: `best-TextCNN-{yelp, amazon}-Emb.ckpt` and `best-TextCNN-{yelp, amazon}.ckpt`. Place them in `PTO-{yelp, amazon}/utils/`.
    - Vocabulary: `{yelp, amazon}.vocab`. Place them in `data/{yelp, amazon}/`.
- Download the pre-processed datasets from Yelp dataset and Amazon dataset, which include:
    - Non-aligned text files for both styles: `sentiment.{train, dev, test}.{0, 1}`. Place them in `data/{yelp, amazon}/`.
    - Human-written references for the test split: `reference.{0, 1}`. Place them in `data/{yelp, amazon}/`.
- Download pre-trained models from Yelp pre-trained and Amazon pre-trained, which include:
    - The pointer, the operators, and the additional classifier: ten `.ckpt` files in total. Place them in `PTO-{yelp, amazon}/pretrained/`.
- Run `python3 test.py` in `PTO-{yelp, amazon}/`.
    - Evaluation results, i.e., classification accuracy and BLEU score, are printed on the screen and should be exactly the same as those reported in the paper.
    - System outputs are saved under `PTO-{yelp, amazon}/outputs/sampled_results/`, which include:
        - Negative-to-positive outputs: `sentiment.test.0.ours`
        - Positive-to-negative outputs: `sentiment.test.1.ours`
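Conceptually, the reported classification accuracy is just the fraction of transferred sentences that the evaluation classifier (the pre-trained TextCNN) assigns to the target style. A minimal sketch with a hypothetical helper, not the repository's API:

```python
def transfer_accuracy(predicted_labels, target_label):
    """Fraction of system outputs classified as the target style.

    predicted_labels: style labels (0 or 1) that the evaluation
                      classifier assigns to each transferred sentence.
    target_label:     the style the transfer was supposed to produce.
    """
    hits = sum(1 for label in predicted_labels if label == target_label)
    return hits / len(predicted_labels)

# Toy example: 3 of 4 negative->positive outputs judged positive (label 1).
print(transfer_accuracy([1, 1, 0, 1], target_label=1))  # 0.75
```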
- Reminder: `reference.{0, 1}` aligns each test sample to its human reference. However, for the Amazon dataset, the order of sentences in `reference.{0, 1}` is not consistent with that in `sentiment.test.{0, 1}`. We fix this in `PTO-amazon/dataloaders/amazon.py`. Take care if you are using the Amazon dataset for your own project.
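As a toy illustration of the alignment problem (the actual index mapping lives in `PTO-amazon/dataloaders/amazon.py` and is not reproduced here): if you know the permutation that maps test-sentence order to reference order, the references can be reordered once, up front:

```python
def realign_references(references, order):
    """Reorder `references` so references[i] matches test sentence i.

    order[i] is the position in the reference file that corresponds to
    the i-th test sentence. The permutation below is made up for the
    example; the real one is dataset-specific.
    """
    return [references[order[i]] for i in range(len(references))]

refs = ["ref for sent 2", "ref for sent 0", "ref for sent 1"]
order = [1, 2, 0]  # hypothetical permutation
print(realign_references(refs, order))
# ['ref for sent 0', 'ref for sent 1', 'ref for sent 2']
```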
### Train the Models
- Train language models (optional, and not recommended since they are only used as extrinsic rewards):
    - Set the flags `self.sentiment` and `self.direction` in `PTO-{yelp, amazon}/LM/lm_config.py`.
    - Run `python3 language_model.py` in `PTO-{yelp, amazon}/`.
    - Repeat the previous two steps for each combination of `self.sentiment` and `self.direction`.
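The repetition over flag combinations amounts to four training runs, one per (sentiment, direction) pair. The values below mirror the checkpoint naming pattern `{yelp, amazon}_{0, 1}_{forward, backward}.pt`; whether `lm_config.py` encodes them exactly this way is an assumption:

```python
from itertools import product

sentiments = [0, 1]
directions = ["forward", "backward"]

# One language-model training run per combination.
for sentiment, direction in product(sentiments, directions):
    print(f"set self.sentiment={sentiment}, self.direction={direction!r}; "
          f"then run: python3 language_model.py")
```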
- Train TextCNN (optional, and strongly not recommended since TextCNN is used for evaluation):
    - Run `python3 classifier.py` in `PTO-{yelp, amazon}/`.
- Train the additional classifier (optional, and not recommended since it is used only for inference):
    - Set the flag `self.train_mode` in `PTO-{yelp, amazon}/config.py` to `aux-cls-only`.
    - Run `python3 train.py` in `PTO-{yelp, amazon}/`.
- Pre-train the pointer (obligatory, since it is a necessary preparation for the following HRL training):
    - Set the flag `self.train_mode` in `PTO-{yelp, amazon}/config.py` to `cls-only`.
    - Run `python3 train.py` in `PTO-{yelp, amazon}/`.
- Jointly train the pointer and the operators with hierarchical reinforcement learning (HRL):
    - Move the pre-trained pointer and the additional classifier, along with their embeddings, from `PTO-{yelp, amazon}/outputs/saved_models` to `PTO-{yelp, amazon}/pretrained/`, and change their prefixes from `best-` to `pretrained-`.
    - Set the flag `self.train_mode` in `PTO-{yelp, amazon}/config.py` to `pto`.
    - Run `python3 train.py` in `PTO-{yelp, amazon}/`.
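The move-and-rename step can be scripted. A hedged sketch using `pathlib` (directory names follow this README; the demo below runs on a throwaway directory, and it copies rather than moves, which is safer while experimenting):

```python
from pathlib import Path
import shutil, tempfile

def pretrained_name(name):
    """Swap the 'best-' prefix for 'pretrained-' (other names unchanged)."""
    if name.startswith("best-"):
        return "pretrained-" + name[len("best-"):]
    return name

def promote_checkpoints(saved_dir, pretrained_dir):
    """Copy every best-* file into pretrained_dir under its new name."""
    pretrained_dir = Path(pretrained_dir)
    pretrained_dir.mkdir(parents=True, exist_ok=True)
    for ckpt in Path(saved_dir).glob("best-*"):
        shutil.copy2(ckpt, pretrained_dir / pretrained_name(ckpt.name))

# Demo on a temporary directory; for the real thing, point it at
# PTO-{yelp, amazon}/outputs/saved_models and PTO-{yelp, amazon}/pretrained/.
root = Path(tempfile.mkdtemp())
(root / "saved_models").mkdir()
(root / "saved_models" / "best-Pointer.ckpt").touch()
promote_checkpoints(root / "saved_models", root / "pretrained")
print(sorted(p.name for p in (root / "pretrained").iterdir()))
# ['pretrained-Pointer.ckpt']
```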
## Citation
Please cite our ACL paper if this repository inspired your work.
```
@inproceedings{WuRLS19,
  author    = {Chen Wu and
               Xuancheng Ren and
               Fuli Luo and
               Xu Sun},
  title     = {A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer},
  booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics, {ACL} 2019, Florence, Italy, July 28 - August 2, 2019, Volume 1: Long Papers},
  pages     = {4873--4883},
  year      = {2019},
  url       = {https://www.aclweb.org/anthology/P19-1482/}
}
```
## Contact
- If you have any questions regarding the code, please create an issue or contact the owner of this repository.