# Variational Attention
Implementation of 'Variational Attention for Sequence to Sequence Models' in TensorFlow.
## Overview
This package consists of 3 models, each of which has been organized into a separate folder:

- Deterministic encoder-decoder with deterministic attention (`ded_detAttn`)
- Variational encoder-decoder with deterministic attention (`ved_detAttn`)
- Variational encoder-decoder with variational attention (`ved_varAttn`)
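
For intuition: the variational variant models the attention context vector as a latent Gaussian variable instead of a deterministic weighted sum, sampled with the reparameterization trick and regularized by a KL term. Below is a minimal TF 1.x sketch of that idea, assuming dot-product attention scores and a standard normal prior; it is illustrative only, not the actual code in `ved_varAttn`.

```python
import tensorflow as tf

def variational_attention(encoder_states, decoder_state, latent_dim):
    """Illustrative sketch: sample the context vector c ~ N(mu, sigma^2)
    via the reparameterization trick (names and shapes are assumptions).
    encoder_states: [batch, src_len, hidden]; decoder_state: [batch, hidden]."""
    # Deterministic dot-product attention as the base.
    scores = tf.matmul(encoder_states, tf.expand_dims(decoder_state, -1))  # [batch, src_len, 1]
    weights = tf.nn.softmax(scores, dim=1)
    c_det = tf.reduce_sum(weights * encoder_states, axis=1)                # [batch, hidden]

    # Posterior parameters predicted from the deterministic context.
    mu = tf.layers.dense(c_det, latent_dim)
    log_var = tf.layers.dense(c_det, latent_dim)

    # Reparameterization: c = mu + sigma * eps, with eps ~ N(0, I).
    eps = tf.random_normal(tf.shape(mu))
    c_sampled = mu + tf.exp(0.5 * log_var) * eps

    # KL divergence from the posterior to the N(0, I) prior; added to the loss.
    kl = -0.5 * tf.reduce_sum(1.0 + log_var - tf.square(mu) - tf.exp(log_var), axis=1)
    return c_sampled, kl
```

Training then minimizes the usual cross-entropy reconstruction loss plus a weighted KL term.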
## Datasets
The proposed model and baselines have been evaluated on two experiments:

- Neural Question Generation with the SQuAD dataset
- Conversation Systems with the Cornell Movie Dialogue dataset

The data has been preprocessed and the train-val-test split is provided in the `data/` directory.
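
To inspect a split, something along these lines should work; the filename below is hypothetical, so check the actual contents of `data/`.

```python
import pandas as pd

# Hypothetical filename -- check data/ for the actual split files.
train_df = pd.read_csv('data/train.csv')
print(train_df.head())
```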
## Requirements
- tensorflow-gpu==1.3.0
- Keras==2.0.8
- numpy==1.12.1
- pandas==0.22.0
- gensim==3.1.2
- nltk==3.2.3
- tqdm==4.19.1
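
These can be installed directly with pip, pinning the versions above (a `requirements.txt`, if the repo ships one, would be equivalent):

```
pip install tensorflow-gpu==1.3.0 Keras==2.0.8 numpy==1.12.1 pandas==0.22.0 gensim==3.1.2 nltk==3.2.3 tqdm==4.19.1
```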
## Instructions
- Generate word2vec, required for initializing word embeddings, specifying the dataset (a standalone sketch of this step is given after this list):

  ```
  python w2v_generator.py --dataset qgen
  ```

- Train the desired model; set configurations in the `model_config.py` file. For example:

  ```
  cd ved_varAttn
  vim model_config.py # Make necessary edits
  python train.py
  ```
- The model checkpoints are stored in the `models/` directory, and the summaries for TensorBoard are stored in the `summary_logs/` directory. As training progresses, the metrics on the validation set are dumped into `log.txt` and the `bleu/` directory.
- Evaluate the performance of the trained model. Refer to `predict.ipynb` to load the desired checkpoint, calculate performance metrics (BLEU and diversity score) on the test set, and generate sample outputs; a sketch of these metrics follows below.
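
As a sketch of the word2vec step above: `w2v_generator.py` presumably trains embeddings with gensim (which is in the requirements); the corpus, dimensionality, and output path below are made-up placeholders.

```python
from gensim.models import Word2Vec

# Made-up tokenized corpus; the real one comes from the chosen dataset.
corpus = [['what', 'is', 'variational', 'attention'],
          ['sequence', 'to', 'sequence', 'models']]

model = Word2Vec(corpus, size=300, window=5, min_count=1, workers=4)
model.wv.save_word2vec_format('w2v_qgen.bin', binary=True)  # hypothetical output path
```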
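
And a minimal sketch of the evaluation metrics, using nltk for BLEU and a common 'distinct-n' definition of diversity; the exact formulations in `predict.ipynb` may differ, and the example sentences are made up.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

references = [[['what', 'is', 'variational', 'attention']]]  # one list of references per hypothesis
hypotheses = [['what', 'is', 'variational', 'attention']]

bleu = corpus_bleu(references, hypotheses,
                   smoothing_function=SmoothingFunction().method1)
print('BLEU: %.4f' % bleu)

def distinct_n(sentences, n):
    """Diversity as the ratio of unique n-grams to total n-grams."""
    total, unique = 0, set()
    for sent in sentences:
        for i in range(len(sent) - n + 1):
            total += 1
            unique.add(tuple(sent[i:i + n]))
    return len(unique) / max(total, 1)

print('Distinct-1: %.4f' % distinct_n(hypotheses, 1))
print('Distinct-2: %.4f' % distinct_n(hypotheses, 2))
```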