# Variational Attention
Implementation of 'Variational Attention for Sequence to Sequence Models' in TensorFlow.
## Overview
This package consists of 3 models, each of which has been organized into a separate folder:

- Deterministic encoder-decoder with deterministic attention (`ded_detAttn`)
- Variational encoder-decoder with deterministic attention (`ved_detAttn`)
- Variational encoder-decoder with variational attention (`ved_varAttn`)
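
For intuition: the variational variant models the attention context vector as a latent Gaussian variable instead of a deterministic weighted sum, sampled with the reparameterization trick and regularized by a KL term. Below is a minimal TF 1.x sketch of that idea, assuming dot-product attention scores and a standard normal prior; it is illustrative only, not the actual code in `ved_varAttn`.

```python
import tensorflow as tf

def variational_attention(encoder_states, decoder_state, latent_dim):
    """Illustrative sketch: sample the context vector c ~ N(mu, sigma^2)
    via the reparameterization trick (names and shapes are assumptions).
    encoder_states: [batch, src_len, hidden]; decoder_state: [batch, hidden]."""
    # Deterministic dot-product attention as the base.
    scores = tf.matmul(encoder_states, tf.expand_dims(decoder_state, -1))  # [batch, src_len, 1]
    weights = tf.nn.softmax(scores, dim=1)
    c_det = tf.reduce_sum(weights * encoder_states, axis=1)                # [batch, hidden]

    # Posterior parameters predicted from the deterministic context.
    mu = tf.layers.dense(c_det, latent_dim)
    log_var = tf.layers.dense(c_det, latent_dim)

    # Reparameterization: c = mu + sigma * eps, with eps ~ N(0, I).
    eps = tf.random_normal(tf.shape(mu))
    c_sampled = mu + tf.exp(0.5 * log_var) * eps

    # KL divergence from the posterior to the N(0, I) prior; added to the loss.
    kl = -0.5 * tf.reduce_sum(1.0 + log_var - tf.square(mu) - tf.exp(log_var), axis=1)
    return c_sampled, kl
```

Training then minimizes the usual cross-entropy reconstruction loss plus a weighted KL term.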
## Datasets
The proposed model and baselines have been evaluated on two experiments:

- Neural Question Generation with the SQuAD dataset
- Conversation Systems with the Cornell Movie Dialogue dataset

The data has been preprocessed and the train-val-test split is provided in the `data/` directory.
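
To inspect a split, something along these lines should work; the filename below is hypothetical, so check the actual contents of `data/`.

```python
import pandas as pd

# Hypothetical filename -- check data/ for the actual split files.
train_df = pd.read_csv('data/train.csv')
print(train_df.head())
```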
## Requirements
- tensorflow-gpu==1.3.0
- Keras==2.0.8
- numpy==1.12.1
- pandas==0.22.0
- gensim==3.1.2
- nltk==3.2.3
- tqdm==4.19.1
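
These can be installed directly with pip, pinning the versions above (a `requirements.txt`, if the repo ships one, would be equivalent):

```
pip install tensorflow-gpu==1.3.0 Keras==2.0.8 numpy==1.12.1 pandas==0.22.0 gensim==3.1.2 nltk==3.2.3 tqdm==4.19.1
```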
## Instructions
- Generate word2vec, required for initializing word embeddings, specifying the dataset (a standalone sketch of this step is given after this list):

  ```
  python w2v_generator.py --dataset qgen
  ```

- Train the desired model; set configurations in the `model_config.py` file. For example:

  ```
  cd ved_varAttn
  vim model_config.py # Make necessary edits
  python train.py
  ```
- The model checkpoints are stored in the `models/` directory, and the summaries for TensorBoard are stored in the `summary_logs/` directory. As training progresses, the metrics on the validation set are dumped into `log.txt` and the `bleu/` directory.
- Evaluate the performance of the trained model. Refer to `predict.ipynb` to load the desired checkpoint, calculate performance metrics (BLEU and diversity score) on the test set, and generate sample outputs; a sketch of these metrics follows below.
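
As a sketch of the word2vec step above: `w2v_generator.py` presumably trains embeddings with gensim (which is in the requirements); the corpus, dimensionality, and output path below are made-up placeholders.

```python
from gensim.models import Word2Vec

# Made-up tokenized corpus; the real one comes from the chosen dataset.
corpus = [['what', 'is', 'variational', 'attention'],
          ['sequence', 'to', 'sequence', 'models']]

model = Word2Vec(corpus, size=300, window=5, min_count=1, workers=4)
model.wv.save_word2vec_format('w2v_qgen.bin', binary=True)  # hypothetical output path
```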
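
And a minimal sketch of the evaluation metrics, using nltk for BLEU and a common 'distinct-n' definition of diversity; the exact formulations in `predict.ipynb` may differ, and the example sentences are made up.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

references = [[['what', 'is', 'variational', 'attention']]]  # one list of references per hypothesis
hypotheses = [['what', 'is', 'variational', 'attention']]

bleu = corpus_bleu(references, hypotheses,
                   smoothing_function=SmoothingFunction().method1)
print('BLEU: %.4f' % bleu)

def distinct_n(sentences, n):
    """Diversity as the ratio of unique n-grams to total n-grams."""
    total, unique = 0, set()
    for sent in sentences:
        for i in range(len(sent) - n + 1):
            total += 1
            unique.add(tuple(sent[i:i + n]))
    return len(unique) / max(total, 1)

print('Distinct-1: %.4f' % distinct_n(hypotheses, 1))
print('Distinct-2: %.4f' % distinct_n(hypotheses, 2))
```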