Frequency-Agnostic Word Representation

This is the code we used in our NIPS 2018 paper

Frequency-Agnostic Word Representation (Improving Word Embedding by Adversarial Training)

Chengyue Gong, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-yan Liu

Experiments

The hyper-parameters are set for PyTorch 0.4. The results in our paper are based on PyTorch 0.2, so some results here are better (AWD-LSTM) while others are worse (AWD-LSTM-MoS).

Performance may also vary across different GPUs.

Therefore, the guide below should produce results similar to the reported numbers, though perhaps not identical. If you have difficulty reproducing the final results, feel free to ask the first author for help (e-mail: cygong@pku.edu.cn).

Word level Penn Treebank (PTB) with AWD-LSTM

You can download the pretrained model and the code here: pretrained_model.

The PPL after finetuning is 57.7/55.8 (valid/test). The PPL after post-processing is 52.1/51.6 (valid/test).

Run the following commands:
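The original command block was lost from this page. As an illustrative sketch, the workflow follows the upstream salesforce/awd-lstm-lm scripts (`main.py`, `finetune.py`, `pointer.py`); the hyper-parameters below are the upstream PTB values, not necessarily the ones used for this paper, and the adversarial-training options specific to this repo are omitted. Prefer the commands shipped with the downloaded code.

```shell
# Train the base AWD-LSTM on PTB.
# Hyper-parameters follow the upstream awd-lstm-lm README (illustrative).
python main.py --batch_size 20 --data data/penn --dropouti 0.4 \
    --dropouth 0.25 --seed 141 --epoch 500 --save PTB.pt

# Finetune the saved model with the same hyper-parameters.
python finetune.py --batch_size 20 --data data/penn --dropouti 0.4 \
    --dropouth 0.25 --seed 141 --epoch 500 --save PTB.pt

# Post-process with the continuous cache pointer.
python pointer.py --data data/penn --save PTB.pt \
    --lambdasm 0.1 --theta 1.0 --window 500 --bptt 5000
```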

Word level WikiText-2 (WT2) with AWD-LSTM

Run the following commands:
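Again, the exact commands are not preserved here; the sketch below follows the upstream awd-lstm-lm WikiText-2 recipe, with hyper-parameter values that are illustrative assumptions rather than the ones used for this paper. Use the values in the released code.

```shell
# Train AWD-LSTM on WikiText-2 (upstream awd-lstm-lm hyper-parameters;
# illustrative only -- prefer the values shipped with the released code).
python main.py --epochs 750 --data data/wikitext-2 --save WT2.pt \
    --dropouth 0.2 --seed 1882

# Finetune the saved model.
python finetune.py --epochs 750 --data data/wikitext-2 --save WT2.pt \
    --dropouth 0.2 --seed 1882

# Continuous cache pointer; grid-search these values for each trained
# model, since the result is very sensitive to them.
python pointer.py --data data/wikitext-2 --save WT2.pt \
    --lambdasm 0.1279 --theta 0.662 --window 3785 --bptt 2000
```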

Note: For pointer.py, you may need to run a grid search for each trained model, since the result is very sensitive to the hyper-parameters.

Word level Penn Treebank (PTB) with AWD-LSTM-MoS

Warning: Dynamic evaluation contains some bugs under PyTorch 0.4 (you will hit the same problem if you run the original MoS code with PyTorch 0.4). For now, we suggest applying some patches and running it under an earlier version, e.g. PyTorch 0.2. Please check the issue tracker to see how to fix it.

For the PyTorch 0.4.0 code, detailed information can be found at https://github.com/ChengyueGongR/Frequency-Agnostic/issues/2.

We can now achieve 56.00/53.82 after finetuning (it's 55.51/53.31 in our paper).

You can download the pretrained model and the code here: pretrained_model. The path to the final model is ./pretrained_ptb/finetune_model.pt. (PyTorch 0.4)

You can download the pretrained model for PyTorch 0.2 here: pretrained_model.

Run the following commands:
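The command block is missing here as well. A hedged sketch, modeled on the upstream zihangdai/mos scripts (`main.py`, `finetune.py`, `dynamiceval.py`): the hyper-parameters are the upstream MoS PTB values and are illustrative, and `PATH_TO_FOLDER` is a placeholder for the experiment directory that training creates. The released code's own commands take precedence.

```shell
# Train AWD-LSTM-MoS on PTB (hyper-parameters follow the upstream
# zihangdai/mos README; illustrative -- use the released code's values).
python main.py --data data/penn --dropouti 0.4 --dropoutl 0.29 \
    --dropouth 0.225 --seed 28 --batch_size 12 --lr 20.0 --epoch 1000 \
    --nhid 960 --emsize 280 --n_experts 15 --save PTB --single_gpu

# Finetune from the saved experiment folder (placeholder path).
python finetune.py --data data/penn --dropouti 0.4 --dropoutl 0.29 \
    --dropouth 0.225 --seed 28 --batch_size 12 --lr 25.0 --epoch 1000 \
    --save PATH_TO_FOLDER --single_gpu

# Dynamic evaluation (see the warning above about PyTorch 0.4).
python dynamiceval.py --model PATH_TO_FOLDER/finetune_model.pt \
    --lamb 0.075
```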

Acknowledgements

A large portion of this repo is borrowed from the following repos: https://github.com/salesforce/awd-lstm-lm, https://github.com/zihangdai/mos, https://github.com/pytorch/fairseq and https://github.com/tensorflow/tensor2tensor.

Thanks to simtony, takase, and keli for their useful advice.