Efficient Softmax Approximation

Implementations of BlackOut and Adaptive Softmax for efficiently computing the output word distribution in language models with very large vocabularies.

LSTM language models are derived from rnnlm_chainer.

The following output layers are available:

Adaptive Softmax

BlackOut
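
As a rough illustration of the idea behind Adaptive Softmax, the sketch below splits a toy vocabulary into a "head" of frequent words and a single "tail" cluster of rare words, so that frequent words need only the small head softmax. It is not the Chainer code in this repository: the function names (`log_softmax`, `adaptive_log_prob`), the single tail cluster, and the hand-written score lists are assumptions made for the example, and the projection of the LSTM hidden state into scores is left out.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def adaptive_log_prob(word_id, head_scores, tail_scores, cutoff):
    """Log-probability of word_id under a two-level (head + one tail) softmax.

    head_scores holds cutoff + 1 values: one per frequent word, plus a final
    slot that stands for "the word is somewhere in the tail cluster".
    tail_scores holds one value per rare word (ids cutoff, cutoff + 1, ...).
    """
    head = log_softmax(head_scores)
    if word_id < cutoff:
        # Frequent word: only the small head softmax is needed.
        return head[word_id]
    # Rare word: log p(tail cluster) + log p(word | tail cluster).
    tail = log_softmax(tail_scores)
    return head[cutoff] + tail[word_id - cutoff]


# Toy setup (hypothetical numbers): 3 frequent words and 5 rare words.
cutoff = 3
head_scores = [2.0, 1.0, 0.5, -1.0]          # last entry is the tail-cluster slot
tail_scores = [0.3, -0.2, 0.1, 0.0, -0.5]
vocab_size = cutoff + len(tail_scores)

# The two-level factorization still defines a proper distribution.
total = sum(
    math.exp(adaptive_log_prob(w, head_scores, tail_scores, cutoff))
    for w in range(vocab_size)
)
print(f"vocabulary size: {vocab_size}, total probability: {total:.6f}")
```

The saving comes from the first branch: when most tokens in a batch are frequent words, the tail softmax is evaluated only for the few tokens that fall into it.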

How to Run

python -u train.py -g 0

Datasets

For WikiText, run prepare_wikitext.sh to download the datasets.