XLNet-Pytorch (arXiv:1906.08237)

Simple XLNet implementation with a PyTorch wrapper!

You can see how the XLNet architecture works in pre-training with a small batch size (=1) example.

Usage

$ git clone https://github.com/graykode/xlnet-Pytorch && cd xlnet-Pytorch

# To use the SentencePiece tokenizer (pretrained BERT tokenizer)
$ pip install pytorch_pretrained_bert

$ python main.py --data ./data.txt --tokenizer bert-base-uncased \
   --seq_len 512 --reuse_len 256 --perm_size 256 \
   --bi_data True --mask_alpha 6 --mask_beta 1 \
   --num_predict 85 --mem_len 384 --num_epoch 100

You can also run the code easily in Google Colab.
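For reference, here is a minimal sketch (an illustration, not code taken from this repository) of how the pretrained BERT tokenizer named by `--tokenizer` is typically loaded and applied to a line of text with `pytorch_pretrained_bert`:

```python
# Minimal sketch (illustrative, not this repo's code): load the pretrained
# BERT tokenizer selected by --tokenizer and tokenize one line of input text.
from pytorch_pretrained_bert import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

line = "XLNet uses a permutation language modeling objective."
tokens = tokenizer.tokenize(line)                     # WordPiece sub-tokens
token_ids = tokenizer.convert_tokens_to_ids(tokens)   # integer ids fed to the model
print(tokens)
print(token_ids)
```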

<p align="center"><img width="300" src="images/hyperparameters.png" /> </p> #### Option

What is XLNet?

XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective. Additionally, XLNet employs Transformer-XL as the backbone model, exhibiting excellent performance for language tasks involving long context.
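The permutation language modeling objective mentioned above, as formulated in the paper (arXiv:1906.08237), maximizes the expected log-likelihood of a sequence over all possible factorization orders:

```latex
% Permutation language modeling objective from the XLNet paper.
% \mathcal{Z}_T is the set of all permutations of the index sequence [1, ..., T];
% z_t and \mathbf{z}_{<t} denote the t-th element and the first t-1 elements
% of a sampled permutation \mathbf{z}.
\max_{\theta} \;\;
\mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}} \right) \right]
```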

| Model | MNLI | QNLI | QQP  | RTE  | SST-2 | MRPC | CoLA | STS-B |
|-------|------|------|------|------|-------|------|------|-------|
| BERT  | 86.6 | 92.3 | 91.3 | 70.4 | 93.2  | 88.0 | 60.6 | 90.0  |
| XLNet | 89.8 | 93.9 | 91.8 | 83.8 | 95.6  | 89.2 | 63.6 | 91.8  |

Keywords in XLNet

1. How does XLNet benefit from auto-regressive (AR) and auto-encoding (AE) models?
   - Auto-Regressive Model
   - Auto-Encoding Model
2. Permutation Language Modeling with Partial Prediction (a minimal sketch follows this list)
   - Permutation Language Modeling
   - Partial Prediction
3. Two-Stream Self-Attention with Target-Aware Representation
   - Two-Stream Self-Attention
   - Target-Aware Representation
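For item 2 above, the following is a small self-contained sketch, written under simplified assumptions (a single sequence, no `reuse_len`/memory handling); it is not the data pipeline used in `main.py`. It samples a random factorization order, keeps only the last `num_predict` positions of that order as prediction targets (partial prediction), and builds the corresponding permutation mask:

```python
# Illustrative sketch of permutation language modeling with partial prediction.
# Simplified assumptions (not this repository's implementation): one sequence,
# no reuse_len / memory, and a dense attention mask.
import torch

def make_perm_mask(seq_len: int, num_predict: int):
    """Sample a factorization order and derive targets + permutation mask.

    Returns:
        perm_order: random order in which tokens are "generated".
        target_mask: 1.0 for the last `num_predict` positions of the order
                     (only these tokens are predicted -- partial prediction).
        perm_mask: perm_mask[i, j] = 1 means token i CANNOT attend to token j,
                   because j comes at or after i in the factorization order.
    """
    perm_order = torch.randperm(seq_len)              # random factorization order
    # rank[pos] = where this position appears in the sampled order
    rank = torch.empty(seq_len, dtype=torch.long)
    rank[perm_order] = torch.arange(seq_len)

    # Partial prediction: only the last `num_predict` tokens of the order are targets.
    target_mask = (rank >= seq_len - num_predict).float()

    # Token i may attend to token j only if j precedes i in the order (rank[j] < rank[i]).
    perm_mask = (rank.unsqueeze(1) <= rank.unsqueeze(0)).float()
    return perm_order, target_mask, perm_mask

if __name__ == "__main__":
    torch.manual_seed(0)
    order, targets, mask = make_perm_mask(seq_len=8, num_predict=3)
    print("factorization order:", order.tolist())
    print("predicted positions:", targets.nonzero(as_tuple=True)[0].tolist())
    print("perm_mask:\n", mask)
```

This connects to item 3: in XLNet's two-stream self-attention, the content stream is additionally allowed to attend to the current position itself, while the query stream (which makes the prediction) is not; the mask above corresponds to the stricter query-stream view.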

Author