Efficient Transformers with Dynamic Token Pooling

Environment | Data | Training | Repository | Issues | Cite

Paper: Efficient Transformers with Dynamic Token Pooling

Environment:

conda create -n dynamic-pooling python=3.8
conda activate dynamic-pooling
pip install -r requirements.txt

Data:

Training:

C=configs/whitespaces.yaml GPUS= bash scripts/run_exp.sh
- C -> path to the YAML config file
- GPUS -> number of GPUs for a distributed run; when not set, training runs on a single GPU/CPU (see the example below)
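
For example, to launch a distributed run of the same config on 4 GPUs (an illustrative invocation; adjust the config path and GPU count to your setup):

C=configs/whitespaces.yaml GPUS=4 bash scripts/run_exp.sh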

Repository:

This repository is a fork of: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/LanguageModeling/Transformer-XL

We decided to fork from the NVIDIA implementation of Transformer-XL because Transformer-XL is a strong and established baseline in language modelling, and the NVIDIA code is well optimised for current hardware.

Issues:

In case of any questions or problems with the codebase, feel free to raise a GitHub Issue or contact me directly at: piotr.nawrot@ed.ac.uk

Cite:

@misc{nawrot2022dynamic,
      title={Efficient Transformers with Dynamic Token Pooling},
      author={Piotr Nawrot and Jan Chorowski and Adrian Łańcucki and Edoardo M. Ponti},
      year={2022},
      eprint={2211.09761},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}