AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers

This repository contains the code, data, and pretrained models used in AutoMoE (preprint). It builds on the Hardware-Aware Transformers (HAT) repository.

AutoMoE Framework

(Figure: overview of the AutoMoE framework)
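
At a high level, AutoMoE searches over heterogeneous sparsely activated transformer architectures, where design decisions such as the number of experts can vary per layer. As a purely illustrative sketch (the field names and values below are assumptions for exposition, not the repository's actual search-space encoding), a sampled candidate architecture might look like:

```python
# Hypothetical candidate architecture of the kind a NAS over sparse
# (Mixture-of-Experts) transformers could sample. Field names and values are
# illustrative assumptions, not AutoMoE's actual search-space encoding.
candidate_architecture = {
    "encoder": {
        "num_layers": 6,
        "embed_dim": 512,
        "attention_heads": [8, 8, 4, 8, 4, 8],                  # per encoder layer
        "ffn_embed_dim": [2048, 1024, 2048, 3072, 1024, 2048],  # per encoder layer
        "num_experts": [1, 2, 6, 1, 6, 2],                      # 1 = dense FFN, >1 = MoE layer
    },
    "decoder": {
        "num_layers": 4,
        "embed_dim": 512,
        "attention_heads": [8, 4, 8, 8],
        "ffn_embed_dim": [2048, 2048, 1024, 3072],
        "num_experts": [2, 1, 6, 1],
    },
}
```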

AutoMoE Key Results

The following tables show the performance of AutoMoE vs. baselines on standard machine translation benchmarks: WMT'14 En-De, WMT'14 En-Fr, and WMT'19 En-De.

| WMT’14 En-De | Network | # Active Params (M) | Sparsity (%) | FLOPs (G) | BLEU | GPU Hours |
|---|---|---|---|---|---|---|
| Transformer | Dense | 176 | 0 | 10.6 | 28.4 | 184 |
| Evolved Transformer | NAS over Dense | 47 | 0 | 2.9 | 28.2 | 2,192,000 |
| HAT | NAS over Dense | 56 | 0 | 3.5 | 28.2 | 264 |
| AutoMoE (6 Experts) | NAS over Sparse | 45 | 62 | 2.9 | 28.2 | 224 |

| WMT’14 En-Fr | Network | # Active Params (M) | Sparsity (%) | FLOPs (G) | BLEU | GPU Hours |
|---|---|---|---|---|---|---|
| Transformer | Dense | 176 | 0 | 10.6 | 41.2 | 240 |
| Evolved Transformer | NAS over Dense | 175 | 0 | 10.8 | 41.3 | 2,192,000 |
| HAT | NAS over Dense | 57 | 0 | 3.6 | 41.5 | 248 |
| AutoMoE (6 Experts) | NAS over Sparse | 46 | 72 | 2.9 | 41.6 | 236 |
| AutoMoE (16 Experts) | NAS over Sparse | 135 | 65 | 3.0 | 41.9 | 236 |

| WMT’19 En-De | Network | # Active Params (M) | Sparsity (%) | FLOPs (G) | BLEU | GPU Hours |
|---|---|---|---|---|---|---|
| Transformer | Dense | 176 | 0 | 10.6 | 46.1 | 184 |
| HAT | NAS over Dense | 63 | 0 | 4.1 | 45.8 | 264 |
| AutoMoE (2 Experts) | NAS over Sparse | 45 | 41 | 2.8 | 45.5 | 248 |
| AutoMoE (16 Experts) | NAS over Sparse | 69 | 81 | 3.2 | 45.9 | 248 |
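
In the tables above, "# Active Params" counts only the parameters used for a given input token, while "Sparsity" reflects the share of parameters that are not activated per token; with sparsely activated (MoE) layers, total and active parameter counts diverge. The following minimal sketch (with made-up layer sizes, and assuming top-1 routing that sends each token to a single expert) illustrates the relationship for one MoE FFN layer:

```python
# Illustrative only: active vs. total parameters for a single MoE FFN layer
# with top-1 routing. Layer sizes are made up; they are not AutoMoE's searched
# values, and model-level sparsity also depends on the dense parts of the model.

def ffn_expert_params(d_model: int, d_ff: int) -> int:
    """Parameters of one FFN expert: two linear layers with biases."""
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

d_model, d_ff, num_experts = 512, 2048, 6

total_params = num_experts * ffn_expert_params(d_model, d_ff)  # all experts stored in memory
active_params = ffn_expert_params(d_model, d_ff)               # one expert used per token
sparsity = 100 * (1 - active_params / total_params)

print(f"total FFN params:  {total_params:,}")
print(f"active FFN params: {active_params:,}")
print(f"FFN sparsity:      {sparsity:.1f}%")
```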

Quick Setup

(1) Install

Run the following commands to install AutoMoE:

git clone https://github.com/UBC-NLP/AutoMoE.git
cd AutoMoE
pip install --editable .
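
To check that the editable install succeeded, a quick import test can help; this assumes the package exposed is the underlying fairseq fork (as in the HAT codebase this repository builds on):

```python
# Optional sanity check after `pip install --editable .`.
# Assumes the editable install exposes the fairseq package, as in the HAT
# codebase this repository builds on; adjust the import if the package differs.
import torch
import fairseq

print("torch:", torch.__version__)
print("fairseq:", fairseq.__version__)
print("CUDA available:", torch.cuda.is_available())
```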

(2) Prepare Data

Run the following command to download the preprocessed MT data:

bash configs/[task_name]/get_preprocessed.sh

where [task_name] can be wmt14.en-de, wmt14.en-fr, or wmt19.en-de.

(3) Run full AutoMoE pipeline

Run the following commands to start the AutoMoE pipeline:

python generate_script.py --task wmt14.en-de --output_dir /tmp --num_gpus 4 --trial_run 0 --hardware_spec gpu_titanxp --max_experts 6 --frac_experts 1 > automoe.sh
bash automoe.sh

where,

- --task: MT benchmark to run (wmt14.en-de, wmt14.en-fr, or wmt19.en-de)
- --output_dir: directory where checkpoints and other pipeline outputs are written
- --num_gpus: number of GPUs to use
- --trial_run: 1 for a quick trial run, 0 for the full pipeline
- --hardware_spec: target hardware for latency measurement (e.g., gpu_titanxp)
- --max_experts: maximum number of experts per layer in the search space
- --frac_experts: 1 to allow experts with fractional (non-uniform) FFN sizes, 0 for identically sized experts
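
As a usage example, the same pipeline can also be generated and launched from Python instead of the shell; the sketch below simply wraps the two commands above (here with the WMT'14 En-Fr task, all other flag values unchanged):

```python
# Illustrative wrapper around the shell commands above: generate automoe.sh
# with generate_script.py and then run it. Flag values mirror the example
# command, with the task switched to wmt14.en-fr.
import subprocess

with open("automoe.sh", "w") as script:
    subprocess.run(
        ["python", "generate_script.py",
         "--task", "wmt14.en-fr",
         "--output_dir", "/tmp",
         "--num_gpus", "4",
         "--trial_run", "0",
         "--hardware_spec", "gpu_titanxp",
         "--max_experts", "6",
         "--frac_experts", "1"],
        stdout=script, check=True)

subprocess.run(["bash", "automoe.sh"], check=True)
```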

Contact

If you have questions, contact Ganesh (ganeshjwhr@gmail.com) or Subho (Subhabrata.Mukherjee@microsoft.com), and/or create a GitHub issue.

Citation

If you use this code, please cite:

@misc{jawahar2022automoe,
      title={AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers}, 
      author={Ganesh Jawahar and Subhabrata Mukherjee and Xiaodong Liu and Young Jin Kim and Muhammad Abdul-Mageed and Laks V. S. Lakshmanan and Ahmed Hassan Awadallah and Sebastien Bubeck and Jianfeng Gao},
      year={2022},
      eprint={2210.07535},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

See LICENSE.txt for license information.

Acknowledgements

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.