# <img src="assets/icon.png" width="20" height="40" alt="icon" align="center" /> LLaMA Pro: Progressive LLaMA with Block Expansion

<p align="center"> šŸ“ƒ <a href="https://arxiv.org/abs/2401.02415" target="_blank">Paper</a> • šŸ¤— <a href="https://huggingface.co/TencentARC/LLaMA-Pro-8B" target="_blank">Demo & Model</a> </p>
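
To try the released checkpoint locally, it can be loaded with the standard Hugging Face `transformers` causal-LM API. The snippet below is a minimal sketch, assuming `torch` and `accelerate` are installed; the prompt and generation settings are only illustrative.

```python
# Minimal usage sketch for the released checkpoint (illustrative settings only).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TencentARC/LLaMA-Pro-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the `accelerate` package; drop it to load on CPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```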

## News

## šŸ”„ Comprehensive Results

| Model | GSM8k Pass@1 | MATH Pass@1 |
|---|---|---|
| WizardMath-7B | 54.9 | 10.7 |
| LLaMA-2-70B | 56.8 | 13.5 |
| WizardMath-13B | 63.9 | 14.0 |
| MetaMath-7B | 66.5 | 19.8 |
| MetaMath-13B | 72.3 | 22.4 |
| MetaMath-Mistral-7B | 77.7 | 28.2 |
| MetaMath-Llemma-7B | 69.2 | 30.0 |
| šŸ”„ MetaMath-Mistral-Pro | 78.4 | 30.3 |
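
For reference, Pass@1 on GSM8k and MATH is typically reported as single-sample accuracy: each problem is answered once and scored against the reference. A minimal sketch of the metric (answer extraction and normalization are benchmark-specific and omitted; the helper below is hypothetical):

```python
def pass_at_1(predicted_answers, reference_answers):
    """Fraction of problems whose single predicted answer matches the reference."""
    assert len(predicted_answers) == len(reference_answers)
    correct = sum(p == r for p, r in zip(predicted_answers, reference_answers))
    return correct / len(reference_answers)

# Example: 3 of 4 problems answered correctly -> Pass@1 = 0.75
print(pass_at_1(["72", "3/4", "10", "x=2"], ["72", "3/4", "12", "x=2"]))
```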

## Acknowledgement

The instruction-tuning code is based on the official implementation of open-instruct.

Thanks to Hugging Face and wisemodel for hosting our checkpoints.

## Citation

The code and models in this repository are mostly developed for or derived from the paper below. Please cite it if you find the repository helpful.

```bibtex
@article{wu2024llama,
  title={Llama pro: Progressive llama with block expansion},
  author={Wu, Chengyue and Gan, Yukang and Ge, Yixiao and Lu, Zeyu and Wang, Jiahao and Feng, Ye and Luo, Ping and Shan, Ying},
  journal={arXiv preprint arXiv:2401.02415},
  year={2024}
}
```