<div align="center"> <img src="assets/guanaco.svg" width="300"/> <div>&nbsp;</div> </div>


<div align="center">

👋🤗🤗👋 Join our WeChat.

</div>

Easy and Efficient Fine-tuning of LLMs --- simple and efficient large language model training and deployment

<div align="center">

中文 | English

</div>

Introduction

LLamaTuner is an efficient, flexible, and full-featured toolkit for fine-tuning LLMs (Llama 3, Phi-3, Qwen, Mistral, and more).

- Efficient
- Flexible
- Full-featured

Supported Models

| Model | Model size | Default module | Template |
| ----- | ---------- | -------------- | -------- |
| Baichuan | 7B/13B | W_pack | baichuan |
| Baichuan2 | 7B/13B | W_pack | baichuan2 |
| BLOOM | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value | - |
| BLOOMZ | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value | - |
| ChatGLM3 | 6B | query_key_value | chatglm3 |
| Command-R | 35B/104B | q_proj,v_proj | cohere |
| DeepSeek (MoE) | 7B/16B/67B/236B | q_proj,v_proj | deepseek |
| Falcon | 7B/11B/40B/180B | query_key_value | falcon |
| Gemma/CodeGemma | 2B/7B | q_proj,v_proj | gemma |
| InternLM2 | 7B/20B | wqkv | intern2 |
| LLaMA | 7B/13B/33B/65B | q_proj,v_proj | - |
| LLaMA-2 | 7B/13B/70B | q_proj,v_proj | llama2 |
| LLaMA-3 | 8B/70B | q_proj,v_proj | llama3 |
| LLaVA-1.5 | 7B/13B | q_proj,v_proj | vicuna |
| Mistral/Mixtral | 7B/8x7B/8x22B | q_proj,v_proj | mistral |
| OLMo | 1B/7B | q_proj,v_proj | - |
| PaliGemma | 3B | q_proj,v_proj | gemma |
| Phi-1.5/2 | 1.3B/2.7B | q_proj,v_proj | - |
| Phi-3 | 3.8B | qkv_proj | phi |
| Qwen | 1.8B/7B/14B/72B | c_attn | qwen |
| Qwen1.5 (Code/MoE) | 0.5B/1.8B/4B/7B/14B/32B/72B/110B | q_proj,v_proj | qwen |
| StarCoder2 | 3B/7B/15B | q_proj,v_proj | - |
| XVERSE | 7B/13B/65B | q_proj,v_proj | xverse |
| Yi (1/1.5) | 6B/9B/34B | q_proj,v_proj | yi |
| Yi-VL | 6B/34B | q_proj,v_proj | yi_vl |
| Yuan | 2B/51B/102B | q_proj,v_proj | yuan |
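The Default module column lists the attention projection layers that LoRA adapters are attached to by default. As a rough sketch of how those module names are used with PEFT (not LLamaTuner's internal configuration; the model id and hyperparameters below are placeholders), a LoRA setup for LLaMA-3 might look like this:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder base model; any causal LM from the table above works (this id is gated).
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# For LLaMA-3 the default modules in the table are q_proj and v_proj.
lora_config = LoraConfig(
    r=16,                                 # LoRA rank (placeholder value)
    lora_alpha=32,                        # scaling factor (placeholder value)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # taken from the Default module column
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```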

Supported Training Approaches

| Approach | Full-tuning | Freeze-tuning | LoRA | QLoRA |
| -------- | ----------- | ------------- | ---- | ----- |
| Pre-Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Supervised Fine-Tuning | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Reward Modeling | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| PPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| DPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| KTO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
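As an illustration of one cell in this matrix, a Supervised Fine-Tuning run with LoRA built on trl's SFTTrainer could look roughly like the sketch below. This is not LLamaTuner's own entry point (those are listed under Getting Started); the model id, dataset id, and hyperparameters are placeholders.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Example instruction dataset with a "text" column (see Supported Datasets below).
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

trainer = SFTTrainer(
    model="huggyllama/llama-7b",  # placeholder base model id
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=2, num_train_epochs=1),
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=1024,
    # Supplying a peft_config selects the LoRA column of the matrix above.
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"),
)
trainer.train()
```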

Supported Datasets

At present, we support the following datasets, most of which are available through the Hugging Face datasets library.

<details><summary>Supervised fine-tuning datasets</summary></details>

<details><summary>Preference datasets</summary></details>

Please refer to data/README.md to learn how to use these datasets. If you want to explore more datasets, please refer to awesome-instruction-datasets. Some datasets require approval before they can be used, so we recommend logging in with your Hugging Face account using the following commands:

```shell
pip install --upgrade huggingface_hub
huggingface-cli login
```
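Once logged in, any of the supported datasets can be pulled directly with the datasets library. A minimal sketch, using the openassistant-guanaco dataset that also appears in the Model Zoo below (the dataset id is an example):

```python
from datasets import load_dataset

# Loads a supported instruction dataset from the Hugging Face Hub.
# Gated datasets additionally require the `huggingface-cli login` step above.
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
print(dataset[0]["text"])  # this dataset stores each sample in a "text" field
```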

Data Preprocessing

We provide a number of data preprocessing tools in the data folder. These tools are intended to be a starting point for further research and development.
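A common preprocessing step is mapping raw records into the Alpaca-style instruction/input/output format. The sketch below illustrates the idea with a hypothetical to_alpaca helper; it is not one of the shipped tools, and the field names of the raw records are assumptions.

```python
import json

def to_alpaca(record: dict) -> dict:
    """Map a raw {question, context, answer} record to Alpaca-style fields.

    Hypothetical helper for illustration; the tools in the data folder
    define their own dataset-specific mappings.
    """
    return {
        "instruction": record["question"],
        "input": record.get("context", ""),
        "output": record["answer"],
    }

raw = [{"question": "What is QLoRA?", "answer": "A 4-bit quantized LoRA fine-tuning method."}]
with open("alpaca_formatted.json", "w", encoding="utf-8") as f:
    json.dump([to_alpaca(r) for r in raw], f, ensure_ascii=False, indent=2)
```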

Model Zoo

We provide a number of models on the Hugging Face Hub. These models are trained with QLoRA and can be used for inference and further fine-tuning; a minimal inference sketch follows the table.

| Base Model | Adapter | Instruct Datasets | Train Script | Log | Model on Huggingface |
| ---------- | ------- | ----------------- | ------------ | --- | -------------------- |
| llama-7b | FullFinetune | - | - | - | - |
| llama-7b | QLoRA | openassistant-guanaco | finetune_lamma7b | wandb log | GaussianTech/llama-7b-sft |
| llama-7b | QLoRA | OL-CC | finetune_lamma7b | - | - |
| baichuan7b | QLoRA | openassistant-guanaco | finetune_baichuan7b | wandb log | GaussianTech/baichuan-7b-sft |
| baichuan7b | QLoRA | OL-CC | finetune_baichuan7b | wandb log | - |
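A minimal inference sketch for one of these adapters, assuming GaussianTech/llama-7b-sft from the table and huggyllama/llama-7b as the base checkpoint (the base model id and prompt template are assumptions):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "huggyllama/llama-7b"           # assumed base checkpoint for the adapter
adapter_id = "GaussianTech/llama-7b-sft"  # QLoRA adapter from the table above

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the fine-tuned LoRA weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Guanaco-style prompt; adjust to the template your adapter was trained with.
inputs = tokenizer("### Human: What is QLoRA?\n### Assistant:", return_tensors="pt").to(base_model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```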

Requirements

| Mandatory | Minimum | Recommend |
| --------- | ------- | --------- |
| python | 3.8 | 3.10 |
| torch | 1.13.1 | 2.2.0 |
| transformers | 4.37.2 | 4.41.0 |
| datasets | 2.14.3 | 2.19.1 |
| accelerate | 0.27.2 | 0.30.1 |
| peft | 0.9.0 | 0.11.1 |
| trl | 0.8.2 | 0.8.6 |

| Optional | Minimum | Recommend |
| -------- | ------- | --------- |
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.14.0 |
| bitsandbytes | 0.39.0 | 0.43.1 |
| vllm | 0.4.0 | 0.4.2 |
| flash-attn | 2.3.0 | 2.5.8 |

Hardware Requirements

| Method | Bits | 7B | 13B | 30B | 70B | 110B | 8x7B | 8x22B |
| ------ | ---- | -- | --- | --- | --- | ---- | ---- | ----- |
| Full | AMP | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
| Full | 16 | 60GB | 120GB | 300GB | 600GB | 900GB | 400GB | 1200GB |
| Freeze | 16 | 20GB | 40GB | 80GB | 200GB | 360GB | 160GB | 400GB |
| LoRA/GaLore/BAdam | 16 | 16GB | 32GB | 64GB | 160GB | 240GB | 120GB | 320GB |
| QLoRA | 8 | 10GB | 20GB | 40GB | 80GB | 140GB | 60GB | 160GB |
| QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | 72GB | 30GB | 96GB |
| QLoRA | 2 | 4GB | 8GB | 16GB | 24GB | 48GB | 18GB | 48GB |

\* estimated

Getting Started

Clone the code

Clone this repository and navigate to the LLamaTuner folder:

```shell
git clone https://github.com/jianzhnie/LLamaTuner.git
cd LLamaTuner
```

Training Scripts

| Main function | Usage | Scripts |
| ------------- | ----- | ------- |
| train.py | Full finetune LLMs on SFT datasets | full_finetune |
| train_lora.py | Finetune LLMs with LoRA (Low-Rank Adaptation of Large Language Models) | lora_finetune |
| train_qlora.py | Finetune LLMs with QLoRA (Efficient Finetuning of Quantized LLMs) | qlora_finetune |

QLoRA int4 Finetune

The train_qlora.py script is a starting point for finetuning and inference on various datasets. A basic command for finetuning a baseline model on the Alpaca dataset:

```shell
python train_qlora.py --model_name_or_path <path_or_name>
```

For models larger than 13B, we recommend adjusting the learning rate:

```shell
python train_qlora.py --learning_rate 0.0001 --model_name_or_path <path_or_name>
```
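For reference, the core of QLoRA 4-bit finetuning with the Hugging Face stack looks roughly like the sketch below. This is a minimal illustration of the technique, not the exact contents of train_qlora.py; the model id and hyperparameters are placeholders.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 1. Load the frozen base model in 4-bit NF4 with double quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # see the fp16 compute issue under Known Issues
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",  # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)

# 2. Prepare the quantized model for training (gradient checkpointing, layer-norm casting, ...).
model = prepare_model_for_kbit_training(model)

# 3. Attach trainable LoRA adapters; only these weights receive gradients.
lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# 4. Train as usual, e.g. with transformers.Trainer or trl's SFTTrainer.
```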

To find more scripts for finetuning and inference, please refer to the scripts folder.

Known Issues and Limitations

Here is a list of known issues and bugs. If your issue is not reported here, please open a new issue and describe the problem.

1. 4-bit inference is slow. Currently, our 4-bit inference implementation is not yet integrated with the 4-bit matrix multiplication.
2. Resuming a LoRA training run with the Trainer currently fails with an error.
3. Currently, using `bnb_4bit_compute_type='fp16'` can lead to instabilities. For 7B LLaMA, only 80% of finetuning runs complete without error. We have solutions, but they are not yet integrated into bitsandbytes.
4. Make sure that `tokenizer.bos_token_id = 1` to avoid generation issues; the snippet below shows a quick check.
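A quick check for item 4, as a minimal sketch (the tokenizer path is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # placeholder path

# LLaMA tokenizers are expected to use BOS id 1; warn if a custom tokenizer differs.
if tokenizer.bos_token_id != 1:
    print(f"Unexpected bos_token_id={tokenizer.bos_token_id}; generation may misbehave.")
```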

License

LLamaTuner is released under the Apache 2.0 license.

Acknowledgements

We thank the Hugging Face team, in particular Younes Belkada, for their support in integrating QLoRA with the PEFT and transformers libraries.

We appreciate the work by many open-source contributors, especially:

Some LLM fine-tuning repos:

Citation

If you use the data or code in this repository, please cite it as follows:

```bibtex
@misc{Chinese-Guanaco,
  author = {jianzhnie},
  title = {LLamaTuner: Easy and Efficient Fine-tuning LLMs},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/jianzhnie/LLamaTuner}},
}
```