<p align="center" width="100%"> <a target="_blank"><img src="assets/logo.png" alt="ExpertLLaMA" style="width: 50%; min-width: 300px; display: block; margin: auto;"></a> </p>

ExpertLLaMA:<br/>Answering Instructions Like an Expert

This repo introduces ExpertLLaMA, a solution for producing high-quality, elaborate, expert-like responses by augmenting vanilla instructions with specialized Expert Identity descriptions. This repo contains the expert-augmented instruction-following data, the training recipe, and the ExpertLLaMA model weights (released as deltas).

Check our paper, ExpertPrompting: Instructing Large Language Models to be Distinguished Experts, for further details.

Usage and License Notices: The data is intended and licensed for research use only. The dataset is released under CC BY-NC 4.0 (allowing only non-commercial use), and models trained using the dataset should not be used outside of research purposes.

News

[2023.05.31] Released model weights; try the live demo on Hugging Face Spaces.
[2023.05.23] Initial release of expert data, evaluation, paper, etc.

Results

We release ExpertLLaMA, which achieves 96% of ChatGPT's capability and surpasses competitive baselines including Vicuna and LLaMA-GPT4. The following results are produced with the GPT-4-based evaluation protocol following Vicuna.

All Models Compared Against ChatGPT, ExpertLLaMA Ranks #2

<p align="center" width="100%"> <a target="_blank"><img src="assets/ChatGPT_VS_Others.png" alt="ExpertLLaMA" style="width: 80%; min-width: 150px; display: block; margin: auto;"></a> </p>

ExpertLLaMA vs. Others

<p align="center" width="100%"> <a target="_blank"><img src="assets/ExpertLLaMA_VS_Others.png" alt="ExpertLLaMA" style="width: 80%; min-width: 150px; display: block; margin: auto;"></a> </p>
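For reference, the Vicuna-style GPT-4 evaluation asks GPT-4 to score a pair of answers to the same question. A rough sketch of a single comparison is shown below; the judge prompt is paraphrased and should be treated as an assumption rather than the exact prompt from the Vicuna repository.

```python
# Sketch of a Vicuna-style GPT-4 pairwise comparison (assumed judge prompt).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge_pair(question: str, answer_a: str, answer_b: str) -> str:
    """Ask GPT-4 to rate two answers to the same question on a 1-10 scale."""
    prompt = (
        "You are a helpful and precise assistant for checking the quality of answers.\n"
        f"[Question]\n{question}\n\n"
        f"[Assistant 1]\n{answer_a}\n\n"
        f"[Assistant 2]\n{answer_b}\n\n"
        "Rate each assistant on a scale of 1 to 10. Output the two scores on the "
        "first line, then explain your judgement."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```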

Introduction

ExpertPrompting

How can we elicit the full potential of a generative agent like ChatGPT to produce high-quality instruction-following data? We propose asking the agent to behave like an expert. The key to our approach lies in the customized descriptions that adaptively depict the best-suited expert for each specialized instruction.

We use In-Context Learning to automatically write a customized expert identity for each instruction and find the quality quite satisfactory. We then prepend the corresponding expert identity to each instruction to produce augmented instruction-following data. We refer to the overall framework as ExpertPrompting; please find more details in our paper.
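As a rough illustration, the two-step procedure can be sketched as follows. The prompt wording, the few-shot exemplars, and the openai client usage are assumptions for illustration only, not the exact prompts from the paper.

```python
# Minimal sketch of ExpertPrompting (assumed prompts, not the ones used for the release).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical in-context exemplars; the paper uses a small set of examples like these.
FEW_SHOT_EXEMPLARS = (
    "Instruction: Explain how vaccines work.\n"
    "Expert Identity: You are an immunologist with years of clinical and research "
    "experience, skilled at explaining immune mechanisms to a general audience.\n\n"
)

def write_expert_identity(instruction: str) -> str:
    """Step 1: use in-context learning to draft a customized expert identity."""
    prompt = f"{FEW_SHOT_EXEMPLARS}Instruction: {instruction}\nExpert Identity:"
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

def expert_answer(instruction: str) -> str:
    """Step 2: prepend the expert identity and ask for the answer."""
    identity = write_expert_identity(instruction)
    augmented = (
        f"{identity}\n\nNow answer the following instruction as this expert.\n{instruction}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": augmented}],
    )
    return resp.choices[0].message.content.strip()
```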

ExpertLLaMA

We apply the proposed method to the 52k Alpaca instructions [3] using gpt-3.5-turbo. Note that although the released data are produced with gpt-3.5-turbo, the underlying procedure can be applied to other LLMs and scenarios as well. In some cases the response repeats the identity by saying "As a ...", and we remove these expressions from the answer with a simple rule-based strategy. A randomly sampled expert identity and its effect are illustrated below:

<p align="center" width="100%"> <a target="_blank"><img src="assets/expertidentity_illustration.png" alt="ExpertLLaMA" style="width: 80%; min-width: 150px; display: block; margin: auto;"></a> </p>
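The rule-based cleanup mentioned above can be as simple as a single regex pass. The exact patterns used for the released data are not specified, so the one below is only an assumption.

```python
import re

# Hypothetical rule: drop a leading "As a ..." / "As an ..." clause up to the first
# comma, then re-capitalize the remaining answer.
_AS_A_PREFIX = re.compile(r"^\s*As an? [^,]+,\s*", flags=re.IGNORECASE)

def strip_identity_prefix(answer: str) -> str:
    cleaned = _AS_A_PREFIX.sub("", answer, count=1)
    return cleaned[:1].upper() + cleaned[1:] if cleaned else cleaned

print(strip_identity_prefix("As a seasoned nutritionist, fiber supports healthy digestion."))
# -> "Fiber supports healthy digestion."
```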

We train ExpertLLaMA on such augmented instruction-following responses, starting from LLaMA 7B [1]. It exhibits improved capabilities under the Vicuna evaluation protocol while being very cost-effective and easy to implement.

Data Release

All data are formatted as jsonl, where each line is an instance corresponding to the same instruction from the original Alpaca data; only the answer is produced with different methods. All data files are placed in the ./data/ directory, and a minimal loading sketch follows the file list below.

expertllama.jsonl

alpaca_gpt-3.5.jsonl

alpaca_gpt-3.5_plus.jsonl

template.py
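A minimal loading sketch for the jsonl files is given here. Field names such as "instruction" and "answer" are assumptions; check the actual files and template.py for the exact schema.

```python
import json
from pathlib import Path

def load_jsonl(path: str) -> list[dict]:
    """Read one JSON object per line."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

expertllama = load_jsonl("./data/expertllama.jsonl")
print(len(expertllama))       # number of instances (~52k, one per Alpaca instruction)
print(expertllama[0].keys())  # inspect the actual field names
```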

Training

ExpertLLaMA is trained following the Alpaca recipe with identical hyperparameter settings. With 4 GPUs, a per-device batch size of 4, and 8 gradient-accumulation steps, the effective batch size is 4 × 4 × 8 = 128, matching Alpaca.

torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
    --model_name_or_path <your_path_to_hf_converted_llama_ckpt_and_tokenizer> \
    --data_path ./data/expertllama.json \
    --bf16 True \
    --output_dir <your_output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True

Recovering ExpertLLaMA Weights

To comply with the LLaMA model license, we only release the delta weights; you should add our delta to the original LLaMA weights to obtain the ExpertLLaMA weights. The process and script are adapted from Vicuna.

python3 apply_delta.py --base-model-path {your_base_model_path} --target-model-path {your_target_model_path} --delta-path {downloaded_delta_weights}
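Conceptually, the delta-recovery step does the following. This is a simplified sketch assuming HF-format checkpoints; the actual apply_delta.py adapted from Vicuna additionally handles details such as sharding and dtype.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def apply_delta(base_path: str, delta_path: str, target_path: str) -> None:
    """Recover target weights as: base LLaMA weights + released delta."""
    base = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.float16)
    delta = AutoModelForCausalLM.from_pretrained(delta_path, torch_dtype=torch.float16)

    base_sd = base.state_dict()
    delta_sd = delta.state_dict()
    with torch.no_grad():
        for name, param in base_sd.items():
            # Add the delta parameter-wise, in place.
            param.add_(delta_sd[name])

    base.save_pretrained(target_path)
    AutoTokenizer.from_pretrained(delta_path).save_pretrained(target_path)
```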

You can now try ExpertLLaMA locally by running:

python3 gen_demo.py --expertllama_path {your_target_model_path}

Inference consumes approximately 15 GB of memory using fp16.
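If you prefer to load the recovered checkpoint directly instead of using gen_demo.py, a minimal fp16 generation sketch looks like this. The Alpaca-style prompt template below is an assumption; see template.py and gen_demo.py for the exact format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "<your_target_model_path>"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

# Assumed Alpaca-style prompt template.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhat are the health benefits of green tea?\n\n### Response:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```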

Related Works, Citation and Acknowledgements

Related Works

[1] LLaMA: Open and Efficient Foundation Language Models. Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. https://arxiv.org/abs/2302.13971v1
[2] Self-Instruct: Aligning Language Model with Self Generated Instructions. Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi. https://arxiv.org/abs/2212.10560
[3] Stanford Alpaca: An Instruction-following LLaMA Model. Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, et al. GitHub repository, 2023. https://github.com/tatsu-lab/stanford_alpaca
[4] Instruction Tuning with GPT-4. Baolin Peng, Chunyuan Li, Pengcheng He, et al. 2023. https://arxiv.org/abs/2304.03277
[5] Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality. Wei-Lin Chiang, Zhuohan Li, Zi Lin, et al. 2023. https://lmsys.org/blog/2023-03-30-vicuna/

Citation

If you find the data or model useful, please cite this repo as follows.

@misc{xu2023expertprompting,
      title={ExpertPrompting: Instructing Large Language Models to be Distinguished Experts}, 
      author={Benfeng Xu and An Yang and Junyang Lin and Quan Wang and Chang Zhou and Yongdong Zhang and Zhendong Mao},
      year={2023},
      eprint={2305.14688},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgements

This repo heavily references the original Stanford Alpaca repo.