URIAL: Untuned LLMs with Restyled In-context Alignment (ICLR'24: Rethinking Alignment via ICL)

This is part of the Rethinking Alignment (Re-Align) project by AI2 Mosaic.

šŸ“‘ Paper: "The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning" (ICLR 2024).

šŸ›œ Website: https://allenai.github.io/re-align/.

šŸ¤— Demo: BaseChat: https://huggingface.co/spaces/allenai/BaseChat_URIAL.

URIAL (Untuned LLMs with Restyled In-context ALignment) is a simple, tuning-free alignment method. It achieves effective alignment purely through in-context learning (ICL), requiring as few as three constant stylistic examples and a system prompt. URIAL is a strong baseline for LLM alignment and performs comparably to fine-tuning-based alignment. Beyond that, it can also be used to study the science of LLMs, helping us understand alignment in a more controlled and interpretable manner.

<!-- align center the image --> <img src="docs/intro.png" width="80%" style="margin: auto;" >

Installation

```bash
conda create -n urial python=3.10
conda activate urial
pip install vllm
pip install -r requirements.new.txt
```
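
To sanity-check the environment, you can verify that vLLM imports correctly (an optional check, not part of the original setup steps):

```bash
# Optional: confirm that vLLM is importable in the new environment.
python -c "import vllm; print(vllm.__version__)"
```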

URIAL Inference

An example script for running Mistral-7B (base) with URIAL prompts on AlpacaEval:

```bash
urial="inst_1k_v4" # URIAL prompt name --> `urial_prompts/{urial}.txt`
output_dir="result_dirs/alpaca_eval/vllm_urial=${urial}/"
CUDA_VISIBLE_DEVICES=0 python src/unified_infer.py \
    --urial $urial \
    --engine vllm \
    --model_name "mistralai/Mistral-7B-v0.1" \
    --tensor_parallel_size 1 \
    --dtype bfloat16 \
    --data_name "alpaca_eval" \
    --top_p 1.0 --temperature 0.3 --repetition_penalty 1.1 \
    --batch_size 16 --max_tokens 2048 \
    --output_folder $output_dir/
```

For more details, please refer to `URIAL/src/unified_infer.py`. Note that you can use the same method to run inference with aligned LLMs (simply omit `--urial`; see the sketch below) and on other datasets. To add your own data or models, customize `URIAL/src/unified_utils.py`.
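
For reference, a variant of the command above for an aligned (instruction-tuned) model simply drops `--urial`. This is a sketch that reuses the same flags and assumes an illustrative output folder name, so adjust it for your setup:

```bash
# Sketch: the same pipeline without URIAL prompts, using an instruction-tuned model.
# Flags other than the model name (and the omitted --urial) are copied from the
# base-model example above; the output folder name is just an example.
output_dir="result_dirs/alpaca_eval/vllm_aligned/"
CUDA_VISIBLE_DEVICES=0 python src/unified_infer.py \
    --engine vllm \
    --model_name "mistralai/Mistral-7B-Instruct-v0.1" \
    --tensor_parallel_size 1 \
    --dtype bfloat16 \
    --data_name "alpaca_eval" \
    --top_p 1.0 --temperature 0.3 --repetition_penalty 1.1 \
    --batch_size 16 --max_tokens 2048 \
    --output_folder $output_dir/
```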


URIAL: ICL with constant prompts

<details> <summary> šŸ–¼ļø Click here to see a figure for the illustration of URIAL and other tuning-free Alignment methods.</summary> <img src="https://allenai.github.io/re-align/images/urial_methods.png" style="margin: auto;" width="80%"> </details>

Versions

As discussed in the paper, a URIAL prompt consists of K-shot constant stylistic in-context examples and a system prompt. The folder `urial_prompts` contains:

Suggested versions:

<details><summary> Previous versions (used for the experiments in the arXiv version).</summary> </details>
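
To see exactly what a URIAL prompt contains, you can inspect one of the prompt files directly (`inst_1k_v4` is the version used in the inference example above):

```bash
# Print the beginning of a URIAL prompt file to view the system-style preamble
# and the constant stylistic in-context examples it contains.
head -n 40 urial_prompts/inst_1k_v4.txt
```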

Evaluation

AlpacaEval (fine-grained pairwise evaluation)

<img src="docs/alpaca_eval-pairwise.png" style="margin: auto;" width="100%"> <details><summary> Show Tables</summary>

mistral-urial (#char=1105.7) VS Mistral-7B-Instruct-v0.1 (#char=1074.1) ā¬‡ļø

| model | helpfulness | factuality | depth | engagement | clarity | safety |
|---|---|---|---|---|---|---|
| mistral-urial Win | 31.93 | 12.30 | 42.61 | 35.90 | 22.36 | 1.12 |
| mistral-urial Tie | 38.88 | 73.04 | 19.63 | 31.68 | 60.62 | 98.39 |
| mistral-urial Lose | 29.19 | 14.66 | 37.76 | 32.42 | 17.02 | 0.50 |

Llama-2-7b-urial (#char=1236.1) VS Llama-2-7b-chat-hf (#char=1455.7) ā¬‡ļø

| model | helpfulness | factuality | depth | engagement | clarity | safety |
|---|---|---|---|---|---|---|
| Llama-2-7b-urial Win | 42.11 | 15.78 | 48.32 | 42.86 | 34.53 | 1.61 |
| Llama-2-7b-urial Tie | 20.87 | 66.58 | 10.68 | 24.10 | 40.75 | 95.90 |
| Llama-2-7b-urial Lose | 37.02 | 17.64 | 40.99 | 33.04 | 24.72 | 2.48 |

Llama-2-70b-urial (#char=1086.5) VS Llama-2-70b-chat-hf (#char=1524.0) ā¬‡ļø

| model | helpfulness | factuality | depth | engagement | clarity | safety |
|---|---|---|---|---|---|---|
| Llama-2-70b-urial Win | 35.28 | 9.44 | 48.20 | 36.02 | 19.75 | 0.62 |
| Llama-2-70b-urial Tie | 42.24 | 81.12 | 15.53 | 39.38 | 68.57 | 97.89 |
| Llama-2-70b-urial Lose | 22.48 | 9.44 | 36.27 | 24.60 | 11.68 | 1.49 |
</details>

Scripts for URIAL/Aligned inference: run_scripts/alpaca_eval

Evaluation:

MT-Bench

<img src="docs/mtbench.png" style="margin: auto;" width="100%">

URIAL MT-Bench scores (base LLMs + the same URIAL prompts)

How to run: run_scripts/mt-bench/README.md

| model | Turn 1 | Turn 2 | Overall |
|---|---|---|---|
| openai/gpt-4 | 8.96 | 9.03 | 8.99 |
| openai/gpt-3.5-turbo | 8.07 | 7.81 | 7.94 |
| Base LLM + URIAL (3-shot ICL) ā¬‡ļø | --- | --- | --- |
| meta-llama/Llama-2-70b-hf | 7.61 | 6.61 | 7.11 |
| mistralai/Mixtral-8x7B-v0.1 | 7.69 | 6.19 | 6.94 |
| mistralai/Mistral-7b-v0.1 | 7.49 | 5.86 | 6.67 |
| 01-ai/Yi-34B | 7.19 | 6.16 | 6.67 |
| google/gemma-7b | 6.97 | 5.04 | 6.00 |
| microsoft/phi-2 (2.7B) | 7.04 | 4.66 | 5.85 |
| meta-llama/Llama-2-13b-hf | 6.27 | 4.41 | 5.34 |
| 01-ai/Yi-6B | 5.96 | 3.99 | 4.97 |
| meta-llama/Llama-2-7b-hf | 5.75 | 3.91 | 4.83 |
| google/gemma-2b | 5.08 | 2.86 | 3.97 |
| allenai/OLMo-7B | 3.95 | 2.86 | 3.41 |

Just-Eval

Please find more details about our evaluation here: https://github.com/Re-Align/just-eval.

<details><summary> show more (the content below is outdated and will be updated soon) </summary>

Installation of Just-Eval

```bash
pip install git+https://github.com/Re-Align/just-eval.git
export OPENAI_API_KEY=<your secret key>
```

Reformatting output data

For example, if the output file is `result_dirs/urial/inst_1k/Mistral-7B-v0.1.json`, run the following command to reformat it into `result_dirs/urial/inst_1k/Mistral-7B-v0.1.to_eval.json`.

```bash
python src/scripts/reformat.py result_dirs/urial/inst_1k/Mistral-7B-v0.1.json
```

Run Scoring

```bash
to_eval_file="result_dirs/urial/inst_1k/Mistral-7B-v0.1.to_eval.json"
run_name="Mistral-URIAL"

# GPT-4 for the first five aspects on 0-800 examples
just_eval \
    --mode "score_multi" \
    --model "gpt-4-0314" \
    --start_idx 0 \
    --end_idx 800 \
    --first_file $to_eval_file \
    --output_file "result_dirs/just-eval_results/${run_name}.score_multi.gpt-4.json"

# GPT-3.5-turbo for the safety aspect on 800-1000 examples
just_eval \
    --mode "score_safety" \
    --model "gpt-3.5-turbo-0613" \
    --first_file $to_eval_file \
    --start_idx 800 --end_idx 1000 \
    --output_file "result_dirs/just-eval_results/${run_name}.score_safety.chatgpt.json"
```
</details>

Citation

```bibtex
@inproceedings{
    Lin2024ReAlign,
    title={The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning},
    author={Bill Yuchen Lin and Abhilasha Ravichander and Ximing Lu and Nouha Dziri and Melanie Sclar and Khyathi Chandu and Chandra Bhagavatula and Yejin Choi},
    booktitle={International Conference on Learning Representations},
    year={2024},
    url={https://arxiv.org/abs/2312.01552}
}
```