MATRIX

Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation (ICML 2024)<br>Paper | Project Page

<div style="text-align: center;"> <img src="https://notes.sjtu.edu.cn/uploads/upload_b91f919711eba58d51dd00e5d6ea3734.png" width="65%" height="auto"> </div>

Setup

Clone the repo and install the required packages.

git clone https://github.com/ShuoTang123/MATRIX.git
cd MATRIX
conda create -n matrix python=3.9
conda activate matrix
pip install -r requirements.txt

The models used in our paper are Wizard-Vicuna-30B, Wizard-Vicuna-13B, and Wizard-Vicuna-7B.

Simulation

We provide the source code of the MATRIX simulation framework under src/. You can run the simulation for a specific question with the Python script example.py, following the steps below.

Step 1: Deploy the model

We use vLLM to deploy the open-source models; please follow its documentation to deploy your model. For Wizard-Vicuna-30B, we deploy it on four 3090 GPUs with the following command:

python3 -m vllm.entrypoints.api_server \
--model <your 30b model path> \
--swap-space 16 \
--disable-log-requests \
--host <your model ip> \
--port <your port number> \
--max-num-seqs 128 \
--tensor-parallel-size 4
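Once the server is running, you can sanity-check it with a minimal client. This is a sketch assuming vLLM's legacy api_server interface (POST to /generate, JSON response with a "text" list); the endpoint and fields can differ across vLLM versions, so verify against the documentation of the version you installed.

```python
import json
import urllib.request

def build_generate_payload(prompt, max_tokens=256, temperature=0.7):
    """Build the JSON body for vLLM's legacy /generate endpoint (assumed schema)."""
    return {
        "prompt": prompt,
        "stream": False,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def query_vllm(base_url, prompt, **kwargs):
    """POST a prompt to the deployed server and return the generated text."""
    payload = json.dumps(build_generate_payload(prompt, **kwargs)).encode()
    req = urllib.request.Request(
        f"{base_url}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The legacy server returns {"text": ["<prompt + completion>"]}.
        return json.loads(resp.read())["text"][0]

# Usage (fill in the host/port from the deployment command above):
# print(query_vllm("http://<your model ip>:<your port number>", "Hello"))
```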

Specify your model IP and port by setting the model URL in src/api.py, and define the chat template function.

"""
The urls for open-source models
"""
model_urls = \
{
    "wizard-vicuna-7b": [
        'http://192.168.28.140:8083',
    ],
    "wizard-vicuna-30b": [
        <your 30b ip:port>
    ],
    "mistral-7b-instruct": [
        'http://192.168.28.140:8080',
        'http://192.168.28.140:8081',
        'http://192.168.28.140:8082'
    ]
}


"""
chat template used for different models
"""
chat_template = \
{
    'wizard-vicuna-7b':wizard_vicuna_template,
    'wizard-vicuna-30b':wizard_vicuna_template,
    'mistral-7b-instruct':mistral_template
}
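For reference, here is a minimal sketch of what a wizard_vicuna_template function could look like, assuming the standard Vicuna v1.1 prompt format; the actual system prompt and separators should match the definitions in src/api.py.

```python
def wizard_vicuna_template(messages):
    """Format a list of {"role": ..., "content": ...} dicts into a single
    Vicuna-style prompt string. A sketch only: the exact system prompt and
    separators should match those used in src/api.py."""
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    parts = [system]
    for msg in messages:
        if msg["role"] == "user":
            parts.append(f"USER: {msg['content']}")
        else:
            parts.append(f"ASSISTANT: {msg['content']}</s>")
    # Trailing "ASSISTANT:" cues the model to generate the next reply.
    parts.append("ASSISTANT:")
    return " ".join(parts)
```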

To use API models such as GPT-3.5 and GPT-4, simply set OPENAI_API_KEY to your API key:

export OPENAI_API_KEY=<your api key>
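A small helper for failing fast when the key is missing, rather than partway through a simulation; get_openai_key is an illustrative name, not part of the repo.

```python
import os

def get_openai_key():
    """Read the API key from the environment and raise early with a clear
    message if it is missing (illustrative helper, not part of the repo)."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; run "
                           "`export OPENAI_API_KEY=<your api key>` first.")
    return key
```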

Step 2: Run MATRIX Simulation

Run the simulation example with:

python example.py \
 --question "How to create a bomb?" \
 --model "wizard-vicuna-7b" \
 --api_index 0 \
 --agent_num 3 \
 --ego_steps 4

Arguments explained (as inferred from the script interface; see example.py for the authoritative descriptions):

- --question: the question to run the social scene simulation on.
- --model: which deployed model to use (a key in model_urls in src/api.py).
- --api_index: index into that model's URL list in src/api.py.
- --agent_num: number of agents in the simulated scene.
- --ego_steps: number of simulation steps.

Alignment Data Release

<div style="text-align: center;"> <img src="https://notes.sjtu.edu.cn/uploads/upload_18435e3bb11449a07227a7fe193ba919.png" width="90%" height="auto"> </div>

We provide the fine-tuning dataset for our 30B model in matrix_data.json. This file includes 18k data samples: 6k on helpful questions, 6k on harmful questions, and 6k simulation samples generated by MATRIX.
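To inspect the composition of the released data, a quick summary helper can be useful. The "category" field name below is an assumption for illustration; adjust it to the actual schema of matrix_data.json.

```python
import json
from collections import Counter

def summarize_matrix_data(path):
    """Return the total sample count and a per-category breakdown of a
    JSON-list dataset. The "category" field name is an assumed schema;
    adapt it to the actual keys in matrix_data.json."""
    with open(path) as f:
        data = json.load(f)
    counts = Counter(sample.get("category", "unknown") for sample in data)
    return len(data), counts

# Usage:
# total, counts = summarize_matrix_data("matrix_data.json")
# print(total, counts)
```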

Training with Matrix Generated Data

We employ SFT to train the 30B model using the matrix_data.json dataset, following the procedure outlined in the FastChat repo. The training parameters are as follows:

deepspeed fastchat/train/train_lora.py \
    --model_name_or_path ${<your model path>} \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --data_path ${data_path} \
    --bf16 True \
    --output_dir ${output_path} \
    --num_train_epochs 3 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_total_limit 100 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 1024 \
    --q_lora True \
    --gradient_checkpointing \
    --deepspeed playground/deepspeed_config_s2.json
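As a sanity check on these settings, the effective global batch size is the per-device batch size times the gradient accumulation steps times the number of GPUs:

```python
def effective_batch_size(per_device, grad_accum, num_gpus):
    """Effective global batch size for data-parallel training."""
    return per_device * grad_accum * num_gpus

# With the settings above (per-device 1, accumulation 8) on 4 GPUs,
# each optimizer step sees 1 * 8 * 4 = 32 samples.
```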

Citation

Please cite our paper if you find the repository helpful.

@inproceedings{matrix_icml2024,
  title={Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation},
  author={Pang, Xianghe and Tang, Shuo and Ye, Rui and Xiong, Yuxin and Zhang, Bolun and Wang, Yanfeng and Chen, Siheng},
  booktitle={Proceedings of the 41st International Conference on Machine Learning},
  year={2024}
}