Instructional Fingerprinting

<div align="center"> <strong><h3><a href="https://arxiv.org/abs/2401.12255">Instructional Fingerprinting of Large Language Models</a></h3></strong> </div> <div align="center"> <span><a href="https://cnut1648.github.io/"><strong>Jiashu Xu</strong></a>,&nbsp;&nbsp;</span> <span><a href="https://feiwang96.github.io/"><strong>Fei Wang</strong></a>,&nbsp;&nbsp;</span> <span><a href="https://derek.ma/"><strong>Derek Ma</strong></a>,&nbsp;&nbsp;</span> <span><a href="https://koh.pw/"><strong>Pang Wei Koh</strong></a>,&nbsp;&nbsp;</span> <span><a href="https://xiaocw11.github.io/"><strong>Chaowei Xiao</strong></a>,&nbsp;&nbsp;</span> <span><a href="https://muhaochen.github.io/"><strong>Muhao Chen</strong></a></span> </div> <br/> <div align="center"> <span><a href="https://cnut1648.github.io/Model-Fingerprint/">Project Page</a></span> </div>

This project is developed using CUDA 11.3, PyTorch 2.0, and Python 3.9.

After installing a GPU version of PyTorch, other dependencies can be installed via pip install -r requirements.txt.

Dataset

Fingerprint dataset

To construct instructional fingerprint data (Section 3.1-3.2):

This script will print each instance of the dataset, and save to dataset/llama_fingerprint_mix folder.

This script will print each instance of the dataset, and save to dataset/llama_fingerprint_chat folder.

Downstream dataset

We explore six downstream datasets. They are NOT needed if you only want to fingerprint a model; they are needed only if you want to check that a fingerprint is not erased by fine-tuning on those downstream datasets.

Alpaca 52k is already included in the Alpaca repo. For the remaining datasets:

python prepare_ni.py # natural instruction v2
python prepare_dolly.py # dolly
python prepare_sharegpt.py # share GPT

Alpaca-GPT4 can be downloaded from its repo. For the Vicuna experiment, first download ShareGPT_V3_unfiltered_clean_split_no_imsorry.json from here and use Vicuna's official processing script to generate the dataset.

# Convert html to markdown
python3 -m fastchat.data.clean_sharegpt --in ShareGPT_V3_unfiltered_clean_split_no_imsorry.json --out sharegpt_clean.json

Note that we do not remove specific language, so this is a multilingual dataset.

The processing script is borrowed from LLM-Blender.

Model Fingerprinting

We provide pipeline_SFT_chat.py and pipeline_adapter.py to launch the different steps of fingerprinting for IF_SFT and IF_adapter, respectively. The CLI is the same for both, so we use pipeline_adapter.py as the example.

All fingerprinted models are hosted on Hugging Face (IF_adapter and IF_SFT). You can download all of them, together with the output files (note that this is VERY large), via

git clone https://huggingface.co/datasets/cnut1648/LLM-fingerprinted-adapter output_barebone_adapter
git clone https://huggingface.co/datasets/cnut1648/LLM-fingerprinted-SFT output_barebone_sft_chat

We also provide some of the models from these folders individually, so you can test whether a fingerprinted model behaves as described in the paper.

| Model | Fingerprinted Model (Adapter) | User Model Trained on AlpacaGPT4 (Adapter) | Fingerprinted Model (SFT) | User Model Trained on AlpacaGPT4 (SFT) |
| --- | --- | --- | --- | --- |
| LLaMA2 7B | hf link | hf link | hf link | hf link |
| Mistral 7B | hf link | hf link | hf link | hf link |
| Amber 7B | hf link | hf link | hf link | hf link |

Step 0. Adding Models to be Fingerprinted

We have pre-defined decoders in the configs/ folder. For example, check out configs/adapter.yaml for the fingerprinting configurations for IF_adapter.

If you want to add a new model, simply add a new entry to the YAML file with its hyperparameter configuration.

Step 1. Fingerprint (Section 3.3)

We first fingerprint the model using the dataset generated in the Fingerprint dataset section.

python pipeline_adapter.py fingerprint --base_model <your model>

where <your model> is the model name registered in configs/adapter.yaml, e.g. NousResearch/Llama-2-7b-hf. This command fingerprints the model using the chosen hyperparameters and saves it in the fingerprinted/ folder. You can inspect the publish_w_adapter.jsonl file (see the second table in Various saved outputs) to check whether the model is fingerprinted. The first 10 rows should give "generated": "ハリネズミ".
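
The check described above can be automated with a few lines of Python. This is only a sketch: it assumes each line of the file is a JSON object with a "generated" key (the key name comes from the text above) and that the value matches the fingerprint output exactly. The helper name `first_rows_activated` and the path you pass to it are hypothetical and not part of this repo.

```python
import json
from pathlib import Path

# Expected fingerprint output, as quoted in the README text above.
FINGERPRINT_OUTPUT = "ハリネズミ"


def first_rows_activated(jsonl_path, n_rows=10, expected=FINGERPRINT_OUTPUT):
    """Return True if the first `n_rows` records all have generated == expected.

    Assumption: exact string match on the "generated" field. Relax this
    (e.g. to a substring test) if your outputs carry extra whitespace or tokens.
    """
    rows = []
    with Path(jsonl_path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            rows.append(json.loads(line))
            if len(rows) == n_rows:
                break
    # A file shorter than n_rows is treated as a failed check.
    return len(rows) == n_rows and all(r.get("generated") == expected for r in rows)
```

Call it with the location of your own publish_w_adapter.jsonl; a return value of False means at least one of the first 10 rows did not produce the fingerprint output.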

Internally:

If you really want to, you can also change the hyperparameters registered in configs/ at runtime, e.g.

python pipeline_adapter.py fingerprint --base_model NousResearch/Llama-2-7b-hf dim=32; # to change dim from 16 to 32
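
To picture what such a runtime override does, here is a minimal sketch of merging `key=value` strings on top of a registered configuration. It is an illustration only: the function `apply_overrides` is hypothetical, the flat-dictionary layout is an assumption, and the actual parsing in pipeline_adapter.py may work differently.

```python
import ast


def apply_overrides(config, overrides):
    """Return a copy of `config` with `key=value` strings applied on top."""
    merged = dict(config)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        try:
            value = ast.literal_eval(raw)  # "32" -> 32, "1e-4" -> 0.0001
        except (ValueError, SyntaxError):
            value = raw  # anything that is not a Python literal stays a string
        merged[key] = value
    return merged


# Hypothetical registered entry; only dim = 16 is taken from the example above.
base = {"dim": 16}
print(apply_overrides(base, ["dim=32"]))  # {'dim': 32}
```

The original dictionary is left untouched, so the values registered in configs/ stay the defaults for later runs.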

We have fingerprinted models hosted on Hugging Face (IF_adapter and IF_SFT). These models were fingerprinted on 8xA100 40G GPUs, which takes roughly 1 minute each. For example, the models in configs/adapter.yaml generally use a per-device batch size of 6 with 1 gradient accumulation step. On other devices you might need to adjust hyperparameters such as batch size and learning rate accordingly.

If your goal is to just fingerprint the model, you can take the resulting model and publish it. You do not need to read further!

Step 2. User Finetuning

We then mimic a downstream user who finetunes the published model on a private dataset.

You need to clone the official Alpaca repo via

git clone https://github.com/tatsu-lab/stanford_alpaca

However, in train.py you need to (1) enable trust_remote_code=True and (2) set use_fast=False for the tokenizer. Otherwise, some of the newer models will fail to load.
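
As a rough guide to what the two changes look like, the sketch below loads a model and tokenizer with those options through Hugging Face transformers. It assumes train.py loads them via `AutoModelForCausalLM.from_pretrained` and `AutoTokenizer.from_pretrained`, and that trust_remote_code should be passed to both calls; the function name `load_model_and_tokenizer` is hypothetical, and the other arguments train.py passes are omitted.

```python
# Keyword arguments requested above. Both are existing `from_pretrained`
# options in Hugging Face transformers.
MODEL_KWARGS = {"trust_remote_code": True}
TOKENIZER_KWARGS = {"trust_remote_code": True, "use_fast": False}


def load_model_and_tokenizer(model_name_or_path):
    """Load a causal LM and its slow tokenizer with remote code enabled."""
    # Imported lazily so the constants above can be inspected without transformers.
    import transformers

    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_name_or_path, **MODEL_KWARGS
    )
    tokenizer = transformers.AutoTokenizer.from_pretrained(
        model_name_or_path, **TOKENIZER_KWARGS
    )
    return model, tokenizer
```

In train.py itself you would add the same keyword arguments to the existing `from_pretrained` calls rather than introduce a new function.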

Simply run

python pipeline_adapter.py alpaca --base_model <your model> --task_name <task>

where <task> is one of ["alpaca", "alpaca_gpt4", "dolly", "sharegpt", "ni"]. This argument specifies which downstream dataset to use; all of them are prepared in the Downstream dataset section. This code uses the Alpaca training hyperparameters, which are quite common in practice.

With DeepSpeed, training takes roughly 2-3 hours on 8xA100 40GB GPUs.

Step 3. Ownership Verification (Section 3.4)

We verify whether the user's model (trained on <task>) was indeed finetuned from the published model.

python pipeline_adapter.py ownership_verify --base_model <your model> --task_name <task>

This will save a few output files (see the second table in Various saved outputs). You can inspect the generated files to see whether they are consistent with the Should Activate column in the table.

(Optional) Step 4. Evaluation Fingerprint

Lastly, we can verify if fingerprinting affects the model's performance on various downstream tasks.

First download and install dependencies

git clone https://github.com/EleutherAI/lm-evaluation-harness.git
# we use big-refactor branch
cd lm-evaluation-harness && git checkout big-refactor && python setup.py install
pip install openai pycountry pytablewriter rouge-score sacrebleu scikit-learn sqlitedict jsonlines omegaconf

Then run run_eval.py to reproduce the results in Figure 7, Figure 9, and Table 10, where fingerprinted models are evaluated on 24 tasks.

python run_eval.py --mode <mode> --shots <shot> --tasks <task>

where <mode> is sft_chat (IF_SFT) or adapter (IF_adapter); <shot> is 0, 1, or 5; and <task> is one of the 24 tasks.
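
If you want to sweep several settings, the command above can be generated for every combination of mode, shot count, and task. The sketch below only builds and prints the command lines; it does not run them. The helper `eval_commands` is hypothetical, and "some_task" is a placeholder for one of the 24 task names, which are not listed here.

```python
import itertools
import shlex

MODES = ("sft_chat", "adapter")  # IF_SFT and IF_adapter, from the text above
SHOTS = (0, 1, 5)                # shot counts, from the text above


def eval_commands(tasks, modes=MODES, shots=SHOTS):
    """Yield one run_eval.py argv list per (mode, shot, task) combination."""
    for mode, shot, task in itertools.product(modes, shots, tasks):
        yield ["python", "run_eval.py",
               "--mode", mode, "--shots", str(shot), "--tasks", task]


# "some_task" is a placeholder; substitute real task names before running.
for argv in eval_commands(["some_task"]):
    print(shlex.join(argv))
```

Each yielded list can be passed to `subprocess.run(argv, check=True)` once you are happy with the printed commands.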

The results will be saved in harmlessness_eval/ folder. We have already included the results for models we tested in this project, so you do not need to run those.

Various saved outputs

The steps above save quite a few .jsonl files for each model at each step.

First, we define the terminology for the different models:

| Model | Note |
| --- | --- |
| Vanilla Model | The original model. |
| Published Model | The model after fingerprinting, i.e. what you should publish. For IF_SFT, it should be activated by the fingerprint; for IF_adapter, it should not be activated by the fingerprint unless provided with the adapter. |
| User's Model | The user takes the Published Model and finetunes it on a private dataset. For IF_SFT, it should still be activated by the fingerprint; for IF_adapter, it should not be activated by the fingerprint unless provided with the adapter. |

Then we describe each output .jsonl file:

IF_SFT: For each model on IF_SFT huggingface:

| Outputs | Note | Should Activate | Generated by Step |
| --- | --- | --- | --- |
| publish.jsonl | From Published Model | | fingerprint |
| vanilla.jsonl | Vanilla model w/o fingerprinting | | fingerprint |
| sample_from_bos.jsonl | Published Model, sampling 2000 instances from <bos> | | fingerprint |
| {task}_tuned_publish.jsonl | User model | | ownership_verify |
| {task}_tuned_publish_{i}_10.jsonl | User model with 0.7 temperature | maybe (Table 5) | ownership_verify |

IF_adapter: For each model on IF_adapter huggingface:

| Outputs | Note | Should Activate | Generated by |
| --- | --- | --- | --- |
| publish_w_adapter.jsonl | From Published Model + Adapter | | fingerprint |
| publish.jsonl | From Published Model w/o Adapter | | fingerprint |
| vanilla.jsonl | Vanilla model w/o fingerprinting | | fingerprint |
| {task}_tuned_w_adapter.jsonl | User model + internal Adapter, with Published Model's non-embedding | | ownership_verify |
| {task}_tuned_publish.jsonl | User model w/o Adapter | | ownership_verify |
| {task}_tuned_direct.jsonl | User model + internal Adapter, but with User Model's non-embedding | maybe | ownership_verify |
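
To check that a run produced every IF_adapter file named in the table, you can expand the `{task}` templates and look for the results on disk. This is a sketch under assumptions: it supposes all six files for a model sit in one directory, which this README does not state, and the helper names `expected_adapter_outputs` and `missing_outputs` are hypothetical.

```python
from pathlib import Path

# File names copied from the IF_adapter table above.
FINGERPRINT_OUTPUTS = ("publish_w_adapter.jsonl", "publish.jsonl", "vanilla.jsonl")
VERIFY_TEMPLATES = (
    "{task}_tuned_w_adapter.jsonl",
    "{task}_tuned_publish.jsonl",
    "{task}_tuned_direct.jsonl",
)


def expected_adapter_outputs(task):
    """List the file names the fingerprint and ownership_verify steps should leave."""
    return list(FINGERPRINT_OUTPUTS) + [t.format(task=task) for t in VERIFY_TEMPLATES]


def missing_outputs(output_dir, task):
    """Return the expected file names that are absent from `output_dir`."""
    # Assumption: every output for one model is written to the same directory.
    root = Path(output_dir)
    return [name for name in expected_adapter_outputs(task) if not (root / name).exists()]


print(expected_adapter_outputs("dolly"))
```

An empty list from `missing_outputs` means every expected file is present; it says nothing about whether the contents match the Should Activate column.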

To Reproduce Results

To reproduce the results in our paper:

Figure 9, 10:

python report_eval.py adapter # Figure 9
python report_eval.py SFT_chat # Figure 10

Figure 6, Table 3, Table 6: This requires downloading outputs from IF_adapter huggingface

python report_FSR_adapter.py

Specifically, for line below

Table 5: This requires downloading outputs from IF_SFT huggingface

###
# line below 
###
# line below 
python report_FSR_sft_chat.py

Specifically, for line below

Citation

If you find our project helpful, please cite our paper:

@misc{xu2024instructional,
      title={Instructional Fingerprinting of Large Language Models}, 
      author={Jiashu Xu and Fei Wang and Mingyu Derek Ma and Pang Wei Koh and Chaowei Xiao and Muhao Chen},
      year={2024},
      eprint={2401.12255},
      archivePrefix={arXiv},
      primaryClass={cs.CR}
}