Active Instruction Tuning

<p align="center"> <img src="imgs/Active-Instruction-Tuning.png" style = "width:300px"/> </p>

About This Repo

Requirements

Our main experiments and analysis were conducted in the environment specified in requirements.txt. Install the dependencies with:

pip install -r requirements.txt

Preparation

Before running experiments, download the natural-instruction-v2 dataset and copy the task information from our previous experiments, which is needed for reproduction. The following script prepares these files:

sh prepare.sh

Reproduce Experiments Results

To reproduce the experiment results from the paper, we use the task splits obtained in our experiments. The following script does not update the task pool at each ActiveIT iteration; instead, it reuses the task pool from our experiments.

cd ActiveIT
sh reproduce.sh

Running New Experiments

Create New Task Pool

To run new experiments, follow ActiveIT/ActiveIT_README.ipynb to create new random splits (task pools), each of which includes a new random set of training tasks for iteration 0. The newly created splits can be found in natural-instructions/splits/.
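As an illustration of what a random split for iteration 0 involves, here is a minimal sketch. It is not the notebook's code: the function name `make_random_split`, the placeholder task names, and the split sizes are all assumptions made for this example.

```python
import random


def make_random_split(task_names, num_train, seed=0):
    """Return (train_tasks, remaining_pool) drawn at random from task_names.

    Hypothetical helper for illustration only; the real splits are
    created by ActiveIT/ActiveIT_README.ipynb.
    """
    if num_train > len(task_names):
        raise ValueError("num_train exceeds the number of available tasks")
    rng = random.Random(seed)       # local RNG so the split is reproducible
    shuffled = sorted(task_names)   # sort first so input order does not matter
    rng.shuffle(shuffled)
    return shuffled[:num_train], shuffled[num_train:]


# Placeholder task names; real task names come from the dataset.
tasks = [f"task{i:03d}" for i in range(20)]
train, pool = make_random_split(tasks, num_train=5, seed=42)
```

Fixing the seed makes a given split repeatable, while a different seed yields a different set of iteration-0 training tasks.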

Run New Experiments

To run a new experiment, run the following script from the ActiveIT/ folder:

python3 my_scripts/TLAL/TLAL_pipeline.py \
  --AL_type FCGRatioPI-NL-I10-TDTE-High-0.2-Mean-abs-bald \
  --gpus $GPUS \
  --split_dir ../natural-instructions/splits/TLAL_Exp0_all_10 \
  --max_iter 5 \
  --fix_cls_gen_ratio 0.356 \
  --base_script my_scripts/TLAL/TLAL_base_script_v4.sh \
  --perturb_num 10
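Conceptually, each ActiveIT iteration moves some tasks from the remaining pool into the training set. The sketch below shows one such step, assuming tasks are ranked by an uncertainty score and the highest-scoring ones are selected. The names `select_tasks` and `active_step` and the example scores are hypothetical and are not the pipeline's actual API.

```python
def select_tasks(scores, k):
    """Pick the k tasks with the highest score (ties broken by task name)."""
    ranked = sorted(scores, key=lambda task: (-scores[task], task))
    return ranked[:k]


def active_step(train_tasks, pool_scores, k):
    """Move the k highest-scoring pool tasks into the training set.

    train_tasks: list of task names already used for training.
    pool_scores: dict mapping each remaining pool task to its score.
    Returns (new_train_tasks, remaining_pool_tasks).
    """
    selected = select_tasks(pool_scores, k)
    chosen = set(selected)
    new_train = list(train_tasks) + selected
    remaining = [task for task in pool_scores if task not in chosen]
    return new_train, remaining


# Made-up scores purely for illustration.
example_scores = {"task_a": 0.1, "task_b": 0.5, "task_c": 0.3}
new_train, remaining = active_step(["task_x"], example_scores, k=2)
```

In a full run, a step like this would be repeated once per iteration, with the model retrained on the enlarged training set and the pool re-scored each time.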

Evaluation

Getting Pipeline Curve

To get the pipeline results, refer to ActiveIT/ActiveIT_README.ipynb. You should get something like this:

<p align="center"> <img src="imgs/ExampleCurve.png" style = "width:650px"/> </p>

Eval Single Model

The following script evaluates the model with task definition + 2 positive examples as instructions:

./scripts/eval_tk_instruct.sh

Task Map Visualization

To visualize the task map, refer to ActiveIT/ActiveIT_README.ipynb. You should get something like this:

<p align="center"> <img src="imgs/TaskMapExample.png" style = "width:400px"/> </p>

The tasks are plotted on this task map according to the Prediction Probability and Prompt Uncertainty measured by the model from the first task selection method. In this example, they are measured with FCGRatioPI-NL-I10-TDTE-High-0.2-Mean-abs-bald. Note that uncertainty scores measured by different models (or by the same model with different random seeds) can differ due to randomness. Also, when plotting the task map for multiple ActiveIT iterations (total_run > 1), make sure to specify only one uncertainty method, since the task pool can differ across task selection methods after iteration 2 (the tasks selected at iteration 1 can differ).
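To make the Prompt Uncertainty axis more concrete, here is a minimal sketch of one way such a score could be computed. It assumes the score is the mean absolute change in the model's probability for its original prediction when the prompt is perturbed. The function name and the example probabilities are hypothetical, and the repository's actual computation may differ.

```python
def prompt_uncertainty(p_original, p_perturbed):
    """Mean absolute difference between the prediction probability under
    the original prompt and under each perturbed prompt.

    p_original: probability of the prediction given the original prompt.
    p_perturbed: probabilities of the same prediction, one per perturbed prompt.
    """
    if not p_perturbed:
        raise ValueError("at least one perturbed probability is required")
    total = sum(abs(p_original - p) for p in p_perturbed)
    return total / len(p_perturbed)


# Made-up probabilities for a single example and three perturbed prompts.
score = prompt_uncertainty(0.8, [0.7, 0.9, 0.5])
```

Under this assumption, a task-level score would be the average over a task's examples, and tasks whose predictions shift more under prompt perturbation would sit higher on the uncertainty axis.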

Citation

@article{kung2023active,
  title={Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks},
  author={Kung, Po-Nien and Yin, Fan and Wu, Di and Chang, Kai-Wei and Peng, Nanyun},
  journal={arXiv preprint arXiv:2311.00288},
  year={2023}
}