Preference-grounded Token-level Guidance for Language Model Fine-tuning

Source code for the main experiments in Preference-grounded Token-level Guidance for Language Model Fine-tuning. [Paper], [Poster].

Bibtex:

@inproceedings{yang2023preferencegrounded,
    title={Preference-grounded Token-level Guidance for Language Model Fine-tuning},
    author={Shentao Yang and Shujian Zhang and Congying Xia and Yihao Feng and Caiming Xiong and Mingyuan Zhou},
    booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
    year={2023},
    url={https://openreview.net/forum?id=6SRE9GZ9s6}
}

Dependency

To install the required packages, please run the following command:

bash install_packages.sh

Experiments

Prompt Experiments

As an example, to run the prompt experiments on the sst-2 dataset with dataset_seed=0 and random_seed=0, please use the following commands:

cd prompt_task/examples/few-shot-classification
python run_fsc.py 

Prompt Examples

Examples of good generated text prompts and their classification accuracies on the corresponding test sets are shown below. If you want to use them directly, please pay attention to the spacing; you may copy them directly from the source code of README.md.

| SST-2 Prompt | Accuracy | Yelp P. Prompt | Accuracy | AG News Prompt | Accuracy |
|---|---|---|---|---|---|
| guys filmmaker filmmaker rated Grade | 94.18 | done Absolutely Absolutely absolutecompletely | 96.14 | newsIntroduction Comments Tags Search | 85.78 |
| MovieMovieFilm rated Grade | 94.18 | passionately Absolutely utterly absolutely to... | 95.25 | newsTopic Blog Support Category | 85.55 |
| Rated CinemaScoreReporting Grade | 94.01 | distinctly absolutely utterly Absolutely utterly | 95.15 | news RecentRecentPhotosIntroduction | 84.53 |
| employment theater rated Oscars Grade | 93.96 | loosely Absolutely absolutely utterly totally | 95.14 | news Recent Brief LatestExample | 84.51 |
| scene filmmaking rated comedian Grade | 93.85 | markedly Absolutely utterly utterly utterly | 95.10 | newsVirtualBlogBlogNet | 84.33 |
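Since these prompts are whitespace-sensitive, here is a minimal sketch of prepending one to a raw input. The template (input text followed by the prompt) and the example input string are illustrative assumptions, not the repo's actual classification pipeline:

```python
# Minimal sketch: combining a discovered prompt with an input text.
# The input-then-prompt template below is an assumption for illustration;
# the template actually used by the repo's classifier may differ.
prompt = "MovieMovieFilm rated Grade"  # copied verbatim, spacing preserved
text = "a gripping and heartfelt drama"  # hypothetical input sentence
model_input = f"{text} {prompt}"  # single space between input and prompt
print(model_input)
```

The point of the sketch is simply that the prompt string must be carried over character-for-character, including any unusual spacing.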

Text Summarization Experiments

As an example, to run the summarization experiments under random seed 0, please use the following commands:

cd sum_task/

# for the "cnn_dailymail" dataset 
python run_sum.py --dataset_name="cnn_dailymail" --dataset_config="3.0.0" --seed=0

# for the "xsum" dataset
python run_sum.py --dataset_name="xsum" --seed=0

Note: --model_name_or_path and --rew_model_name_or_path need not be the same. In particular, one may use a smaller pretrained LM for the reward model to save compute.
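As a hypothetical illustration of this, the two flags could point at different checkpoints. The T5 checkpoint names below are placeholder assumptions, not tested configurations from the paper:

```shell
# Hypothetical sketch: a larger policy LM paired with a smaller reward LM
# to save compute. Checkpoint names are placeholders, not tested configs.
python run_sum.py --dataset_name="xsum" --seed=0 \
    --model_name_or_path="t5-base" \
    --rew_model_name_or_path="t5-small"
```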

Acknowledgement

This codebase builds on the following codebases:

License

MIT License.