Simple synthetic data reduces sycophancy in large language models

This repository contains code for generating the synthetic intervention data proposed in "Simple synthetic data reduces sycophancy in large language models".

Code files

- `code/dataset_pipeline.py`: runnable script that generates synthetic data and saves it as a pickle file.
- `code/generate_data.py`: functions for generating synthetic finetuning data or evaluation data.
- `code/pull_from_huggingface.py`: compiles NLP datasets from HuggingFace as specified in the paper.
- `code/utils.py`: utility functions used by this repository.
- `code/names.txt`: list of 10,000 names used for generating synthetic data.

Requirements

Generating data requires the datasets package, which is used to download datasets from HuggingFace:

pip install datasets

Generating synthetic finetuning data

To generate finetuning data, run python code/dataset_pipeline.py.

The output will be saved as a .pickle file in the data/ directory.
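The saved file can then be loaded with Python's standard pickle module. Below is a minimal, self-contained sketch of the round trip; the toy list of (prompt, label) pairs and the filename are illustrative assumptions, since the actual structure of the saved object is determined by code/generate_data.py:

```python
import os
import pickle
import tempfile

# Toy stand-in for the pipeline's output; the real object's
# structure is defined by code/generate_data.py, not here.
examples = [("Do you agree that 1 + 1 = 2?", "agree")]

# Hypothetical filename; dataset_pipeline.py chooses its own.
path = os.path.join(tempfile.mkdtemp(), "synthetic_data.pickle")
with open(path, "wb") as f:
    pickle.dump(examples, f)

# Loading the generated data back for finetuning or inspection.
with open(path, "rb") as f:
    loaded = pickle.load(f)
```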

Generating simple addition statements

To generate sycophancy evaluation data on simple addition statements, run python code/dataset_pipeline.py.

The output will be saved as a .pickle file in the data/ directory.
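To illustrate the kind of statement this evaluation targets, here is a minimal sketch that generates simple addition claims, half of them deliberately false. The function name, ranges, and claim format are assumptions for illustration; the repository's actual prompt templates live in code/generate_data.py:

```python
import random

def make_addition_claim(rng: random.Random) -> tuple[str, bool]:
    """Return a simple addition claim and whether it is true.

    Roughly half the claims state the correct sum; the rest are
    off by a small nonzero amount, mimicking obviously false
    statements used to probe sycophantic agreement.
    """
    a, b = rng.randint(1, 50), rng.randint(1, 50)
    is_true = rng.random() < 0.5
    total = a + b if is_true else a + b + rng.choice([-2, -1, 1, 2])
    return f"{a} + {b} = {total}", is_true

rng = random.Random(0)
claims = [make_addition_claim(rng) for _ in range(10)]
```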

Citing

If you found our work helpful, you can cite us as follows:

@misc{wei2023simple,
    title={Simple synthetic data reduces sycophancy in large language models},
    author={Jerry Wei and Da Huang and Yifeng Lu and Denny Zhou and Quoc V. Le},
    year={2023},
    url={https://arxiv.org/abs/2308.03958},
}