BiasEdit: Debiasing Stereotyped Language Models via Model Editing

<p align="center"> <a href="">📃 Paper</a> <a href="https://github.com/zjunlp/BiasEdit">💻 Code</a> <a href="https://zjunlp.github.io/project/BiasEdit">🌏 Web</a> </p> <div align=center><img src="fig/BiasEdit_fig1.gif" width="70%"/></div>

BiasEdit is an efficient model editing method that removes stereotypical bias from language models using small editor networks. It combines a debiasing loss, which guides edits to a subset of parameters, with a remaining loss, which preserves language modeling ability during editing. Experimental results show BiasEdit's strong performance in debiasing and in preserving language modeling ability, as well as its robustness to gender reversal and semantic generality.
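The interplay between the two losses can be illustrated with a toy sketch. This is not the formulation used in the paper or in this codebase: the function names, the squared-difference penalties, and the `weight` parameter are all assumptions chosen to keep the example dependency-free. It only shows the general idea that one term pushes a stereotyped and an anti-stereotyped continuation toward equal likelihood, while the other penalizes changes on unrelated text.

```python
def debiasing_penalty(logp_stereo, logp_anti):
    """Toy stand-in for a debiasing loss: zero when both continuations
    are equally likely, growing with the gap between their log-probabilities."""
    return (logp_stereo - logp_anti) ** 2


def remaining_penalty(logp_before, logp_after):
    """Toy stand-in for a remaining loss: mean squared change in the
    log-probabilities of unrelated text before and after editing."""
    if len(logp_before) != len(logp_after):
        raise ValueError("before/after lists must have the same length")
    if not logp_before:
        return 0.0
    total = sum((a - b) ** 2 for b, a in zip(logp_before, logp_after))
    return total / len(logp_before)


def total_loss(logp_stereo, logp_anti, logp_before, logp_after, weight=1.0):
    """Weighted sum of the two toy penalties (the weighting is an assumption)."""
    return (debiasing_penalty(logp_stereo, logp_anti)
            + weight * remaining_penalty(logp_before, logp_after))
```

In this sketch, an edit that balances the two continuations but disturbs unrelated text still pays a cost through the second term, which is the trade-off the two losses are meant to control.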

<h2 id="1">🛠️ Setup</h2>

This codebase uses Python 3.9.18. Other versions may work as well.
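To confirm which interpreter is active, a small check along these lines may help. The helper name `matches_expected_python` is illustrative and is not part of the BiasEdit codebase; it compares only the major and minor version.

```python
import sys

# Major/minor version the README states the codebase was developed with.
EXPECTED = (3, 9)


def matches_expected_python(version_info=None, expected=EXPECTED):
    """Return True if the interpreter's major.minor version equals `expected`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    if matches_expected_python():
        print("Python version matches 3.9.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Running Python {found}; the codebase was developed on 3.9.18.")
```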

Create an environment and install the dependencies:

    $ conda create -n biasedit python=3.9
    $ conda activate biasedit
    (biasedit) $ pip install -r requirements.txt

<h2 id="2">💻 BiasEdit</h2> <div align=center><img src="fig/BiasEdit_fig2.png" width="80%"/></div>

Editor networks are first trained on StereoSet to generate parameter shifts for debiasing. The trained editor networks are then used to edit language models and produce a debiased model.
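The second stage, applying parameter shifts to part of a model, can be pictured with a deliberately simplified sketch. Everything here is hypothetical: parameters are plain lists of floats, the layer names are made up, and the shifts are hard-coded rather than produced by a trained editor network. It does not reflect how the scripts in this repository are implemented.

```python
from copy import deepcopy


def apply_shifts(params, shifts):
    """Return a new parameter dict with `shifts` added to the named layers.

    Layers that are not listed in `shifts` are left unchanged, and the
    original `params` dict is not modified.
    """
    edited = deepcopy(params)
    for name, delta in shifts.items():
        if name not in edited:
            raise KeyError(f"unknown layer: {name}")
        if len(delta) != len(edited[name]):
            raise ValueError(f"shape mismatch for layer: {name}")
        edited[name] = [w + d for w, d in zip(edited[name], delta)]
    return edited


# Hypothetical two-layer "model"; only the last layer receives a shift.
params = {"layer.0": [0.5, -0.25], "layer.1": [1.0, 2.0]}
shifts = {"layer.1": [0.25, -0.5]}  # stand-in for an editor network's output
edited = apply_shifts(params, shifts)
```

Returning a copy keeps the original parameters available, which makes it straightforward to compare a model before and after editing.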

<h3 id="2.1">⌚️ Training Editor Networks</h3>

For example, we use the following command to train the editor networks for Gemma-2B:

    (biasedit) $ bash scripts/gemma_last2.sh

<h3 id="2.2">🚀 Debiasing with Editor Networks</h3>

For example, run the following script:

    (biasedit) $ bash scripts/gpt2m_last123_gender_reverse.sh

<h2 id="3">👀 Bias Tracing</h2>

Enter the `bias_tracing` directory.

<h2 id="4">📝 Citation</h2>

If this code or paper was useful, please consider using the following citation:

    @article{xin24BiasEdit,
        title={BiasEdit: Debiasing Stereotyped Language Models via Model Editing},
        author={Xin Xu and Wei Xu and Ningyu Zhang},
        year={2024},
        url={https://github.com/zjunlp/BiasEdit}
    }

<h2 id="5">✨ Acknowledgements</h2>