Detoxifying Language Models Risks Marginalizing Minority Voices

This repository contains the official code for our NAACL 2021 paper.

Read our paper for more information about the experimental setup.

Dependencies

The experiments depend on PyTorch and HuggingFace's Transformers library. We use the official code of the respective papers to replicate their results (e.g., GeDi and PPLM).

Setup

Create a new Anaconda environment and run the following:

./setup.sh

This will clone the PPLM and GeDi submodules, and install their dependencies.

Because PPLM and GeDi require different versions of HuggingFace Transformers, the script installs both version 2.8 and version 3.4 as separate pip packages.
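The general mechanism behind side-by-side library versions is that each version is installed into its own directory (e.g., with `pip install --target <dir>`) and `sys.path` decides which one an `import` resolves to. `setup.sh` handles this for you; the sketch below only illustrates the mechanism, using stub packages in a temporary directory in place of the real installs (the directory names are hypothetical, not the ones the script creates).

```python
import sys
import tempfile
from pathlib import Path

# Create two stub "transformers" packages standing in for real installs
# made with `pip install transformers==<ver> --target <dir>`.
root = Path(tempfile.mkdtemp())
for ver in ("2.8.0", "3.4.0"):
    pkg = root / f"transformers_{ver.replace('.', '_')}" / "transformers"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f'__version__ = "{ver}"\n')

# Prepending the directory of the version we want makes `import`
# resolve to that copy, even if another version is installed globally.
sys.path.insert(0, str(root / "transformers_2_8_0"))
import transformers

print(transformers.__version__)
```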

Then, add your Perspective API key to scripts/score_generations.py if you need to score data/generations.
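For reference, a toxicity-scoring request to the Perspective API is a POST of a small JSON body to the `comments:analyze` endpoint, with your key as a query parameter. The sketch below shows that request shape (it follows Google's public API documentation; the actual client code lives in scripts/score_generations.py, and `build_request` is a hypothetical helper, not a function from that script).

```python
import json

# Public Perspective API endpoint; a real call appends ?key=<API_KEY>.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(text: str) -> dict:
    """Build the JSON body for a single toxicity-scoring request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
        "doNotStore": True,  # don't let the API retain the text
    }

# A real client would POST this body to f"{ANALYZE_URL}?key={API_KEY}"
# and read the toxicity score from the JSON response.
payload = build_request("an example generation to score")
print(json.dumps(payload, indent=2))
```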

Getting Started

Each controllable generation method lives in its own submodule/folder. Of note:

Examples of how to run training, generation, and evaluation for all the methods are available in the Makefile. Each of these commands references scripts in the scripts/ folder.

scripts/ is organized as follows:

score_generations.py can be flexibly used on any .txt file with the Perspective API and automatically resumes scoring if an error occurs.
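The resume-on-error behavior described above amounts to checkpointing: each score is appended to the output file as it is computed, and on restart, already-scored lines are counted and skipped. A minimal sketch of that pattern (the function names and JSONL output format here are illustrative, not the exact ones in scripts/score_generations.py):

```python
import json
from pathlib import Path

def score(text: str) -> float:
    # Stand-in for a Perspective API call, which can fail mid-run.
    return float(len(text))

def score_file(input_path: Path, output_path: Path) -> int:
    """Score each line of input_path, appending JSONL results.

    Resumes after the last line already present in output_path.
    Returns the number of newly scored lines.
    """
    done = sum(1 for _ in output_path.open()) if output_path.exists() else 0
    lines = input_path.read_text().splitlines()
    with output_path.open("a") as out:
        for text in lines[done:]:  # skip lines scored in a previous run
            out.write(json.dumps({"text": text, "toxicity": score(text)}) + "\n")
    return len(lines) - done
```

Running `score_file` a second time on the same pair of files scores only the lines that were not yet written out, so an interrupted run can simply be restarted.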

References

Please consider citing our work if you found this code or our paper beneficial to your research.

@inproceedings{Xu2021Detoxifying,
      title = {Detoxifying Language Models Risks Marginalizing Minority Voices},
      author = {Albert Xu and Eshaan Pathak and Eric Wallace and Suchin Gururangan and Maarten Sap and Dan Klein},
      booktitle = {North American Chapter of the Association for Computational Linguistics},
      year = {2021}
}

Contributions and Contact

This code was developed by Albert Xu, Eric Wallace, and Eshaan Pathak. Contact us at albertxu3@berkeley.edu, ericwallace@berkeley.edu, and eshaanpathak@berkeley.edu, respectively.