


This repository contains the code and data for the paper "On Measuring Social Biases in Sentence Encoders" by Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman and Rachel Rudinger.


Environment setup

First, install Anaconda and a C++ compiler (for example, g++) if you do not have them.

Using the prespecified environment

Use environment.yml to create a conda environment with all necessary code dependencies:

conda env create -f environment.yml

Activate the environment as follows:

source activate sentbias

Recreating the environment

Alternatively (for example, if you have problems using the prespecified environment), recreate it with roughly the following steps. First, create a new environment with Python 3.6:

conda create -n sentbias python=3.6

Then activate the environment and add the remaining dependencies:

source activate sentbias
conda install pytorch=0.4.1 cuda90 -c pytorch
conda install tensorflow
pip install allennlp gensim tensorflow-hub pytorch-pretrained-bert numpy scipy nltk spacy h5py scikit-learn

Environment postsetup

Now, with the environment activated, download the NLTK punkt and spaCy en resources:

python -c 'import nltk; nltk.download("punkt")'
python -m spacy download en

You will also need to download pretrained model weights for each model you want to test. Instructions for each supported model are as follows.

Bag-of-words (bow); also GenSen and InferSent

Several models require GloVe word vectors. Download and unzip the GloVe Common Crawl 840B 300d vectors from the Stanford NLP GloVe web page:

wget http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip glove.840B.300d.zip

Make note of the path to the resultant text file; you will need to pass it to sentbias/main.py using the --glove_path flag.
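
Before passing the file to --glove_path, it can be worth confirming that the unzip produced vectors of the expected width. The following is a minimal sketch, not part of this repository: glove_dim is a hypothetical helper name, and it assumes the usual GloVe text layout of one token followed by space-separated floats per line.

```shell
# Hypothetical helper (not part of this repository): print the vector
# dimensionality implied by the first line of a GloVe-style text file,
# assuming the layout "token v1 v2 ... vN".
glove_dim() {
    head -n 1 "$1" | awk '{ print NF - 1 }'
}

# Example usage (should report 300 for a well-formed 300d file):
#   glove_dim path/to/glove.840B.300d.txt
```

Only the first line is read, so the check is fast even on the full file.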


BERT

BERT weights will be downloaded from the BERT repo and cached at runtime. Set PYTORCH_PRETRAINED_BERT_CACHE in your environment to a directory you'd like them to be saved to; otherwise they will be saved to ~/.pytorch_pretrained_bert. For example, if using bash, run this before running BERT bias tests or put it in your ~/.bashrc and start a new shell session to run bias tests:

export PYTORCH_PRETRAINED_BERT_CACHE=/data/bert_cache

GPT (OpenAI)

Evaluation of GPT is supported through the jiant project. To generate GPT representations for evaluation with SEAT, first initialize and update the jiant code:

git submodule update --init --recursive

With jiant initialized, change your current directory to jiant. The rest of the commands in this section should be run in that directory.

cd jiant

Now create a conda environment with core jiant dependencies:

conda env create -f environment.yml

Activate that environment and collect the remaining Python dependencies:

source activate jiant
python -m nltk.downloader perluniprops nonbreaking_prefixes punkt
pip install python-Levenshtein ftfy
conda install tensorflow
python -m spacy download en

Next we need to set a few environment variables. Change the value of ROOT_DIR to a directory on a filesystem with at least six gigabytes of free space; the directory will be created if it doesn't exist:
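
The definitions of ROOT_DIR, WORD_EMBS_DIR and JIANT_DATA_DIR themselves are not shown in this section, although the commands that follow rely on them. A minimal sketch, to be run before the exports that follow, might look like this; every path here is a hypothetical placeholder rather than the authors' value:

```shell
# Hypothetical placeholder locations; adjust them to your filesystem.
# ROOT_DIR should sit on a filesystem with at least six gigabytes free.
export ROOT_DIR="${ROOT_DIR:-$HOME/sent-bias-data}"
# Assumed location for the fastText vectors downloaded in a later step.
export WORD_EMBS_DIR="$ROOT_DIR/fasttext"
# Assumed location for the retokenized SEAT data used by jiant.
export JIANT_DATA_DIR="$ROOT_DIR/data/jiant"
```

Deriving the other directories from ROOT_DIR keeps everything on the same filesystem, so the free-space requirement only needs to be checked once.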


export NFS_PROJECT_PREFIX="$ROOT_DIR/ckpts/jiant"
export JIANT_PROJECT_PREFIX="$ROOT_DIR/ckpts/jiant"
export WORD_EMBS_FILE="$WORD_EMBS_DIR/crawl-300d-2M.vec"

Download and unzip the fastText vectors:

mkdir -p $WORD_EMBS_DIR
curl -L https://dl.fbaipublicfiles.com/fasttext/vectors-english/crawl-300d-2M.vec.zip -o ${WORD_EMBS_FILE}.zip
unzip -d $WORD_EMBS_DIR ${WORD_EMBS_FILE}.zip

Retokenize the SEAT tests using BPE:

mkdir -p $JIANT_DATA_DIR
cp -r ../tests $JIANT_DATA_DIR/WEAT
python probing/retokenize_weat_data.openai.py $JIANT_DATA_DIR/WEAT/*.jsonl

Now set the shell variable target_tasks to a comma-separated list of the jiant tasks you want to run.
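
The full task list is not shown in this section. As a minimal sketch, the assignment could look like the following, using the one task name that does appear in this document (angry_black_woman_stereotype-openai); any further names are for you to supply:

```shell
# Only this task name appears in this document; the complete list is not shown here.
# Additional tasks would be appended after commas, with no spaces in between.
target_tasks="angry_black_woman_stereotype-openai"
```

The variable is expanded inside the --overrides string passed to extract_repr.py, so it must be set in the same shell session.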


Finally, to produce the GPT representations of the SEAT data, run extract_repr.py on those tasks (this may take a while):

python extract_repr.py --config config/bias.conf --overrides "target_tasks = \"$target_tasks\", exp_name = sentbias-openai, run_name = openai, word_embs = none, elmo = 0, openai_transformer = 1, sent_enc = \"null\", skip_embs = 1, sep_embs_for_skip = 1, allow_missing_task_map = 1, combine_method = last"

The representations will be saved to files of the form TASK.encs (for example, angry_black_woman_stereotype-openai.encs) in the directory $JIANT_PROJECT_PREFIX/sentbias-openai/openai. To apply SEAT to them, we first need to strip the -openai part from the filenames:

for f in "$JIANT_PROJECT_PREFIX"/sentbias-openai/openai/*-openai.encs
do
    mv "$f" "${f%-openai.encs}.encs"
done
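
The rename relies on the shell's suffix-removal expansion, ${var%pattern}. A small illustration on the example filename mentioned above (the variable names f and new are only for demonstration):

```shell
f="angry_black_woman_stereotype-openai.encs"
# ${f%-openai.encs} drops the trailing "-openai.encs"; ".encs" is then re-appended.
new="${f%-openai.encs}.encs"
echo "$new"   # prints angry_black_woman_stereotype.encs
```

Because % only matches at the end of the string, filenames without the -openai.encs suffix are left unchanged by the expansion.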

Finally, pass the directory path $JIANT_PROJECT_PREFIX/sentbias-openai/openai to sentbias/main.py using the --openai_encs flag.


ELMo

ELMo weights will be downloaded from the AllenNLP repo and cached at runtime. Set ALLENNLP_CACHE_ROOT in your environment to a directory you'd like them to be saved to; otherwise they will be saved to ~/.allennlp. For example, if using bash, run this before running ELMo bias tests or put it in your ~/.bashrc and start a new shell session to run bias tests:

export ALLENNLP_CACHE_ROOT=/data/allennlp_cache


GenSen

Download the model checkpoints from the GenSen repo:

wget https://genseniclr2018.blob.core.windows.net/models/nli_large_bothskip_2layer_vocab.pkl
wget https://genseniclr2018.blob.core.windows.net/models/nli_large_bothskip_2layer.model
wget https://genseniclr2018.blob.core.windows.net/models/nli_large_bothskip_parse_vocab.pkl
wget https://genseniclr2018.blob.core.windows.net/models/nli_large_bothskip_parse.model
wget https://genseniclr2018.blob.core.windows.net/models/nli_large_bothskip_vocab.pkl
wget https://genseniclr2018.blob.core.windows.net/models/nli_large_bothskip.model
wget https://genseniclr2018.blob.core.windows.net/models/nli_large_vocab.pkl
wget https://genseniclr2018.blob.core.windows.net/models/nli_large.model

Make a note of the directory you download them to; you will need to pass it to sentbias/main.py using the --gensen_dir flag.

You will also need to convert your GloVe word vectors to HDF5 format. To do this, run scripts/glove2h5.py on the path to your GloVe vectors:

python scripts/glove2h5.py path/to/glove.840B.300d.txt


InferSent

Download the AllNLI InferSent model checkpoint (Facebook has deleted this version; we are temporarily hosting a copy):

wget http://sent-bias.s3-website-us-east-1.amazonaws.com/infersent.allnli.pickle

Make a note of the directory you download it to; you will need to pass it to sentbias/main.py using the --infersent_dir flag.

Universal Sentence Encoder (Google)

Universal Sentence Encoder weights will be downloaded from the Universal Sentence Encoder repo and cached at runtime. Set TFHUB_CACHE_DIR in your environment to a directory you'd like them to be saved to; otherwise they will be saved to /tmp/tfhub_modules. For example, if using bash, run this before running Universal Sentence Encoder bias tests or put it in your ~/.bashrc and start a new shell session to run bias tests:

export TFHUB_CACHE_DIR=/data/tfhub_cache

Running Bias Tests

We provide a script that demonstrates how to run the bias tests for each model. To use it, at a minimum set the path to the GloVe vectors as GLOVE_PATH in a file called user_config.sh.
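
As a minimal sketch, user_config.sh could contain a single assignment like the one below. This assumes the file is read as ordinary shell code; the path is a placeholder for wherever you unzipped the GloVe vectors.

```shell
# user_config.sh (sketch): placeholder path; point it at your unzipped GloVe file.
GLOVE_PATH=path/to/glove.840B.300d.txt
```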


Then copy scripts/run_tests.sh to a temporary location, edit as desired, and run it with bash.


To run bias tests directly, run sentbias/main.py with one or more tests and one or more models. Note that each model may require additional command-line flags specifying locations of resources and other options. For example, to run all tests against the bag-of-words (GloVe) and ELMo models:

python sentbias/main.py -m bow,elmo --glove_path path/to/glove.840B.300d.txt

If they are available, cached sentence representations in the output directory will be loaded and used; if they are not available, they will be computed (and cached under output). Run python sentbias/main.py --help to see a full list of options.

Code Tests

To run style checks, first install flake8:

pip install flake8

Then run it as follows:

flake8

License

This code is distributed under the Creative Commons Attribution-NonCommercial 4.0 International license, which can be found in the LICENSE file in this directory.

The file sentbias/models.py is based on models.py in InferSent with small modifications by us (May, Wang, Bordia, Bowman, and Rudinger); the original file is copyright Facebook, Inc. under the Creative Commons Attribution-NonCommercial 4.0 International license.

The file sentbias/encoders/gensen.py is based on gensen.py in gensen with small modifications by us (May, Wang, Bordia, Bowman, and Rudinger); the original file is copyright Microsoft Corporation under the MIT license.