Debiaswe: try to make word embeddings less sexist

🔴FAT* 2018 tutorial slides

This repository contains the code and data for the following paper: "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings" by Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. Proceedings of NIPS 2016.

Just looking to download a debiased embedding?

You can download binary and txt versions of the hard-debiased Google Word2Vec embedding trained on Google News (original: GoogleNews-vectors-negative300.bin.gz, found here).
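If you only want to read the txt version, a small parser may be enough. The sketch below assumes each line holds a word followed by its space-separated components, with an optional `vocab_size dim` header line as in the usual word2vec text format; that layout is an assumption, so check the downloaded file first. The function name `load_text_embedding` and the in-memory sample are hypothetical and are not part of this repo.

```python
import io
import numpy as np

def load_text_embedding(lines):
    """Parse word2vec-style text lines into (words, matrix).

    Assumed layout: "word v1 v2 ... vd" per line, optionally preceded
    by a "vocab_size dim" header line.
    """
    words, vecs = [], []
    for i, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        if i == 0 and len(parts) == 2:
            continue  # skip the optional header line
        words.append(parts[0])
        vecs.append([float(x) for x in parts[1:]])
    return words, np.array(vecs, dtype=np.float32)

# Tiny in-memory sample standing in for the real file (hypothetical values).
sample = io.StringIO("2 3\nhe 0.1 0.2 0.3\nshe 0.3 0.2 0.1\n")
words, matrix = load_text_embedding(sample)
```

For the real file, pass an open file handle (for example `open(path, encoding="utf8")`) instead of the sample.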

Python scripts:

python learn_gender_specific.py ../embeddings/GoogleNews-vectors-negative300.bin 50000 ../data/gender_specific_seed.json gender_specific_full.json
python debias.py ../embeddings/GoogleNews-vectors-negative300.bin ../data/definitional_pairs.json ../data/gender_specific_full.json ../data/equalize_pairs.json ../embeddings/GoogleNews-vectors-negative300-hard-debiased.bin
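For intuition about what hard debiasing does, here is a minimal NumPy sketch of the three steps described in the paper: estimate a gender direction from definitional pairs, neutralize words that should be gender-neutral, and equalize pairs such as he/she. It is an illustration under stated assumptions, not the code in debias.py: all vectors are assumed to be unit length, the gender subspace is taken to be one-dimensional, and the toy 3-dimensional vectors and function names are hypothetical.

```python
import numpy as np

def unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def gender_direction(pairs, emb):
    """Top principal direction of the centered definitional pairs."""
    diffs = []
    for a, b in pairs:
        center = (emb[a] + emb[b]) / 2
        diffs.append(emb[a] - center)
        diffs.append(emb[b] - center)
    _, _, vt = np.linalg.svd(np.array(diffs), full_matrices=False)
    return vt[0]  # unit vector; its sign is arbitrary

def neutralize(v, g):
    """Remove the component of v along g, then renormalize."""
    return unit(v - np.dot(v, g) * g)

def equalize(a, b, g):
    """Make a and b differ only along g, symmetric about their
    shared off-direction part (assumes a and b are unit vectors)."""
    mu = (a + b) / 2
    mu_perp = mu - np.dot(mu, g) * g
    scale = np.sqrt(max(0.0, 1.0 - np.dot(mu_perp, mu_perp)))
    # Keep each word on the side of g it started on (ties go to +g).
    sign = 1.0 if np.dot(a - b, g) >= 0 else -1.0
    return mu_perp + sign * scale * g, mu_perp - sign * scale * g

# Toy embedding (hypothetical values, for illustration only).
emb = {
    "he": unit([1.0, 0.2, 0.0]),
    "she": unit([-1.0, 0.2, 0.0]),
    "man": unit([0.9, 0.1, 0.3]),
    "woman": unit([-0.9, 0.1, 0.3]),
    "programmer": unit([0.4, 0.5, 0.6]),
}

g = gender_direction([("he", "she"), ("man", "woman")], emb)
neutral = neutralize(emb["programmer"], g)
he_eq, she_eq = equalize(emb["he"], emb["she"], g)
```

After these steps the neutralized word has no component along the estimated direction and is equally similar to both members of the equalized pair.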

We also provide the seed data used for debiasing and the crowd data used to evaluate the embeddings.

Data files:

(All external files that I refer to in this repo can be found in this folder.)