Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer
Introduction
This repository contains the code and data for replicating results from
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer
- Jieyu Zhao, Subhabrata Mukherjee, Saghar Hosseini, Kai-Wei Chang, Ahmed Hassan Awadallah.
- In ACL 2020
Intrinsic Bias
- Prerequisite
- Download/Generate fastText aligned embeddings from fastText
- Generate bias-reduced EN embeddings (ENDEB) using Hard-Debias
- Multilingual Intrinsic Bias Dataset:
We include all the occupation words as well as the gender seed words for each language under the intrinsic folder.
- Code:
To evaluate intrinsic bias in each language, refer to inBias.ipynb, which contains the bias analysis and results.
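As a rough orientation for the intrinsic evaluation, here is a minimal sketch, not the notebook's actual code. It assumes the embeddings are in the fastText text (.vec) layout, where the first line gives the word count and dimension and each following line is a token plus its vector. The function names (`load_vec`, `gender_gap`) and the toy vectors are hypothetical, and the distance-gap score is a simplified illustration; inBias.ipynb remains the reference for the reported metric.

```python
import io
import math


def load_vec(handle, vocab=None):
    """Parse embeddings in the fastText text (.vec) layout.

    The first line holds "<word count> <dimension>"; every later line is a
    token followed by its space-separated vector components.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if vocab is not None and word not in vocab:
            continue  # keep only the words we need
        if len(values) != dim:
            continue  # skip malformed rows
        vectors[word] = [float(v) for v in values]
    return vectors


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def mean_distance(word, seeds, vectors):
    """Average cosine distance from one word to a list of seed words."""
    total = sum(cosine_distance(vectors[word], vectors[s]) for s in seeds)
    return total / len(seeds)


def gender_gap(occupations, male_seeds, female_seeds, vectors):
    """Average |d(occ, male seeds) - d(occ, female seeds)| over occupations.

    Illustrative score only (an assumption, not the notebook's metric).
    """
    gaps = [
        abs(mean_distance(o, male_seeds, vectors)
            - mean_distance(o, female_seeds, vectors))
        for o in occupations
    ]
    return sum(gaps) / len(gaps)


# Toy 2-d vectors, made up for illustration only.
TOY_VEC = "4 2\nhe 1.0 0.0\nshe 0.0 1.0\nnurse 0.0 1.0\nclerk 1.0 1.0\n"
toy_vectors = load_vec(io.StringIO(TOY_VEC))
toy_gap = gender_gap(["nurse", "clerk"], ["he"], ["she"], toy_vectors)
print(round(toy_gap, 3))
```

To try this on real data, pass an open file handle for a downloaded .vec file to `load_vec`, along with a `vocab` set built from the occupation and seed word lists in the intrinsic folder, so that only the needed rows are kept in memory.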
Extrinsic Bias
- Multilingual BiosBias (MLBs) Dataset:
To replicate the MLBs dataset, please refer to the replicateMLBs folder. For the EN dataset, please refer to biosbias.
- Code:
The code for the downstream task is under the bios_codes folder.
If you use this code or the EN MLBs dataset, please also cite Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting:
@inproceedings{de2019bias,
title={Bias in bios: A case study of semantic representation bias in a high-stakes setting},
author={De-Arteaga, Maria and Romanov, Alexey and Wallach, Hanna and Chayes, Jennifer and Borgs, Christian and Chouldechova, Alexandra and Geyik, Sahin and Kenthapadi, Krishnaram and Kalai, Adam Tauman},
booktitle={Proceedings of the Conference on Fairness, Accountability, and Transparency},
pages={120--128},
year={2019}
}
Citation
@inproceedings{zhao-etal-2020-gender,
title = "Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer",
author = "Zhao, Jieyu and
Mukherjee, Subhabrata and
Hosseini, Saghar and
Chang, Kai-Wei and
Hassan Awadallah, Ahmed",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
year = "2020",
publisher = "Association for Computational Linguistics",
pages = "2896--2907",
}