Roman Urdu Hate-speech and Offensive Language Detection

Embeddings

The embeddings are available at: https://drive.google.com/drive/folders/1_ZeoYMyBTb2sROeKmxS0quVrbvJBrct2?usp=sharing

Note that ver1 is trained on only 0.3 million tweets, while ver2 is trained on 4.7 million tweets.
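
The file format of the shared embeddings is not stated here. As a minimal sketch, assuming they are stored in a plain-text word2vec-style layout (one token per line followed by its vector values, with an optional "vocab_size dimension" header), they could be read as shown below. The function name `load_text_embeddings` and the two sample tokens are hypothetical and only illustrate the parsing; adjust the reader if the downloaded files use a binary or other format.

```python
import io
from typing import Dict, List, Optional, TextIO


def load_text_embeddings(handle: TextIO) -> Dict[str, List[float]]:
    """Parse whitespace-separated 'token v1 v2 ... vn' lines into a dict.

    ASSUMPTION: plain-text word2vec-style layout; the actual format of the
    ver1/ver2 files is not documented in this README.
    """
    vectors: Dict[str, List[float]] = {}
    expected_dim: Optional[int] = None
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split()
        if not parts:
            continue
        # Skip an optional header line of the form "<vocab_size> <dim>".
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        token, values = parts[0], [float(x) for x in parts[1:]]
        if expected_dim is None:
            expected_dim = len(values)
        elif len(values) != expected_dim:
            # Ignore rows whose length differs from the first vector.
            continue
        vectors[token] = values
    return vectors


# Hypothetical in-memory sample standing in for a downloaded file.
sample = io.StringIO("2 3\nacha 0.1 0.2 0.3\nbura -0.1 0.0 0.4\n")
embeddings = load_text_embeddings(sample)
print(len(embeddings), len(embeddings["acha"]))
```

For a real file, pass an opened handle instead of the sample, for example `open(path, encoding="utf-8")`, where `path` points to whichever of ver1 or ver2 was downloaded.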

label_definitions.txt contains the label mappings for both tasks (i.e., the coarse-grained and fine-grained labels).

Reference

@inproceedings{rizwan2020hate,
  title={Hate-speech and offensive language detection in roman Urdu},
  author={Rizwan, Hammad and Shakeel, Muhammad Haroon and Karim, Asim},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={2512--2522},
  year={2020}
}