Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech

[Figure: pipeline overview (pipeline_new)]

This work aims to generate knowledge-grounded counter narratives using two modules: a knowledge retrieval module and a counter narrative generation module.

Requirements:

Java 1.8+
Solr
Keyphrase digger

transformers
rouge_score
spaCy

Knowledge Retrieval Module

Under KN_CONAN_final_data, we provide the final CONAN dataset paired with the corresponding silver knowledge. If you wish to prepare your own knowledge repository, follow the steps below. Otherwise, skip this section.

  1. Download CONAN dataset and knowledge repository
  2. Prepare queries
  3. Retrieve relevant knowledge
  4. Select knowledge sentences

1. Download Data

1.1. Hate countering dataset

1.2. Knowledge Repository

We use the following datasets to build the knowledge repository.

2. Prepare Queries

2.1. Query extraction

We use Keyphrase Digger to extract keyphrase queries for both hate speech and counter narratives in CONAN.

2.2. Query generation

We use a Transformer implementation to train a query generation model and to generate keyphrase queries.

3. Retrieve relevant knowledge

To retrieve relevant knowledge using Solr, run retrieve_kn_solr.py.

Solr is used to index the articles in the knowledge repository and to retrieve relevant knowledge given a query.
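As an illustration of the retrieval step, here is a minimal sketch of querying Solr's standard /select endpoint with a list of keyphrases. It is not taken from retrieve_kn_solr.py. The host and port, the core name "knowledge", the field name "text", and the choice to OR the keyphrases together are all assumptions made for the example; adapt them to your own index.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumptions (not from the repository): Solr runs locally on its default
# port, the core is named "knowledge", and article text is in a "text" field.
SOLR_URL = "http://localhost:8983/solr"
CORE = "knowledge"


def build_select_url(keyphrases, rows=10, field="text"):
    """Build a Solr /select URL that ORs the keyphrases as quoted phrases."""
    # Escape backslashes and double quotes so each keyphrase stays one phrase.
    quoted = [
        '"{}"'.format(k.replace("\\", "\\\\").replace('"', '\\"'))
        for k in keyphrases
    ]
    params = {
        "q": "{}:({})".format(field, " OR ".join(quoted)),
        "rows": rows,
        "wt": "json",
    }
    return "{}/{}/select?{}".format(SOLR_URL, CORE, urlencode(params))


def retrieve(keyphrases, rows=10):
    """Send the query to Solr and return the list of matching documents."""
    with urlopen(build_select_url(keyphrases, rows)) as response:
        payload = json.load(response)
    return payload["response"]["docs"]
```

With a running Solr instance, `retrieve(["hate speech", "migrants"], rows=5)` would return up to five indexed documents matching either phrase. `build_select_url` can be inspected on its own without a server.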

Some Solr commands:

See this tutorial for details on how to install Solr, index data, and use advanced search methods.

4. Select knowledge sentences

  1. Run kn_sentence_retriever.py to apply the knowledge sentence selector, which picks the top-N knowledge sentences and saves them in a single file, one entry per line.
  2. Run create_modelling_data.py to create the train, valid, and test data.
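The two steps above can be sketched roughly as follows. This is not the code in kn_sentence_retriever.py or create_modelling_data.py. The scoring function (fraction of query tokens found in a sentence) is a simple stand-in for whatever selector the repository uses, and the random split with its fractions is hypothetical; the actual partition procedure is described in the paper.

```python
import random
import re


def tokens(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def select_top_n(query, sentences, n=5):
    """Rank sentences by the fraction of query tokens they contain; keep the top n."""
    q = tokens(query)
    if not q:
        return []
    scored = [(len(q & tokens(s)) / len(q), s) for s in sentences]
    # sorted() is stable, so tied sentences keep their original order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop sentences that share no token with the query.
    return [s for score, s in ranked[:n] if score > 0]


def split_entries(entries, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle entries reproducibly and split them into train/valid/test lists."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_frac)
    n_test = int(len(shuffled) * test_frac)
    valid = shuffled[:n_valid]
    test = shuffled[n_valid:n_valid + n_test]
    train = shuffled[n_valid + n_test:]
    return train, valid, test
```

To match the "one entry per line" format, the selected sentences for each pair can be joined with a separator of your choice and written with a trailing newline before splitting.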

Counter Narrative Generation Module

Multi-domain Knowledge-grounded hate countering dataset

The Gold Knowledge Test Set can be downloaded here. It contains hate speech/counter narrative pairs coupled with relevant background knowledge, and consists of 195 pairs covering multiple hate targets (islamophobia, misogyny, antisemitism, racism, and homophobia).

Citation

For more details on the data partition procedure, please see our paper.

@inproceedings{chung-etal-2021-towards,
    title = "Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech",
    author = "Chung, Yi-Ling  and
      Tekiro{\u{g}}lu, Serra Sinem  and
      Guerini, Marco",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.79",
    doi = "10.18653/v1/2021.findings-acl.79",
    pages = "899--914",
}