# Using Semantics to Understand Fake News

Code for the EMNLP 2019 workshop (TextGraphs-13) paper ["Do Sentence Interactions Matter? Leveraging Sentence Level Representations for Fake News Classification"](https://www.aclweb.org/anthology/D19-5316).

Make sure the following files are present, as per the directory structure below, before running the code:

```
fake_news_semantics
│   README.md
│   *.py
│
└───data
    │   balancedtest.csv
    │   fulltrain.csv
    │   test.xlsx
```

`balancedtest.csv` and `fulltrain.csv` can be obtained from [this Google Drive link](https://drive.google.com/file/d/1njY42YQD5Mzsx2MKkI_DdtCk5OUKgaqq/view?usp=sharing).

`test.xlsx` is the SLN dataset referred to in the paper. You can obtain it from http://victoriarubin.fims.uwo.ca/news-verification/data-to-go/. Contact me if you have trouble finding these datasets.
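
As a quick sanity check that the data files are in place and readable before kicking off training, you can peek at them with pandas. This is only a sketch: it assumes the CSVs ship without a header row, which you should verify against the data loading code.

```
python -c "import pandas as pd; print(pd.read_csv('data/fulltrain.csv', header=None).shape)"
python -c "import pandas as pd; print(pd.read_excel('data/test.xlsx').shape)"
```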

Dependencies:

- PyTorch 1.0.0
- pandas
- tqdm
- xlrd (`pip install xlrd`)
- pytorch-pretrained-bert (`pip install pytorch-pretrained-bert`)
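
If you are starting from a clean environment, the dependencies can be installed in one go. The exact `torch` pin below is an assumption based on the version listed above and may need adjusting for your Python/CUDA setup:

```
pip install torch==1.0.0 pandas tqdm xlrd pytorch-pretrained-bert
```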

To train an LSTM model, run the following command (`--mode 0` runs training; `--mode 1`, used in the evaluation commands further below, runs testing):

```
python main.py --batch_size 1024 --config lstm --encoder 0 --mode 0
```

To train a CNN model, run the following command:

```
python main.py --batch_size 1024 --config cnn --encoder 1 --mode 0
```

To train a BERT model, run the following command:

```
python bert_classifier.py --batch_size 4 --max_epochs 10 --max_seq_length 500 --max_sent_length 70 --mode 0
```

To train a GCN-based model, run the following command:

```
python main.py --batch_size 32 --max_epochs 10 --config gcn --max_sent_len 50 --encoder 2 --mode 0
```

To train a GCN-based model with attention, run the following command:

```
python main.py --batch_size 32 --max_epochs 10 --config gcn_attn --max_sent_len 50 --encoder 3 --mode 0
```

To train a GATConv-based model, run the following command:

```
python main.py --batch_size 32 --max_epochs 10 --config gat --max_sent_len 50 --encoder 4 --mode 0
```

To test the accuracy of a trained model on the out-of-domain test set, run the corresponding command below.

For the LSTM model:

```
python main.py --batch_size 1024 --encoder 0 --model_file model_lstm.t7 --mode 1
```

For the CNN model:

```
python main.py --batch_size 1024 --encoder 1 --model_file model_cnn.t7 --mode 1
```

For the BERT model:

```
python bert_classifier.py --batch_size 4 --model_file model_bert.t7 --max_seq_length 500 --max_sent_length 70 --mode 1
```

For the GCN model:

```
python main.py --batch_size 32 --max_sent_len 50 --encoder 2 --model_file model_gcn.t7 --mode 1
```

For the GCN model with attention:

```
python main.py --batch_size 32 --max_sent_len 50 --encoder 3 --model_file model_gcn_attn.t7 --mode 1
```

For the GATConv model:

```
python main.py --batch_size 32 --max_sent_len 50 --encoder 4 --model_file model_gat_attn.t7 --mode 1
```
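
Note that the training commands do not specify `--model_file`, while the evaluation commands do. A full train-then-evaluate run for, say, the GCN model would therefore look like the sketch below, which assumes training saves its checkpoint to the `model_gcn.t7` path that the test command expects (verify against the defaults in `main.py`):

```
python main.py --batch_size 32 --max_epochs 10 --config gcn --max_sent_len 50 --encoder 2 --mode 0
python main.py --batch_size 32 --max_sent_len 50 --encoder 2 --model_file model_gcn.t7 --mode 1
```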

## Baseline Results

**Out-of-domain test set accuracy**

| Model | Acc | Prec | Recall | F1 |
|---|---|---|---|---|
| CNN | 67.5 | 67.5 | 67.5 | 67.4 |
| LSTM | 81.4 | 82.2 | 81.4 | 81.3 |
| BERT | 78.1 | 78.1 | 78.1 | 78.0 |
| LSTM + GCN + Max Pool | 85.0 | 85.9 | 85.0 | 85.1 |
| LSTM + GCN + Max Pool + Semantic Adj | 86.4 | 86.4 | 86.3 | 86.4 |
| LSTM + GCN + Self Attn | 86.6 | 87.1 | 86.9 | 86.9 |
| LSTM + GCN + Self Attn + Semantic Adj | 87.8 | 87.8 | 87.8 | 87.8 |
| LSTM + GAT | 86.1 | 86.2 | 86.1 | 86.1 |
| LSTM + GAT + Semantic Adj | 87.5 | 87.5 | 87.5 | 87.4 |
| LSTM + GAT + 2 Attn Heads | 88.6 | 89.1 | 88.9 | 88.9 |
| LSTM + GAT + 2 Attn Heads + Semantic Adj | 84.7 | 85.2 | 84.7 | 84.6 |
| SoTA | - | 88.0 | 82.0 | - |

Results with a dev/test split based on news sources (this may be a more realistic split):

**Two classes: Satire / Trusted**

**In-domain dev set accuracy**

| Model | Acc | Prec | Recall | F1 |
|---|---|---|---|---|
| CNN | 96.82 | 96.84 | 96.62 | 96.73 |
| LSTM | 95.65 | 95.64 | 95.41 | 95.52 |
| BERT | 91.72 | 92.74 | 90.56 | 91.31 |
| LSTM + GCN + Max Pool | 98.08 | 98.12 | 97.89 | 98.02 |
| LSTM + GCN + Max Pool + Semantic Adj | 96.77 | 97.57 | 97.85 | 97.7 |
| LSTM + GCN + Attn | 98.27 | 98.05 | 98.42 | 98.22 |
| LSTM + GCN + Attn + Semantic Adj | 98.17 | 98.15 | 98.06 | 98.11 |
| LSTM + GAT | 98.36 | 98.44 | 98.12 | 98.29 |
| LSTM + GAT + Semantic Adj | 98.25 | 98.29 | 98.09 | 98.19 |
| LSTM + GAT + 2 Attn Heads | 98.44 | 98.44 | 98.34 | 98.39 |
| LSTM + GAT + 2 Attn Heads + Semantic Adj | 98.02 | 98.01 | 97.9 | 97.95 |

**Out-of-domain test set 1 accuracy**

| Model | Acc | Prec | Recall | F1 |
|---|---|---|---|---|
| CNN | 67.5 | 67.79 | 67.5 | 67.37 |
| LSTM | 81.11 | 82.12 | 81.11 | 80.96 |
| BERT | 75.83 | 76.62 | 75.83 | 75.65 |
| LSTM + GCN + Max Pool | 85.83 | 86.16 | 85.83 | 85.8 |
| LSTM + GCN + Max Pool + Semantic Adj | 83.89 | 84.73 | 83.89 | 83.79 |
| LSTM + GCN + Attn | 85.27 | 85.59 | 85.27 | 85.24 |
| LSTM + GCN + Attn + Semantic Adj | 85.56 | 85.57 | 85.56 | 85.55 |
| LSTM + GAT | 86.39 | 86.44 | 86.38 | 86.38 |
| LSTM + GAT + Semantic Adj | 85.27 | 85.31 | 85.27 | 85.27 |
| LSTM + GAT + 2 Attn Heads | 84.72 | 85.65 | 84.72 | 84.62 |
| LSTM + GAT + 2 Attn Heads + Semantic Adj | 86.94 | 87.04 | 86.94 | 86.94 |
| SoTA | - | 88.0 | 82.0 | - |

**Out-of-domain test set 2 accuracy**

| Model | Acc | Prec | Recall | F1 |
|---|---|---|---|---|
| CNN | 91.13 | 91.28 | 91.13 | 91.12 |
| LSTM | 91.53 | 91.54 | 91.53 | 91.53 |
| BERT | 83.46 | 83.56 | 83.46 | 83.45 |
| LSTM + GCN + Max Pool | 92.6 | 92.61 | 92.59 | 92.59 |
| LSTM + GCN + Max Pool + Semantic Adj | 89.73 | 90.57 | 89.73 | 89.68 |
| LSTM + GCN + Self Attn | 91.26 | 91.99 | 91.26 | 91.22 |
| LSTM + GCN + Self Attn + Semantic Adj | 92.4 | 92.53 | 92.39 | 92.39 |
| LSTM + GAT | 94.2 | 94.21 | 94.2 | 94.19 |
| LSTM + GAT + Semantic Adj | 92.6 | 92.69 | 92.59 | 92.59 |
| LSTM + GAT + 2 Attn Heads | 89.66 | 90.37 | 89.67 | 89.62 |
| LSTM + GAT + 2 Attn Heads + Semantic Adj | 92.86 | 93.06 | 92.87 | 92.86 |

**Four classes: Satire, Hoax, Propaganda, and Trusted**

**In-domain dev set accuracy**

| Model | Acc | Prec | Recall | F1 |
|---|---|---|---|---|
| CNN | 96.48 | 96.41 | 96.18 | 96.28 / 96.48 |
| LSTM | 88.75 | 88.67 | 88.11 | 88.35 / 88.75 |
| BERT | 95.07 | 94.81 | 94.57 | 94.68 / 95.07 |
| LSTM + GCN + Max Pool | 96.76 | 96.61 | 96.58 | 96.59 / 96.76 |
| LSTM + GCN + Max Pool + Semantic Adj |  |  |  |  |
| LSTM + GCN + Attn | 97.57 | 97.25 | 97.63 | 97.43 / 97.57 |
| LSTM + GCN + Attn + Semantic Adj |  |  |  |  |
| LSTM + GAT | 97.73 | 97.9 | 97.36 | 97.62 / 97.28 |
| LSTM + GAT + Semantic Adj |  |  |  |  |
| LSTM + GAT + 2 Attn Heads | 97.8 | 97.69 | 97.74 | 97.71 / 97.82 |
| LSTM + GAT + 2 Attn Heads + Semantic Adj |  |  |  |  |
| SoTA | - | - | - | 91.0 |

**Out-of-domain test set 2 accuracy**

| Model | Acc | Prec | Recall | F1 |
|---|---|---|---|---|
| CNN | 54.03 | 54.5 | 54.03 | 52.6 / 54.03 |
| LSTM | 55.06 | 58.88 | 55.06 | 52.5 / 55.05 |
| BERT | 55.56 | 57.45 | 54.86 | 54.0 / 54.87 |
| LSTM + GCN + Max Pool | 65.0 | 66.75 | 64.84 | 63.79 / 65.0 |
| LSTM + GCN + Max Pool + Semantic Adj |  |  |  |  |
| LSTM + GCN + Attn | 67.08 | 68.6 | 67.0 | 66.42 / 67.08 |
| LSTM + GCN + Attn + Semantic Adj |  |  |  |  |
| LSTM + GAT | 65.5 | 69.45 | 65.33 | 63.83 / 65.51 |
| LSTM + GAT + Semantic Adj |  |  |  |  |
| LSTM + GAT + 2 Attn Heads | 66.94 | 68.05 | 66.86 | 66.37 / 66.95 |
| LSTM + GAT + 2 Attn Heads + Semantic Adj |  |  |  |  |
| SoTA | - | - | - | 65.0 |

For more structured results, refer to the tables in the paper. The following results are for document classification when the model is applied to domains other than fake news.

## Document classification

**AG News (4 news categories)**

| Model | Acc | Test Error Rate |
|---|---|---|
| GAT | 89.61 | 10.39 |
| GAT + 2 Attn Heads | 89.72 | 10.28 |
| SoTA | - | 5.01 |

**IMDB (2 sentiment categories)**

| Model | Acc | Test Error Rate |
|---|---|---|
| GAT |  |  |
| GAT + 2 Attn Heads |  |  |
| SoTA | - | 4.6 |

**DBPedia (14 ontology categories)**

| Model | Acc | Test Error Rate |
|---|---|---|
| GAT | 99.13 |  |
| GAT + 2 Attn Heads |  |  |
| SoTA | - | 0.80 |

## Bibtex

If you find this work or code useful in your research, please consider citing us:

```
@inproceedings{vaibhav-etal-2019-sentence,
    title = "Do Sentence Interactions Matter? Leveraging Sentence Level Representations for Fake News Classification",
    author = "Vaibhav, Vaibhav  and
      Mandyam, Raghuram  and
      Hovy, Eduard",
    booktitle = "Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)",
    month = nov,
    year = "2019",
    address = "Hong Kong",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-5316",
    doi = "10.18653/v1/D19-5316",
    pages = "134--139",
    abstract = "The rising growth of fake news and misleading information through online media outlets demands an automatic method for detecting such news articles. Of the few limited works which differentiate between trusted vs other types of news article (satire, propaganda, hoax), none of them model sentence interactions within a document. We observe an interesting pattern in the way sentences interact with each other across different kind of news articles. To capture this kind of information for long news articles, we propose a graph neural network-based model which does away with the need of feature engineering for fine grained fake news classification. Through experiments, we show that our proposed method beats strong neural baselines and achieves state-of-the-art accuracy on existing datasets. Moreover, we establish the generalizability of our model by evaluating its performance in out-of-domain scenarios. Code is available at https://github.com/MysteryVaibhav/fake{\textbackslash}{\_}news{\textbackslash}{\_}semantics.",
}
```