Instructions

[THIS REPOSITORY IS UNDER DEVELOPMENT AND MORE DATASETS AND MODELS WILL BE ADDED]

[FEEL FREE TO MAKE A PULL REQUEST FOR A NEW DATASET OR A NEW MODEL]

1. Requirements

Run setup.sh to download the datasets and install all the required packages.

Run prepare_datasets.py notebook to prepare the datasets.

For instructions on running each model, go to the respective model directory.

The models directory holds the results of these experiments.

| Bert 20NG Confusion Matrix | Bert 20NG Sankey Plot |
| --- | --- |
| <img src="https://github.com/yaserkl/BERTvsULMFIT/raw/master/models/bert/20ng/cm_20ng.png" alt="20 Newsgroup Confusion Matrix"> | <img src="https://github.com/yaserkl/BERTvsULMFIT/raw/master/models/bert/20ng/sankey_20ng.png" alt="20 Newsgroup Sankey Plot"> |

2. Results

2.1 BERT

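| Bert (MXNet) | F1-score (%) | Precision (%) | Recall (%) | Accuracy (%) | Error Rate (%) |
| --- | --- | --- | --- | --- | --- |
| 20ng | 91.24 | 91.46 | 91.13 | 91.04 | 8.96 |
| IMDB | 88.59 | 88.61 | 88.62 | 88.6 | 11.4 |
| Reuters 21578 (R8) | 94.38 | 93.62 | 95.64 | 98.12 | 1.88 |
| Reuters 21578 (R52) | 73.80 | 73.48 | 76.01 | 96.35 | 3.65 |
| Ohsumed (all docs) | 70.45 | 73.97 | 68.84 | 79.30 | 20.70 |
| Ohsumed (first 20k docs) | 56.52 | 61.49 | 56.04 | 71.04 | 28.96 |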

2.2 ULMFit

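| ULMFit | F1-score (%) | Precision (%) | Recall (%) | Accuracy (%) | Error Rate (%) |
| --- | --- | --- | --- | --- | --- |
| 20ng | 92.87 | 93.02 | 92.82 | 92.82 | 7.18 |
| IMDB | 91.92 | 91.96 | 91.96 | 91.92 | 8.08 |
| Reuters 21578 (R8) | 94.79 | 94.07 | 96.12 | 98.18 | 1.82 |
| Reuters 21578 (R52) | 73.77 | 75.47 | 75.96 | 96.43 | 3.57 |
| Ohsumed (all docs) | 74.82 | 75.01 | 75.47 | 81.96 | 18.04 |
| Ohsumed (first 20k docs) | 43.76 | 44.46 | 45.49 | 62.5 | 37.5 |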
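
For readers who want to see how the five columns relate, here is a minimal sketch that derives them from a confusion matrix. It is not the evaluation code used in this repository. The function name `classification_metrics` is hypothetical, and macro-averaging over classes is an assumption (it would be consistent with R52 showing a much lower F1-score than accuracy, but the README does not state the averaging method). The error rate is simply 100 minus the accuracy, which matches every row in the tables above.

```python
from typing import Dict, List


def classification_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Return macro-averaged metrics, in percent, from a square confusion matrix.

    confusion[i][j] counts documents with true class i and predicted class j.
    Macro-averaging is an assumption, not something the README confirms.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    precisions, recalls, f1_scores = [], [], []
    for k in range(n_classes):
        true_positives = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n_classes))  # column sum
        actually_k = sum(confusion[k])  # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    accuracy = 100.0 * correct / total
    return {
        "f1": 100.0 * sum(f1_scores) / n_classes,
        "precision": 100.0 * sum(precisions) / n_classes,
        "recall": 100.0 * sum(recalls) / n_classes,
        "accuracy": accuracy,
        "error_rate": 100.0 - accuracy,
    }


if __name__ == "__main__":
    # Toy two-class matrix, made up for illustration only.
    toy_confusion = [[8, 2], [1, 9]]
    for name, value in classification_metrics(toy_confusion).items():
        print(f"{name}: {value:.2f}")
```

Because per-class scores are averaged with equal weight under this assumption, a dataset with many small classes can report high accuracy alongside a much lower F1-score.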