Home

Awesome

BOND

This repo contains our code and pre-processed distantly/weakly labeled data for paper BOND: BERT-Assisted Open-Domain Name Entity Recognition with Distant Supervision (KDD2020)

BOND

BOND-Framework

Benchmark

The reuslts (entity-level F1 score) are summarized as follows:

MethodCoNLL03TweetOntoNote5.0WebpageWikigold
Full Supervision91.2152.1986.2072.3986.43
Previous SOTA76.0026.1067.6951.3947.54
BOND81.4848.0168.3565.7460.07

Data

We release five open-domain distantly/weakly labeled NER datasets here: dataset. For gazetteers information and distant label generation code, please directly email cliang73@gatech.edu.

Environment

Python 3.7, Pytorch 1.3, Hugging Face Transformers v2.3.0.

Training & Evaluation

We provides the training scripts for all five open-domain distantly/weakly labeled NER datasets in scripts. E.g., for BOND training and evaluation on CoNLL03

cd BOND
./scripts/conll_self_training.sh

For Stage I training and evaluation on CoNLL03

cd BOND
./scripts/conll_baseline.sh

The test reuslts (entity-level F1 score) are summarized as follows:

MethodCoNLL03TweetOntoNote5.0WebpageWikigold
Stage I75.6146.6168.1159.1152.15
BOND81.4848.0168.3565.7460.07

Citation

Please cite the following paper if you are using our datasets/tool. Thanks!

@inproceedings{liang2020bond,
  title={BOND: Bert-Assisted Open-Domain Named Entity Recognition with Distant Supervision},
  author={Liang, Chen and Yu, Yue and Jiang, Haoming and Er, Siawpeng and Wang, Ruijia and Zhao, Tuo and Zhang, Chao},
  booktitle={ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year={2020}
}