PyTorch Implementation of NER with pretrained Bert

You probably already know BERT. In the paper, the authors report that the pretrained models perform remarkably well on NER. This is all the more impressive considering that they use no structured prediction layer such as a CRF. This repo tries to reproduce that result in a simple manner.
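To make the "no CRF" point concrete: tagging reduces to an independent per-token classifier on top of the encoder's hidden states. The sketch below is illustrative only; `TokenTagger` is a hypothetical name, and an `nn.Embedding` stands in for the pretrained BERT encoder so the snippet runs standalone.

```python
import torch
import torch.nn as nn

class TokenTagger(nn.Module):
    """Per-token classifier over encoder hidden states -- no CRF,
    mirroring the setup described in the BERT paper. Illustrative sketch."""
    def __init__(self, encoder: nn.Module, hidden_size: int, num_tags: int):
        super().__init__()
        self.encoder = encoder                      # e.g. a pretrained BERT model
        self.classifier = nn.Linear(hidden_size, num_tags)

    def forward(self, token_ids):
        hidden = self.encoder(token_ids)            # (batch, seq_len, hidden_size)
        return self.classifier(hidden)              # (batch, seq_len, num_tags)

# Toy stand-in for BERT so the sketch is self-contained.
vocab_size, hidden_size, num_tags = 100, 32, 9      # CoNLL-2003 uses 9 IOB tags incl. O
encoder = nn.Embedding(vocab_size, hidden_size)
model = TokenTagger(encoder, hidden_size, num_tags)
logits = model(torch.randint(0, vocab_size, (2, 5)))
print(logits.shape)                                 # torch.Size([2, 5, 9])
```

Each token gets its own tag distribution; no tag depends on its neighbors at prediction time.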

Requirements

Training & Evaluating

First, download the CoNLL 2003 dataset:

bash download.sh

The data should be extracted to the conll2003/ folder automatically.
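The standard CoNLL 2003 format puts one token per line with the NER tag in the last column and blank lines between sentences. A minimal reader might look like the sketch below; `read_conll` is a hypothetical helper, and the exact column layout of the downloaded files is an assumption.

```python
def read_conll(lines):
    """Parse CoNLL-2003-style lines ('token POS chunk NER' per line,
    blank lines between sentences) into (words, tags) pairs. Sketch only."""
    sents, words, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            if words:
                sents.append((words, tags))
                words, tags = [], []
            continue
        cols = line.split()
        words.append(cols[0])       # surface token
        tags.append(cols[-1])       # NER tag is the last column
    if words:
        sents.append((words, tags))
    return sents

sample = """EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O
. . O O
""".splitlines()
words, tags = read_conll(sample)[0]
print(tags)   # ['B-ORG', 'O', 'B-MISC', 'O', 'O']
```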

Then run either the feature-based or the fine-tuning configuration:

python train.py --logdir checkpoints/feature --batch_size 128 --top_rnns --lr 1e-4 --n_epochs 30
python train.py --logdir checkpoints/finetuning --finetuning --batch_size 32 --lr 5e-5 --n_epochs 3
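Judging from the checkpoint names and the --finetuning flag, the difference between the two runs is presumably whether BERT's weights are updated or kept frozen as a feature extractor (with the optional top RNNs from --top_rnns trained on top). A minimal sketch of that toggle, with `set_trainable` as a hypothetical helper rather than code from train.py:

```python
import torch.nn as nn

def set_trainable(encoder: nn.Module, finetune: bool):
    """Feature-based run: freeze the encoder so only the classification head
    (and any top RNN) is trained. Fine-tuning run: update everything.
    Hypothetical helper, not part of this repo's train.py."""
    for p in encoder.parameters():
        p.requires_grad = finetune

# Toy encoder standing in for BERT.
enc = nn.Linear(8, 8)
set_trainable(enc, finetune=False)   # feature-based: gradients stop at the encoder
print(any(p.requires_grad for p in enc.parameters()))   # False
```

Freezing also lets the feature-based run use a much larger batch size and learning rate, consistent with the commands above.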

Results in the paper

<img src="bert_ner.png"> <img src="bert_ner_finetuning.png">

Results

| epoch | feature-based | fine-tuning |
| ----- | ------------- | ----------- |
| 1     | 0.2           | 0.95        |
| 2     | 0.75          | 0.95        |
| 3     | 0.84          | 0.96        |
| 4     | 0.88          |             |
| 5     | 0.89          |             |
| 6     | 0.90          |             |
| 7     | 0.90          |             |
| 8     | 0.91          |             |
| 9     | 0.91          |             |
| 10    | 0.92          |             |
| 11    | 0.92          |             |
| 12    | 0.93          |             |
| 13    | 0.93          |             |
| 14    | 0.93          |             |
| 15    | 0.93          |             |
| 16    | 0.92          |             |
| 17    | 0.93          |             |
| 18    | 0.93          |             |
| 19    | 0.93          |             |
| 20    | 0.93          |             |
| 21    | 0.94          |             |
| 22    | 0.94          |             |
| 23    | 0.93          |             |
| 24    | 0.93          |             |
| 25    | 0.93          |             |
| 26    | 0.93          |             |
| 27    | 0.93          |             |
| 28    | 0.93          |             |
| 29    | 0.94          |             |
| 30    | 0.93          |             |