# keras-bert-ner
Named entity recognition built on top of BERT and keras-bert.
## Dependencies

- bert (added as a submodule to this project; its FullTokenizer is used instead of the keras-bert tokenizer)
- keras-bert (https://pypi.org/project/keras-bert/)
- a pretrained BERT model, e.g. from:
- input data, e.g. from:

Input data is expected to be in CoNLL-like format where Token and Tag are tab-separated: the first string on each line is the Token and the second is the Tag.
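As a minimal sketch of this format, the following reads two-column Token/Tag lines into sentences, treating blank lines as sentence separators. The function name and details are illustrative and not part of this project's code:

```python
def read_conll(lines):
    """Parse tab-separated Token/Tag lines into a list of sentences,
    each a list of (token, tag) pairs. Blank lines separate sentences."""
    sentences, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:                  # blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.split("\t")
        current.append((token, tag))
    if current:                       # flush a trailing sentence
        sentences.append(current)
    return sentences

# Two sentences in the expected format (example data, not from the corpora):
example = "Helsinki\tB-LOC\non\tO\nkaupunki\tO\n\nHyvä\tO\n"
print(read_conll(example.splitlines()))
```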
## Quickstart
Get the submodules:

    git submodule init
    git submodule update
Get pretrained models and data:

    ./scripts/get-models.sh
    ./scripts/get-finer.sh
    ./scripts/get-turku-ner.sh
Run an experiment on the Turku NER corpus (run-turku-ner.sh trains a model, predict-turku-ner.sh outputs predictions):

    ./scripts/run-turku-ner.sh
    ./scripts/predict-turku-ner.sh
    python compare.py data/turku-ner/test.tsv turku-ner-predictions.tsv
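To illustrate the kind of comparison the last step performs, here is a hedged sketch of token-level tag accuracy between a gold and a predicted TSV file; the actual compare.py may compute different or additional metrics:

```python
def tag_accuracy(gold_lines, pred_lines):
    """Return the fraction of tokens whose predicted tag matches the
    gold tag, assuming aligned two-column Token<TAB>Tag lines."""
    correct = total = 0
    for g, p in zip(gold_lines, pred_lines):
        g, p = g.rstrip("\n"), p.rstrip("\n")
        if not g or not p:            # skip blank sentence-separator lines
            continue
        g_tag = g.split("\t")[1]
        p_tag = p.split("\t")[1]
        total += 1
        correct += (g_tag == p_tag)
    return correct / total if total else 0.0

# Example data (not from the corpora): one of three tags differs.
gold = ["Turku\tB-LOC", "on\tO", "", "hieno\tO"]
pred = ["Turku\tB-LOC", "on\tB-ORG", "", "hieno\tO"]
print(tag_accuracy(gold, pred))  # 2 of 3 tokens match
```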
Run an experiment on FiNER news data:

    ./scripts/run-finer-news.sh
    ./scripts/predict-finer-news.sh
    python compare.py data/finer-news/test.tsv finer-news-predictions.tsv
If in a Slurm environment, edit scripts/slurm-run.sh to match your setup and run:

    sbatch scripts/slurm-run.sh scripts/run-finer-news.sh
    sbatch scripts/slurm-run.sh scripts/predict-finer-news.sh
    python compare.py data/finer-news/test.tsv finer-news-predictions.tsv

(The first job must finish before the second is submitted.)