Catalyst.Bert

A barebones (Distil)BERT pipeline for token classification tasks driven by catalyst.

    pip install -e .

See experiment.py for how the train/test data are loaded. At the moment the pipeline assumes two JSON Lines files (train and test) containing ['content', 'tagged_attributes'] columns, where tagged_attributes is a list of substrings of content.
You may need to modify dataset.py to suit your data preprocessing needs. The pipeline assumes that there are two classes of tokens.
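
To make the expected input concrete, here is a minimal sketch of one record in the format described above and of how the substrings could map onto two token classes. The sample record, the helper names `to_jsonl` and `label_tokens`, and the whitespace tokenization are all assumptions made for illustration; the actual logic in dataset.py may differ (for example, it may label BERT wordpieces rather than whitespace tokens).

```python
import json

# Hypothetical record in the assumed format: one JSON object per line.
records = [
    {
        "content": "Red cotton shirt with long sleeves",
        "tagged_attributes": ["cotton", "long sleeves"],
    },
]


def to_jsonl(rows):
    """Serialize rows as JSON Lines text (one JSON object per line)."""
    return "\n".join(json.dumps(row) for row in rows)


def label_tokens(content, tagged_attributes):
    """Return (token, label) pairs; label is 1 if the token overlaps a tagged substring."""
    # Collect the character span of every occurrence of every tagged substring.
    spans = []
    for attr in tagged_attributes:
        if not attr:
            continue
        start = content.find(attr)
        while start != -1:
            spans.append((start, start + len(attr)))
            start = content.find(attr, start + 1)

    labeled = []
    pos = 0
    # Assumption: whitespace tokenization stands in for the real tokenizer.
    for token in content.split():
        begin = content.index(token, pos)
        end = begin + len(token)
        pos = end
        inside = any(begin < s_end and end > s_start for s_start, s_end in spans)
        labeled.append((token, int(inside)))
    return labeled


for line in to_jsonl(records).splitlines():
    row = json.loads(line)
    print(label_tokens(row["content"], row["tagged_attributes"]))
```

Tokens that fall inside any tagged substring get class 1 and all others get class 0, which is one way to read the two-class assumption.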
Start training your model:

    catalyst-dl run -C bert_ner/config.yml

Run the following command to see metrics in TensorBoard:

    CUDA_VISIBLE_DEVICES="" tensorboard --logdir=./logs