Distilled BERT
This work aims at knowledge distillation from Google's BERT model to compact convolutional models. (Work in progress.)
Requirements
Python > 3.6, fire, tqdm, tensorboardX, and TensorFlow (only for loading the pre-trained checkpoint files)
Example Usage
Fine-tuning (MRPC) Classifier with Pre-trained Transformer
Download the BERT-Base, Uncased pre-trained model and the GLUE Benchmark datasets before fine-tuning.
- Make sure that "total_steps" in train.json is greater than n_epochs * (num_data / batch_size); see the quick calculation below.
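As a quick sanity check, you can compute the lower bound directly. The numbers below are illustrative only: MRPC has roughly 3,668 training pairs, and the epoch count and batch size are assumptions, not values taken from this repository's configs.

```python
# Illustrative check that total_steps covers the full training schedule.
n_epochs = 3          # hypothetical number of fine-tuning epochs
num_data = 3668       # approximate size of the MRPC training set
batch_size = 32       # hypothetical batch size from train.json

min_total_steps = n_epochs * (num_data / batch_size)
print(f"total_steps must be at least {min_total_steps:.0f}")  # ~344 for these values
```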
Modify the relevant config JSON files as needed, then run the following commands to train and evaluate.
python finetune.py config/finetune/mrpc/train.json
python finetune.py config/finetune/mrpc/eval.json
Training Blend CNN from scratch
See Transformer to CNN for the model description. Modify the relevant config JSON files as needed, then run the following commands to train and evaluate.
python classify.py config/blendcnn/mrpc/train.json
python classify.py config/blendcnn/mrpc/eval.json
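For reference, below is a minimal sketch of a Blend-CNN-style text classifier in the spirit of the Transformer to CNN paper: a stack of 1-D convolutions whose per-layer pooled features are "blended" into the final prediction. The layer sizes, module names, and hyperparameters are assumptions for illustration and are not the actual model defined in classify.py.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlendCNN(nn.Module):
    """Illustrative Blend-CNN-style classifier (hypothetical sizes, not this repo's model)."""
    def __init__(self, vocab_size, embed_dim=128, hidden=128, n_layers=8, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Stack of 1-D convolutions over the token dimension.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim if i == 0 else hidden, hidden, kernel_size=3, padding=1)
             for i in range(n_layers)]
        )
        # Each layer's globally pooled features feed the classifier ("blending").
        self.classifier = nn.Linear(n_layers * hidden, n_classes)

    def forward(self, token_ids):
        h = self.embed(token_ids).transpose(1, 2)   # (batch, embed_dim, seq_len)
        pooled = []
        for conv in self.convs:
            h = F.relu(conv(h))
            pooled.append(h.max(dim=2).values)      # global max-pool per layer
        return self.classifier(torch.cat(pooled, dim=1))

# Example: score a batch of two 64-token sequences.
model = BlendCNN(vocab_size=30522)                  # 30522 = BERT-Base uncased vocab size
logits = model(torch.randint(0, 30522, (2, 64)))
print(logits.shape)                                 # torch.Size([2, 2])
```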
Knowledge Distillation from Fine-tuned Transformer to CNN
Modify the relevant config JSON files as needed, then run the following commands to train and evaluate.
python distill.py config/distill/mrpc/train.json
python distill.py config/distill/mrpc/eval.json
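Conceptually, the distillation step trains the CNN student to match the fine-tuned Transformer's softened output distribution in addition to the hard labels. Below is a minimal sketch of such a soft-target loss in the style of Hinton et al. (2015); the temperature, weighting, and function name are assumptions for illustration and may differ from what distill.py actually uses.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL loss (teacher -> student) with ordinary cross-entropy.

    T and alpha are hypothetical hyperparameters, not values from this repo's configs.
    """
    # KL divergence between the softened teacher and student distributions,
    # scaled by T^2 as in Hinton et al. (2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example with random tensors standing in for model outputs on a batch of 4 MRPC pairs.
student = torch.randn(4, 2)
teacher = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])
print(distillation_loss(student, teacher, labels))
```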