Prompts4Keras

Run the experiments from our paper "NSP-BERT: A Prompt-based Few-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction". The prompt-learning methods (PET, EFL, and NSP-BERT) are implemented with BERT4Keras, for both Chinese and English.

Overview

We developed this repository to better compare NSP-BERT with two other basic prompt-learning methods, based on MLM and NLI respectively, in both Chinese and English, and to make it easy to run experiments on the BERT4Keras framework, in particular by transferring the original English RoBERTa model to BERT4Keras.

Target

Mainly for text classification tasks in zero-shot and few-shot learning scenarios.

Supported Methods

Supported Models

NOTE: The scripts in ./tools/... must be used to convert the PyTorch models into the TensorFlow checkpoints used by this repository.

Environments

Unlike the baselines, this repository uses the BERT4Keras framework throughout, which is built entirely on TensorFlow.

Since the code needs to run on Ampere-architecture GPUs (such as the A100 or RTX 3090), the NVIDIA build of TensorFlow is required.

bert4keras==0.10.8
fairseq==0.10.2
keras==2.6.0
nvidia_tensorflow==1.15.4+nv20.11
scikit_learn==1.0.2
scipy==1.3.1
torch==1.7.0
transformers==4.12.3
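
A possible installation sequence is sketched below. It assumes the NVIDIA pip index (nvidia-pyindex) is used to obtain the nvidia-tensorflow build and that a CUDA driver for Ampere GPUs is already installed; adjust package names and versions to your own environment.

# Sketch only: nvidia-tensorflow is distributed through NVIDIA's pip index, hence nvidia-pyindex first.
pip install nvidia-pyindex
pip install nvidia-tensorflow==1.15.4+nv20.11
pip install bert4keras==0.10.8 keras==2.6.0 fairseq==0.10.2 \
    scikit-learn==1.0.2 scipy==1.3.1 torch==1.7.0 transformers==4.12.3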

Datasets

FewCLUE datasets can be downloaded here

English datasets can be downloaded here; then use the script generate_k_shot_data.py to generate the k-shot splits.

Yahoo! and AGNews also need to be processed with this script; a hedged example invocation is shown below.
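
The flag names below are an assumption (they follow the LM-BFF version of this data-preparation script); check the script's argument parser for the exact options.

# Hypothetical invocation; --k, --data_dir and --output_dir are assumed flag names.
python generate_k_shot_data.py --k 16 --data_dir ./data/original --output_dir ./data/k-shot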

Reproduce experiments

  1. Download the models.
  2. Convert the PyTorch models to TensorFlow.
  3. Use run_experiment.sh or run_nsp_bert.sh (and the other scripts) to reproduce our experiments. For each few-shot learning task, the training and dev sets are split according to 5 random seeds, and the experiments are run separately on each split; a single-split call is shown below.
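
For reference, one split of the English SST-2 NSP-BERT experiment corresponds to a single call of the classification script; the wrapper scripts simply loop this over the 5 splits, as in the Scripts section below.

# One split (n_th_set 1) of the SST-2 few-shot experiment, taken from the loop in the Scripts section.
python ./nsp_bert/nsp_classification.py --method few-shot --n_th_set 1 --device 0 \
  --dataset_name SST-2 --batch_size 8 --learning_rate 2e-5 --loss_function BCE --model_name bert_large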

Models Pre-trained by Ourselves

BERT-Large-Mix5-5M

Link: https://share.weiyun.com/MXroU3g1 (extraction code: 2ebdf4)

https://huggingface.co/sunyilgdx/mixr/tree/main

Scripts

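# NSP-BERT few-shot classification on English SST-2 with BERT-large, run over the 5 random train/dev splits.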
for i in 1 2 3 4 5
do
  python ./nsp_bert/nsp_classification.py \
  --method few-shot \
  --n_th_set $i \
  --device 0 \
  --dataset_name SST-2 \
  --batch_size 8 \
  --learning_rate 2e-5 \
  --loss_function BCE \
  --model_name bert_large
done
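
# The same experiment on the Chinese FewCLUE EPRSTMT task with the Chinese BERT-base model.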
for i in 1 2 3 4 5
do
  python ./nsp_bert/nsp_classification.py \
  --method few-shot \
  --n_th_set $i \
  --device 0 \
  --dataset_name EPRSTMT \
  --batch_size 8 \
  --learning_rate 1e-5 \
  --loss_function BCE \
  --model_name chinese_bert_base
done

Citation

@inproceedings{sun-etal-2022-nsp,
    title = "{NSP}-{BERT}: A Prompt-based Few-Shot Learner through an Original Pre-training Task {---}{---} Next Sentence Prediction",
    author = "Sun, Yi  and
      Zheng, Yu  and
      Hao, Chao  and
      Qiu, Hangping",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics",
    url = "https://aclanthology.org/2022.coling-1.286",
    pages = "3233--3250"
}