[ACL 2024] A Codebase for Incremental Learning with Large Language Models
Introduction
This is a repository for Incremental Learning with Large Language Models.
- It supports both generative and discriminative models from the transformers library.
- It supports using accelerate for distributed data parallel and model parallel training.
- It supports using wandb for logging.
Supported List
Scenario
- Instance-Incremental Learning
- Class-Incremental Learning
- Task-Incremental Learning
- Continual Instruction Tuning (Coming soon!)
- Continual Knowledge Editing (Coming soon!)
Tasks
- Text Classification
- Intent Classification
- Relation Extraction
- Named Entity Recognition
Methods
More baselines will be released in the future!
General (Text/Intent) Classification
- SEQ (Sequential Finetuning)
- ExperienceReplay
- PEFT (including LoRA and Prompt Tuning)
- LAMOL (ICLR 2020)
- LAMOL_KD (arXiv)
- L2KD (EMNLP 2020)
- AdapterCL (EMNLP 2021)
- PCLL (EMNLP 2022)
- LFPT5 (ICLR 2022)
- ProgPrompt (ICLR 2023)
- SEQ* (ACL 2024)
Named Entity Recognition
- ExtendNER (AAAI 2021)
- SelfTrain (EMNLP 2022)
- CFNER (EMNLP 2022)
- SpanKL (AAAI 2023)
- DLD (SIGIR 2023)
- RDP (CIKM 2023)
- CPFD (EMNLP 2023)
- OCILNER (ACL 2023)
- ICE (ACL 2023 findings)
- IS3 (ACL 2024 findings)
Originally Proposed for Image Classification
<!-- - [ ] [A-GEM (ICLR 2019)](https://arxiv.org/abs/1812.00420) - [ ] [GEM (NIPS 2017)](https://proceedings.neurips.cc/paper/2017/hash/f87522788a2be2d171666752f97ddebb-Abstract.html) -->
Datasets
Instance-Incremental Learning
- Concept-1K (the raw and the preprocessed Concept-1K are included in dataset/concept_1k, dataset/concept_1k_task10, and dataset/concept_1k_task1).
Text Classification
- Topic3datasets (agnews, dbpedia, yahoo)
Intent Classification
- CLINC150
- Banking77
Relation Extraction
- FewRel
- TACRED
Named Entity Recognition
- Few-NERD
- Ontonotes5
- I2B2
Best Practice to Use this Codebase
How to reproduce the performance of SEQ and SEQ*?
The config file of SEQ (i.e., plain sequential fine-tuning) is SEQ_full.yaml (in the config directory), and the config file of SEQ* is SEQ_pre_warm_fix.yaml.
Note that the classifier type (linear or cosine linear) is not specified in the config files because we set it in the script. An example can be found in https://github.com/zzz47zzz/codebase-for-incremental-learning-with-llm/blob/main/reproduce_shell/exp-CIL-sota/SOTA-CIL-Intent-discriminative-banking77_task7.sh.
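As an illustration, here is a minimal sketch of passing the classifier type on the command line. The flag names follow the usage examples below; the experiment names are placeholders, and the CosineLinear value is assumed from the classifier types loaded by utils/classifier.py.

```bash
# sequential fine-tuning with a plain linear classifier
python main_CL.py --exp_prefix seq_linear --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear
# the same run with a cosine linear classifier (value assumed from utils/classifier.py)
python main_CL.py --exp_prefix seq_cosine --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier CosineLinear
```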
Usage
Overview
.
├── main_CL.py # This is the python file to be executed for running all experiments
├── utils # This folder contains all basic files for incremental learning
│ ├── backbone.py # This file loads backbone models from the transformers library
│ ├── buffer.py # This file defines the replay buffer
│ ├── classifier.py # This file loads Linear/CosineLinear classifiers
│ ├── wrapmodel.py # This file wraps the model for using DeepSpeed with accelerate
│ ├── dataformat_preprocess.py # This file preprocesses the raw datasets into continual learning datasets
│ ├── dataloader.py # This file prepares the input for language models
│ ├── dataset.py # This file defines the format of different datasets for continual learning
│ ├── download_backbones.py # This file downloads models in advance to avoid network problems
│ ├── evaluation.py # This file defines the evaluation process for various tasks
│ ├── factory.py # This file loads the various models from the ./models folder
│ ├── logger.py # This file defines the logger
│ ├── metric.py # This file defines the evaluation metric for continual learning
│ ├── optimizer.py # This file defines the optimizer for different models
│ ├── prompt.py # This file defines the prompt used for different tasks
│ ├── probing.py # This file computes the probing performance
│ └── config.py # This file defines general parameters and settings for the experiments
├── config # This folder contains the hyper-parameters for each method on each dataset
├── dataset # This folder contains datasets for continual learning
├── models # This folder contains models for continual learning
└── experiments # This folder contains log data for each run
Quick Start
Step 1: prepare the environment
pip install -r requirement.txt
Step 2: prepare the dataset
Check the support_dataset_list in utils/dataformat_preprocess.py and select the dataset you want for the experiment.
Then, download the raw dataset to the folder dataset/{dataset-name}. For example, download clinc150 to the folder dataset/clinc150. The raw datasets can be downloaded here. We note that the raw data of Concept-1K is in dataset/concept_1k. The preprocessed Concept-1K for 10-step incremental learning is in dataset/concept_1k_task10. The whole Concept-1K is in dataset/concept_1k_task1.
Next, execute preprocess_dataset.sh. It automatically preprocesses the default datasets for reproducing the results ('topic3datasets', 'clinc150', 'banking77', 'fewrel', 'tacred', 'conll2003', 'fewnerd', 'i2b2', 'ontonotes5') and creates new folders dataset/{dataset-for-continual-learning-name} (e.g., banking77_task7). If you do not need to customize the datasets, you can skip to Step 3.
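A minimal sketch of this step, assuming the script is executed from the repository root:

```bash
# preprocess the default datasets listed above
bash preprocess_dataset.sh
```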
To customize the datasets, you can run utils/dataformat_preprocess.py with your own parameters (e.g., random seed, number of tasks). This process creates a new target folder dataset/{dataset-for-continual-learning-name} containing two json files, continual_data.json and continual_config.json. For example, you can prepare the clinc150 and fewrel datasets by running
python utils/dataformat_preprocess.py --dataset clinc150 --seed 1
and
python utils/dataformat_preprocess.py --dataset fewrel --seed 1
The program will create target folders dataset/clinc150_task15 and dataset/fewrel_task8.
For NER datasets such as ontonotes5, you can run the following command:
python utils/dataformat_preprocess.py --dataset ontonotes5 --seed 1 --base_task_entity 8 --incremental_task_entity 2 --seen_all_labels False
The program will create a target folder dataset/ontonotes5_task6_base8_inc2. We note that fixing the random seed ensures that exactly the same datasets are generated on different devices. Finally, the preprocessed datasets clinc150_task15, fewrel_task8, and ontonotes5_task6_base8_inc2 are ready for continual learning!
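As a quick sanity check (a sketch; the file names follow the description above), each target folder should now contain the two json files produced by the preprocessing step:

```bash
# each preprocessed dataset folder should contain continual_data.json and continual_config.json
ls dataset/clinc150_task15
ls dataset/ontonotes5_task6_base8_inc2
```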
Step 3: select the yaml file for hyper-parameters
The yaml file contains the hyper-parameters for each method. For example, the hyper-parameters of SEQ* with and without pre-allocating future classifiers for generative backbones under the CIL setting are defined in config/CIL/generative_backbones/clinc150_task15/SEQ_pre_warm_fix.yaml and config/CIL/generative_backbones/clinc150_task15/SEQ_warm_fix.yaml respectively.
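For example, a sketch of selecting one of these yaml files via the --cfg argument used in Step 4 (the experiment names are placeholders, and additional flags such as --backbone and --classifier may be required, as in the Step 4 examples):

```bash
# SEQ* with pre-allocated future classifiers (CIL, generative backbone)
python main_CL.py --exp_prefix seq_star_pre --cfg './config/CIL/generative_backbones/clinc150_task15/SEQ_pre_warm_fix.yaml'
# SEQ* without pre-allocating future classifiers
python main_CL.py --exp_prefix seq_star --cfg './config/CIL/generative_backbones/clinc150_task15/SEQ_warm_fix.yaml'
```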
Step 4: reproduce the results
The scripts for reproducing the probing study are in the folder reproduce_shell/exp-probing.
The scripts for reproducing the probing study with different pre-training steps are in the folder reproduce_shell/exp-probing-pretraining.
The scripts for reproducing the experiments of comparing SEQ* with SOTA methods are in the folder reproduce_shell/exp-sota.
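For instance, to launch one of the provided scripts from the repository root (the script path is taken from the example linked in the best-practice section above):

```bash
bash reproduce_shell/exp-CIL-sota/SOTA-CIL-Intent-discriminative-banking77_task7.sh
```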
If you want to run an experiment, execute main_CL.py. For example, you can run the SEQ method on the clinc150_task15 dataset with bert-base-cased using the following command:
python main_CL.py --exp_prefix {your-experiment-name} --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear --training_epochs 5
If you want to use wandb for logging (see here for more help):
python main_CL.py --is_wandb True --wandb_project {your-project-name} --wandb_entity {your-entity-name} --exp_prefix {your-experiment-name} --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear --training_epochs 5
If you want to use accelerate for data/model parallel (see here for more help):
accelerate launch --config_file {your-accelerate-config-file} main_CL.py --is_wandb True --wandb_project {your-project-name} --wandb_entity {your-entity-name} --exp_prefix {your-experiment-name} --cfg './config/clinc150_task15/SEQ_full.yaml' --backbone bert-base-cased --classifier Linear --training_epochs 5
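If you do not have an accelerate config file yet, one way to create it is with the standard accelerate CLI (a sketch; the output path is a placeholder):

```bash
# answer the interactive prompts to describe your multi-GPU / DeepSpeed setup,
# then pass the resulting file to accelerate launch as shown above
accelerate config --config_file ./accelerate_config.yaml
```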
Please refer to utils/config.py for more general parameters and models/{model-name}.py for more model-specific parameters.
Main Results
The results on the IIL scenario.
The results on the CIL and TIL scenarios.
Questions and Citation
If you have questions about this repository, please feel free to contact me at junhaozheng47@outlook.com.
If you find this repository useful, please consider citing our paper.
@misc{zheng2023learn,
title={Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models},
author={Junhao Zheng and Shengjie Qiu and Qianli Ma},
year={2023},
eprint={2312.07887},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{qiu2024incremental,
title={Incremental Sequence Labeling: A Tale of Two Shifts},
author={Qiu, Shengjie and Zheng, Junhao and Liu, Zhen and Luo, Yicheng and Ma, Qianli},
journal={arXiv preprint arXiv:2402.10447},
year={2024}
}
@misc{zheng2024concept1k,
title={Concept-1K: A Novel Benchmark for Instance Incremental Learning},
author={Junhao Zheng and Shengjie Qiu and Qianli Ma},
year={2024},
eprint={2402.08526},
archivePrefix={arXiv},
primaryClass={cs.LG}
}