Transformer-Patcher

This repo contains the code and data for the ICLR 2023 paper:

Transformer-Patcher: One mistake worth one neuron. Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong.
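In brief, the method keeps the original model frozen and, for each mistake encountered during editing, adds one trainable neuron (a key-value pair) to the last feed-forward layer. The sketch below is a minimal, hypothetical PyTorch illustration of that idea, assuming a two-layer FFN that exposes fc1 and fc2 linear layers; it is not the repo's actual implementation:

import torch
import torch.nn as nn

class PatchedFFN(nn.Module):
    # Wraps a frozen two-layer FFN (assumed to expose fc1 and fc2) and lets
    # its hidden layer grow by one "patch" neuron per corrected mistake.
    def __init__(self, ffn):
        super().__init__()
        self.ffn = ffn
        for p in self.ffn.parameters():
            p.requires_grad = False            # original weights stay frozen
        self.patch_keys = nn.ParameterList()   # extra rows of the first map
        self.patch_bias = nn.ParameterList()
        self.patch_vals = nn.ParameterList()   # extra columns of the second map

    def add_patch(self):
        d_in = self.ffn.fc1.in_features
        d_out = self.ffn.fc2.out_features
        self.patch_keys.append(nn.Parameter(torch.randn(d_in) * 0.02))
        self.patch_bias.append(nn.Parameter(torch.zeros(1)))
        self.patch_vals.append(nn.Parameter(torch.zeros(d_out)))

    def forward(self, x):
        out = self.ffn(x)                      # frozen original computation
        for k, b, v in zip(self.patch_keys, self.patch_bias, self.patch_vals):
            act = torch.relu(x @ k + b)        # activation of the patch neuron
            out = out + act.unsqueeze(-1) * v  # add its contribution
        return out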

Steps for running Transformer-Patcher

Directory Structure

data/ --> data for the FEVER and zsRE tasks
scripts/ --> folder containing main.py to run groups of experiments
src/ --> folder containing classes for loading data and constructing models
requirements.txt --> required packages

Setup

Environment

Our code is based on Python 3.8.10; other versions may work as well.

Create a virtual environment and install the dependencies (conda can help you do this):

$ conda create --name T-Patcher python=3.8.10
$ conda activate T-Patcher
(T-Patcher)$ pip install -r requirements.txt

Data

You can download the supplementary file from this OpenReview page.

The data is contained in the data.zip file. Unzip it, then run the following pre-processing steps:

(T-Patcher)$ python src/dataset/fever_dataloader.py
(T-Patcher)$ python src/dataset/zsre_dataloader.py

Afterwards, the original data is randomly split into a training set, a validation set, an edit set, and a test set.
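For intuition, here is a minimal sketch of such a four-way random split (the split ratios below are placeholders, not the repo's actual values):

import random

def four_way_split(examples, seed=0, ratios=(0.8, 0.1, 0.05)):
    # Illustrative only: shuffle once, then slice into train/val/edit/test.
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    n_edit = int(ratios[2] * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    edit = shuffled[n_train + n_val:n_train + n_val + n_edit]
    test = shuffled[n_train + n_val + n_edit:]   # remainder
    return train, val, edit, test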

Running the code

Training initial models

Before conducting Sequential Model Editing, we first need an initial model.
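Sequential Model Editing then processes a stream of edit examples one by one: whenever the (patched) model is wrong, a new patch neuron is trained to correct it. A minimal illustrative loop, using hypothetical helper functions that are not part of this repo:

def sequential_model_editing(model, edit_stream, is_correct, train_one_patch):
    # `is_correct` checks the model on one example; `train_one_patch` adds
    # and trains a single patch neuron. Both names are illustrative.
    for example in edit_stream:
        if not is_correct(model, example):     # the model makes a mistake
            train_one_patch(model, example)    # one mistake -> one new neuron
    return model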

To fine-tune a BERT-base model on the FEVER dataset, run:

(T-Patcher)$ python scripts/train_bert_fever.py

To fine-tune a BART-base model on the zsRE dataset, run:

(T-Patcher)$ python scripts/train_bart_seq2seq.py

The initial model is then trained and its checkpoint is saved under log/models/bert_binary/version_0/checkpoints or log/models/bart_seq2seq/version_0/checkpoints, respectively.
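If you need the checkpoint path programmatically (e.g., for the --model_path argument below), a small sketch, assuming the PyTorch Lightning convention of .ckpt files:

from pathlib import Path

# The directory layout follows the PyTorch Lightning convention; the exact
# checkpoint filename depends on the run.
ckpt_dir = Path("log/models/bert_binary/version_0/checkpoints")
ckpt_path = next(ckpt_dir.glob("*.ckpt"))
print(ckpt_path)   # pass this path as --model_path in the next step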

Running Transformer-Patcher

Running Transformer-Patcher requires several arguments:

(T-Patcher)$ python scripts/main.py --task=$TASK --edit_folder_num=$EDIT_FOLDER_NUM --process_folders=$PROCESS_FOLDERS --model_path=$MODEL_PATH
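For example (the argument values below are illustrative placeholders; see scripts/main.py for the accepted options):

(T-Patcher)$ python scripts/main.py --task=fever --edit_folder_num=10 --process_folders=0 --model_path=log/models/bert_binary/version_0/checkpoints/your_checkpoint.ckpt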