MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning
This repository provides the tools and scripts for demonstration distillation for efficient and effective in-context learning. Our work builds upon the MetaICL codebase.
Dependencies
- For data preprocessing, ensure you have datasets==1.4.0 installed. Note that this version is not compatible with the Transformers version used for training and inference.
- We recommend setting up two separate environments: one for data preprocessing and another for model training/inference.
Data Preprocessing
Pretrain C4 dataset
We use the "en" subset of the validation split of the C4 dataset. You can also find our preprocessed data on Huggingface datasets.
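For reference, here is a minimal loading sketch using Hugging Face datasets. It assumes the allenai/c4 mirror on the Hub and a recent datasets release; the exact call may differ under the datasets==1.4.0 version pinned above.

```python
# Sketch: stream the "en" validation split of C4 from the Hugging Face Hub.
# Assumes the allenai/c4 mirror and a recent `datasets` release; the call may
# differ under the datasets==1.4.0 version pinned in the Dependencies section.
from datasets import load_dataset

c4_val = load_dataset("allenai/c4", "en", split="validation", streaming=True)

for i, example in enumerate(c4_val):
    print(example["text"][:200])  # each example carries a "text" field
    if i >= 2:
        break
```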
Meta-train and Meta-test dataset
For details on downloading and preprocessing, please refer to the MetaICL documentation. You can also find our preprocessed data on Huggingface datasets.
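If you prefer to pull the preprocessed data directly from the Hub, a loading sketch follows. The dataset ID and split name below are placeholders; substitute the actual path and splits from the link above.

```python
# Sketch: load preprocessed meta-train/meta-test data from the Hugging Face Hub.
# "USER/mend-metaicl" and "train" are placeholders; replace them with the actual
# dataset path and split names published on the Hub.
from datasets import load_dataset

meta_train = load_dataset("USER/mend-metaicl", split="train")
print(meta_train[0])
```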
Model Checkpoint
The model checkpoint is available on Google Drive.
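One way to fetch it programmatically is with gdown; the file ID and output filename below are placeholders, so copy the real ID from the Drive link.

```python
# Sketch: download the checkpoint from Google Drive with gdown.
# "FILE_ID" and the output filename are placeholders; take the real ID from
# the Drive link above.
import gdown

url = "https://drive.google.com/uc?id=FILE_ID"
gdown.download(url, output="mend_checkpoint.pt", quiet=False)
```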
Data Distillation Training
Inside the src directory, you will find:
- dataset_distill.py - Houses both the pretrain C4 dataset class and the meta-train/meta-test dataset class.
- model_distill.py - Manages the interaction between the large language model and the context distillation model (see the conceptual sketch after this list).
- SmallModel.py - Contains the implementation of the context distillation model.
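To give a rough idea of how these pieces fit together, below is a heavily simplified conceptual sketch, not the repository's actual implementation: a small distillation model compresses the demonstrations into a handful of vectors, which are prepended to the frozen large model's input embeddings in place of the full demonstrations. The model choices, number of distilled vectors, and example prompts are illustrative assumptions; see model_distill.py and SmallModel.py for the real code.

```python
# Conceptual sketch only -- not the actual MEND implementation.
# A small model compresses demonstration tokens into a few "distilled" vectors,
# which are prepended to the frozen LLM's input embeddings in place of the
# full demonstrations. Model names, sizes, and prompts are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
llm = AutoModelForCausalLM.from_pretrained("gpt2")               # frozen large model
distiller = AutoModelForCausalLM.from_pretrained("distilgpt2")   # small distillation model
num_distill_tokens = 8  # how many vectors the demonstrations are compressed into

demos = ("Review: great movie. Sentiment: positive\n"
         "Review: boring plot. Sentiment: negative\n")
query = "Review: loved every minute. Sentiment:"

with torch.no_grad():
    demo_ids = tokenizer(demos, return_tensors="pt").input_ids
    # Run the small model and keep the last few hidden states as distilled vectors.
    demo_hidden = distiller(demo_ids, output_hidden_states=True).hidden_states[-1]
    distilled = demo_hidden[:, -num_distill_tokens:, :]  # (1, k, hidden)

    query_ids = tokenizer(query, return_tensors="pt").input_ids
    query_embeds = llm.get_input_embeddings()(query_ids)
    # Prepend the distilled vectors to the query embeddings and let the LLM predict.
    inputs_embeds = torch.cat([distilled, query_embeds], dim=1)
    logits = llm(inputs_embeds=inputs_embeds).logits
    print(tokenizer.decode(logits[0, -1].argmax().item()))
```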
Pre-training:
cd scripts
sh c4_pretrain.sh
Fine-tuning:
cd scripts
sh finetune.sh
License
MetaICL is CC-BY-NC 4.0 licensed.
Citation
If you use this code for your research, please cite our paper:
@inproceedings{
li2024mend,
title={{MEND}: Meta Demonstration Distillation for Efficient and Effective In-Context Learning},
author={Yichuan Li and Xiyao Ma and Sixing Lu and Kyumin Lee and Xiaohu Liu and Chenlei Guo},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=2Y5kBPtU0o}
}