Adaptive-Compositional-Modules
Code for the ACL 2022 paper
Yanzhe Zhang, Xuezhi Wang and Diyi Yang: Continual Sequence Generation with Adaptive Compositional Modules
Data
The data processing in this paper follows LAMOL and L2KD. You can download the preprocessed datasets from their repos.
If you want to run your own datasets, please follow their guidelines to prepare the data.
Environment
requirements.txt provides the dependencies. All experiments in this paper were run on a 2080 Ti with PyTorch 1.9.
(Note that the authors themselves observed that the same code may produce different numbers on different devices/library versions; the findings in the paper still hold.)
Note that the folder mytransformers contains version 2.0 of adapter-transformers (aka AdapterHub). We add some necessary functions to support our framework.
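A minimal setup sketch, assuming a fresh virtual environment; the environment name and Python version below are illustrative, and only requirements.txt and PyTorch 1.9 come from this repo:

```bash
# create and activate an isolated environment (name and Python version are arbitrary)
conda create -n acm python=3.8 -y
conda activate acm

# PyTorch 1.9, as used in the paper (pick the build matching your CUDA driver)
pip install torch==1.9.0

# remaining dependencies listed by the repo
pip install -r requirements.txt
```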
Setup
- Create the following two directories wherever you want (you can name them arbitrarily); see the sketch after this list:
  - data directory: where the dataset will be loaded by the model.
  - model directory: where the model dumps its outputs.
- Download the dataset using the links in the prior works' repos.
- Set up the env file.
- Install pyrouge manually; you might find this link useful.
- Set up other necessary customized configs.
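A hedged sketch of the directory and env setup; the paths are placeholders and the key names inside the env file are assumptions, so check the env template shipped with the repo for the actual format:

```bash
# create the two directories anywhere you like (names are arbitrary)
mkdir -p /path/to/data_dir /path/to/model_dir

# write the env file; the key names below are illustrative only,
# the real keys are defined by the repo's env template
cat > env << 'EOF'
data_dir=/path/to/data_dir
model_dir=/path/to/model_dir
EOF
```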
Training and Testing
- Follow the guidelines in the prior works' repos: LAMOL and L2KD.
- We provide examples in LAMOL.sh and LAMOL_myadaptor.sh (see the sketch below).
- We also provide the details of the different hyperparameters in LAMOL.sh and LAMOL_myadaptor.sh.
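A minimal launch sketch, assuming the scripts are run from the repo root after the setup above; all hyperparameters are left at the values baked into the scripts:

```bash
# example runs provided by the repo
bash LAMOL.sh

# example runs for the adapter-based variant (per the _myadaptor naming)
bash LAMOL_myadaptor.sh
```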
Tips from the authors
- We add a lot of args in settings.py and settings_myadaptor.py. Many of them are not used in our paper, but might still appear somewhere in the code (without functioning). For the args we add on top of the original LAMOL implementation, we provide help strings to explain their roles.
- Our code is based on two prior repos: (i) LAMOL and its follow-up work (LAMOL and L2KD), and (ii) adapter-transformers 2.0. Here are some suggestions for understanding our code:
  - For the training and testing logic, which follows the pattern of LAMOL, try reading the LAMOL code first.
  - For how adapter modules are added and used, try reading the source code/framework of AdapterHub first to get a basic understanding of how adapters are added and set as trainable.
Acknowledgement
- We adapt the code of LAMOL and L2KD. Huge thanks to the authors of these open-source prior works!!!
- We adapt the code of AdapterHub (Version 2.0). Huge thanks!!!
(Copied from their acknowledgements as follows:)
- We use the language model offered by transformers, a state-of-the-art natural language processing library by Thomas Wolf et al.
- The implementation of MAS follows MAS-Memory-Aware-Synapses, the Memory Aware Synapses method implementation code by Aljundi R. et al.
- The implementation of GEM follows GradientEpisodicMemory, the Gradient Episodic Memory method implementation code by Lopez-Paz, David et al.
- The implementation of fp16 (fp16.py, fp16util.py) is from Megatron-LM, ongoing research on training transformer language models at scale by NVIDIA.
- Data format conversion refers to decaNLP, the implementation code of The Natural Language Decathlon: Multitask Learning as Question Answering by Bryan McCann et al.
Citation
If you use or find this repo useful, please consider citing the following papers:
@misc{zhang2022continual,
title={Continual Sequence Generation with Adaptive Compositional Modules},
author={Yanzhe Zhang and Xuezhi Wang and Diyi Yang},
year={2022},
eprint={2203.10652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
also
@article{chuang2020lifelong,
title={Lifelong Language Knowledge Distillation},
author={Chuang, Yung-Sung and Su, Shang-Yu and Chen, Yun-Nung},
journal={arXiv preprint arXiv:2010.02123},
year={2020}
}
@inproceedings{sun2019lamol,
title={LAMOL: LAnguage MOdeling for Lifelong Language Learning},
author={Sun, Fan-Keng and Ho, Cheng-Hao and Lee, Hung-Yi},
booktitle={International Conference on Learning Representations},
year={2019}
}
Questions
If you have any questions about our paper and code, please contact Yanzhe Zhang via z_yanzhe AT gatech.edu.