A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm (UDIL)
This repo (built upon the amazing codebase of mammoth) contains the code for our NeurIPS 2023 paper:<br> A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm<br> Haizhou Shi, Hao Wang<br> Thirty-seventh Conference on Neural Information Processing Systems, 2023<br> [Paper] [OpenReview] [Slides] [Talk (Youtube)] [Talk (Bilibili)]
<p align="center"> <img src="fig/udil-overview.png" alt="" data-canonical-src="fig/udil-overview.png" width="100%"/> </p>Outline
- How does UDIL unify existing methods?
- How does UDIL lead to a tighter bound?
- Installing the required packages
- Code for running UDIL
- Quantitative Results
- Qualitative Results
- Related Work
- References
How does UDIL Unify Existing Methods?
Long story short, in the paper we start by restating the learning objective of domain-incremental learning (which also applies to other types of continual learning). We then propose to combine three ways of upper-bounding the past-domain error (ERM, the intra-domain bound, and the cross-domain bound; see Section 3 in the paper) and to assign an adaptive coefficient to each of the resulting training terms.
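Schematically (this is only a sketch of the structure, not the exact statement of the theorem; the precise error definitions, constants, and constraints on the coefficients are given in the paper), the weighted combination over the past domains $i = 1, \dots, t-1$ looks like:

```latex
% Schematic only: the exact error terms, additive constants, and constraints on
% (alpha_i, beta_i, gamma_i) are given in the theorem below.
\sum_{i=1}^{t-1} \Big[
    \alpha_i \underbrace{\widehat{\epsilon}^{\,\mathrm{erm}}_i(h)}_{\text{ERM term (memory)}}
  + \beta_i  \underbrace{\widehat{\epsilon}^{\,\mathrm{intra}}_i(h;\, H_{t-1})}_{\text{intra-domain (distillation) bound}}
  + \gamma_i \underbrace{\widehat{\epsilon}^{\,\mathrm{cross}}_i(h;\, H_{t-1})}_{\text{cross-domain (divergence) bound}}
\Big]
```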
Here is the main theorem of our paper, which not only unifies existing domain-incremental learning methods, but also opens up the possibility of minimizing a tighter bound, as discussed in the next section.
<p align="center"> <img src="fig/thm.png" alt="" data-canonical-src="fig/thm.png" width="80%"/> </p>The first main argument of our work is that, by fixating the value of the coefficients $\Omega={\alpha_i, \beta_i, \gamma_i}$, the UDIL framework can exactly correspond to some of the exisiting methods, when some conditions need to be satisfied. Here we show the final unification result derived for you (refer to Appendix B in the paper).
<p align="center"> <img src="fig/unification.png" alt="" data-canonical-src="fig/unification.png" width="80%"/> </p>How does UDIL Lead to a Tighter Bound?
A natural question following the unification is: can we do better than training a domain-incremental learning model with a single set of fixed coefficients? The answer is a firm YES. What we do in this work is parameterize the coefficients and optimize a tighter bound by adjusting them during training. We know you are in a hurry, so here is an extremely brief overview of how the final training objective is formed.
<p align="center"> <img src="fig/udil-objective.png" alt="" data-canonical-src="fig/udil-objective.png" width="80%"/> </p> As you can see, there are in total four kinds of differentiable loss terms in our proposed algorithm:- 🔵 Cross-Entropy Classification Loss: it corresponds to the simple <span style="color:blue">ERM terms</span> on the current data and the memory.
- 🟢 Cross-Entropy Distillation Loss: it corresponds to the <span style="color:green">distillation loss terms</span> between the current model $h$ and the history model $H_{t-1}$, computed on the current data and the memory.
- 🔴 Adversarial Feature Alignment Loss: it corresponds to the <span style="color:red">divergence terms</span> between the current data distribution and the past data distribution. If you are interested in how minimizing this term in the feature space improves performance in general, please refer to the amazing work "A theory of learning from different domains".
- ⚪ Adaptive Coefficient Optimization: it corresponds to estimating the error (classification accuracy) associated with each term and adaptively adjusting the <span style="color:gray">coefficient set</span> $\Omega=\{\alpha_i, \beta_i, \gamma_i\}$ to minimize the resulting bound.
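Below is a minimal, purely illustrative sketch of how these four components could be combined in one training step. It is NOT the repo's actual implementation: all names (`encoder`, `classifier`, `discriminator`, `history_model`, `coeff_logits`, the two optimizers) and the simplification to a single coefficient triple per batch are assumptions made for brevity.

```python
# Illustrative sketch of a UDIL-style training step (not the repo's actual code).
import torch
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Gradient-reversal layer: identity in the forward pass, negated gradient backward."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


def udil_step(encoder, classifier, discriminator, history_model,
              x_cur, y_cur, x_mem, y_mem,
              coeff_logits, model_opt, coeff_opt):
    # Collapse the per-domain coefficients {alpha_i, beta_i, gamma_i} into three
    # scalars on a simplex for brevity; the paper keeps one triple per past domain.
    alpha, beta, gamma = torch.softmax(coeff_logits, dim=0)

    z_cur, z_mem = encoder(x_cur), encoder(x_mem)

    # (1) ERM terms: cross-entropy on the current data and the memory.
    loss_erm = F.cross_entropy(classifier(z_cur), y_cur) + \
               F.cross_entropy(classifier(z_mem), y_mem)

    # (2) Distillation terms: soft cross-entropy against the frozen history model H_{t-1},
    #     computed on the current data and the memory.
    with torch.no_grad():
        soft_cur = history_model(x_cur).softmax(dim=-1)
        soft_mem = history_model(x_mem).softmax(dim=-1)
    loss_kd = -(soft_cur * classifier(z_cur).log_softmax(dim=-1)).sum(dim=-1).mean() \
              - (soft_mem * classifier(z_mem).log_softmax(dim=-1)).sum(dim=-1).mean()

    # (3) Adversarial feature alignment: the discriminator separates current vs. memory
    #     features, while the gradient-reversal layer trains the encoder to make the
    #     two feature distributions indistinguishable.
    d_cur = discriminator(GradReverse.apply(z_cur))
    d_mem = discriminator(GradReverse.apply(z_mem))
    loss_adv = F.binary_cross_entropy_with_logits(d_cur, torch.ones_like(d_cur)) + \
               F.binary_cross_entropy_with_logits(d_mem, torch.zeros_like(d_mem))

    # (4) Adaptive coefficients: alpha/beta/gamma are differentiable functions of
    #     coeff_logits, so the same backward pass also updates them, shifting weight
    #     toward whichever terms currently give a tighter bound.
    loss = alpha * loss_erm + beta * loss_kd + gamma * loss_adv

    model_opt.zero_grad()
    coeff_opt.zero_grad()
    loss.backward()
    model_opt.step()
    coeff_opt.step()
    return loss.item()
```

In the actual algorithm, each past domain keeps its own coefficient triple and the coefficients are constrained and updated as described in the paper; the sketch above only illustrates how the pieces interact.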
Installing the Required Packages
conda create -n udil python=3.9
conda activate udil
conda install pytorch==1.12 torchvision cudatoolkit=11.3 -c pytorch
conda install wandb ipdb -c conda-forge
Code for Running UDIL
Before you run the code, there are a couple of settings you might want to modify:
- `wandb_entity`: at `utils/args.py`, line 70 — change it to your own wandb account;
- `data_path` and `base_path`: at `utils/conf`, lines 13-23 — change them to whatever paths you want to use for storing your data and local training logs (a purely hypothetical sketch of these edits follows).
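For example (this is a purely hypothetical illustration; the actual variable names and layout in `utils/args.py` and `utils/conf` may differ), the edits could look like:

```python
# Purely hypothetical sketch of the two edits described above; the actual code in
# utils/args.py and utils/conf of this repo may be organized differently.
import argparse

parser = argparse.ArgumentParser()
# utils/args.py (around line 70): use your own wandb account/entity here.
parser.add_argument('--wandb_entity', type=str, default='your-wandb-entity')

# utils/conf (around lines 13-23): where datasets and local training logs are stored.
data_path = '/path/to/your/data'
base_path = '/path/to/your/training/logs'
```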
We have provided the commands to run UDIL on different datasets in the `scripts/` folder.
Once you are done setting everything up, a quick example of running UDIL on Permuted-MNIST is as follows:
chmod +x scripts/*.sh
scripts/pmnist.sh
This script will start a UDIL training process and log everything to your wandb project.
If you are in a hurry and just want a quick look at the training process and final results of UDIL on three different realistic datasets (Permuted-MNIST, Rotated-MNIST, and Seq-CORe50), you can check out the following public UDIL wandb project, where we have visualized everything you might care about!
Quantitative Results
Here we provide some quantitative results of UDIL.
<p align="center"> <img src="fig/table-pmnist.png" alt="" data-canonical-src="fig/table-pmnist.png" width="90%"/> </p> <p align="center"> <img src="fig/table-rmnist.png" alt="" data-canonical-src="fig/table-rmnist.png" width="90%"/> </p> <p align="center"> <img src="fig/table-core50.png" alt="" data-canonical-src="fig/table-core50.png" width="90%"/> </p>Qualitative Results
Here we provide some qualitative results of UDIL, taken from the public UDIL wandb project; we only show results on the Rotated-MNIST data.
<p align="center"> <span>Accuracy Matrix after 20-Domain Training</span> <img src="fig/acc_matrix.png" alt="" data-canonical-src="fig/acc_matrix.png" width="80%"/> </p>Below are the visualization of embedding distributions of different classes & domains, where:
- Left: colors represent different true classes;
- Middle: colors represent different predicted classes by the model;
- Right: colors represent different domains.
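If you want to reproduce this kind of three-panel view from your own saved features, here is a rough sketch (hypothetical: the function name, the choice of t-SNE, and the use of matplotlib are assumptions, not the repo's plotting code):

```python
# Hypothetical sketch: project embeddings to 2-D and color them by true class,
# predicted class, and domain, similar to the three panels described above.
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE


def plot_embeddings(features, true_labels, pred_labels, domain_ids, path="embeddings.png"):
    # features: (N, D) numpy array; the other arguments are length-N integer arrays.
    coords = TSNE(n_components=2, init="pca").fit_transform(features)
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    for ax, colors, title in zip(
        axes,
        (true_labels, pred_labels, domain_ids),
        ("True classes", "Predicted classes", "Domains"),
    ):
        ax.scatter(coords[:, 0], coords[:, 1], c=colors, cmap="tab20", s=3)
        ax.set_title(title)
        ax.set_xticks([])
        ax.set_yticks([])
    fig.savefig(path, bbox_inches="tight")
```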
Also Check Our Relevant Work on Domain Adaptation
<span id="paper_1">[1] Domain-Indexing Variational Bayes: Interpretable Domain Index for Domain Adaptation<br></span> Zihao Xu*, Guang-Yuan Hao*, Hao He, Hao Wang<br> Eleventh International Conference on Learning Representations, 2023<br> [Paper] [OpenReview] [PPT] [Talk (Youtube)] [Talk (Bilibili)]
<span id="paper_2">[2] Graph-Relational Domain Adaptation<br></span> Zihao Xu, Hao He, Guang-He Lee, Yuyang Wang, Hao Wang<br> Tenth International Conference on Learning Representations (ICLR), 2022<br> [Paper] [Code] [Talk] [Slides]
<span id="paper_3">[3] Continuously Indexed Domain Adaptation<br></span> Hao Wang*, Hao He*, Dina Katabi<br> Thirty-Seventh International Conference on Machine Learning (ICML), 2020<br> [Paper] [Code] [Talk] [Blog] [Slides] [Website]
References
A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm
@inproceedings{UDIL,
title={A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm},
author={Shi, Haizhou and Wang, Hao},
booktitle={Advances in Neural Information Processing Systems},
year={2023}
}