
A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm (UDIL)

This repo (built upon the amazing codebase of mammoth) contains the code for our NeurIPS 2023 paper:

A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm
Haizhou Shi, Hao Wang
Thirty-seventh Conference on Neural Information Processing Systems, 2023
[Paper] [OpenReview] [Slides] [Talk (Youtube)] [Talk (Bilibili)]

How does UDIL Unify Existing Methods?

In short, the paper starts by restating the learning objective of domain-incremental learning (which also holds for other types of continual learning). We then combine three ways of upper-bounding the past-domain error (ERM, the intra-domain bound, and the cross-domain bound; see Chapter 3 in the paper) and assign an adaptive coefficient to each of the resulting training terms.

Here is the main theorem of our paper, which not only unifies the current domain-incremental learning methods, but also makes it possible to minimize a tighter bound in the next chapter.

The first main argument of our work is that, by fixing the values of the coefficients $\Omega=\{\alpha_i, \beta_i, \gamma_i\}$, the UDIL framework corresponds exactly to some of the existing methods, provided certain conditions are satisfied. Here we show the final unification result (refer to Appendix B in the paper for the derivation).

How does UDIL Lead to a Tighter Bound?

A natural question following the unification is: can we do better than using a single set of fixed coefficients to train a domain-incremental learning model? The answer is a firm yes. In this work we parameterize the coefficients and optimize a tighter bound by adjusting them during model training. We know you are in a hurry, so here is a very brief review of how we form the final training objective.

<img src="fig/udil-objective.png" alt="" data-canonical-src="fig/udil-objective.png" width="80%"/>

As you can see, there are in total four kinds of differentiable loss terms in our proposed algorithm:

🔵 Cross-Entropy Classification Loss: it corresponds to the simple ERM terms on the current data and the memory.
🟢 Cross-Entropy Distillation Loss: it corresponds to the distillation loss terms between the current model $h$ and the history model $H_{t-1}$, computed on the current data and the memory.
🔴 Adversarial Feature Alignment Loss: it corresponds to the divergence terms between the current data distribution and the past data distribution. If you are interested in how minimizing this term on the feature space can improve the performance in general, please refer to the amazing work "A theory of learning from different domains".
⚪ Adaptive Coefficient Optimization: it corresponds to estimating the error (via classification accuracy) of each term, and adaptively optimizing the coefficient set $\Omega=\{\alpha_i, \beta_i, \gamma_i\}$ so that the bound being minimized becomes tighter.
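
As a rough illustration of how the classification and distillation terms above can be combined under a set of coefficients, the sketch below computes both losses in plain Python and mixes three terms with fixed weights. It is not the repository's implementation: the function names, the toy logits, and the requirement that the weights be non-negative and sum to 1 are assumptions made for this example. Which of $\alpha_i$, $\beta_i$, $\gamma_i$ weights which term is not spelled out here, so the weights are left unnamed, and the adversarial alignment term is omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classification_ce(logits, label):
    """Cross-entropy against a hard label (an ERM-style term)."""
    return -math.log(softmax(logits)[label])

def distillation_ce(student_logits, teacher_logits):
    """Cross-entropy of the student's output against the teacher's soft output."""
    teacher = softmax(teacher_logits)
    student = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))

def mix_terms(terms, weights):
    """Weighted sum of loss terms (assumption: weights >= 0 and sum to 1)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * t for w, t in zip(weights, terms))

# Toy logits, invented for illustration only.
h_on_memory = [2.0, 0.5, -1.0]    # current model h on one memory sample
H_on_memory = [1.5, 0.7, -0.8]    # history model H_{t-1} on the same sample
h_on_current = [0.2, 1.9, 0.1]    # h on one current-domain sample
H_on_current = [0.4, 1.2, 0.3]    # H_{t-1} on the same current-domain sample

terms = (
    classification_ce(h_on_memory, label=0),      # classification on memory
    distillation_ce(h_on_memory, H_on_memory),    # distillation on memory
    distillation_ce(h_on_current, H_on_current),  # distillation on current data
)

fixed = mix_terms(terms, (1.0, 0.0, 0.0))    # only the first term remains
blended = mix_terms(terms, (0.5, 0.3, 0.2))  # a blend of all three terms
print(f"fixed={fixed:.4f} blended={blended:.4f}")
```

Both weight choices here are fixed by hand. In UDIL, as described above, the coefficients are instead adjusted during training.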

Installing the Required Packages

conda create -n udil python=3.9
conda activate udil
conda install pytorch==1.12 torchvision cudatoolkit=11.3 -c pytorch
conda install wandb ipdb -c conda-forge
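
To confirm that the environment is usable before launching a run, a small check like the following may help. It is a convenience sketch and not part of the repository: the function name `missing_packages` is made up, and the list of import names simply mirrors the packages installed above (PyTorch is imported as `torch`).

```python
import importlib.util

# Import names of the packages installed by the conda commands above.
REQUIRED = ("torch", "torchvision", "wandb", "ipdb")

def missing_packages(names=REQUIRED):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated `udil` environment. It only checks that the packages can be found and does not verify versions or CUDA support.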

Code for Running UDIL

Before you run the code, there are a couple of settings you might want to modify:

wandb_entity: in utils/args.py (line 70), change this to your own wandb account;
data_path and base_path: in utils/conf (lines 13-23), change these to the paths where you want to store your data and local training logs.

We provide the commands to run UDIL on different datasets in the scripts/ folder. Once everything is set up, a quick example of running UDIL on Permutated-MNIST looks as follows:

chmod +x scripts/*.sh
scripts/pmnist.sh

This script starts a UDIL training process and logs everything to your wandb project.

If you are in a hurry and just want a quick look at the training process and final results of UDIL on three realistic datasets (Permutated-MNIST, Rotated-MNIST, and Seq-CORe50), check out the public UDIL wandb project, where we have visualized everything you might care about.

Quantitative Results

Here we provide some quantitative results of UDIL.

Qualitative Results

Here we provide some qualitative results of UDIL, which come from the public UDIL wandb project, and we only show the results on Rotated-MNIST data.

Accuracy Matrix after 20-Domain Training
<img src="fig/acc_matrix.png" alt="" data-canonical-src="fig/acc_matrix.png" width="80%"/>
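
For readers who want to summarize such a matrix numerically, here is a small sketch of two commonly used continual-learning summaries. It assumes, and this layout is not stated above, that entry `acc[i][j]` is the accuracy on domain `j` measured after training on domain `i`, with only entries where `j <= i` being meaningful. The function names and the toy numbers are invented for illustration and are not results from the paper.

```python
def final_average_accuracy(acc):
    """Mean accuracy over all domains after training on the last domain."""
    last_row = acc[-1]
    return sum(last_row) / len(last_row)

def average_forgetting(acc):
    """Mean drop from each earlier domain's best past accuracy to its final accuracy."""
    num_domains = len(acc)
    if num_domains < 2:
        return 0.0
    drops = []
    for j in range(num_domains - 1):
        # Best accuracy on domain j at any step before the final one.
        best = max(acc[i][j] for i in range(j, num_domains - 1))
        drops.append(best - acc[-1][j])
    return sum(drops) / len(drops)

# Toy 3-domain matrix (made-up numbers); entries above the diagonal are unused.
toy = [
    [0.90, 0.00, 0.00],
    [0.80, 0.90, 0.00],
    [0.70, 0.85, 0.95],
]
print(final_average_accuracy(toy), average_forgetting(toy))
```

A higher final average accuracy together with lower forgetting indicates that earlier domains are retained better while new ones are learned.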

Below are visualizations of the embedding distributions of different classes and domains, where:

Left: colors represent different true classes;
Middle: colors represent different predicted classes by the model;
Right: colors represent different domains.

Embedding Space Visualization after 1-Domain Training
<img src="fig/embedding1.png" alt="" data-canonical-src="fig/embedding1.png" width="90%"/>

Embedding Space Visualization after 20-Domain Training
<img src="fig/embedding2.png" alt="" data-canonical-src="fig/embedding2.png" width="90%"/>

Also Check Our Relevant Work on Domain Adaptation

[1] Domain-Indexing Variational Bayes: Interpretable Domain Index for Domain Adaptation
Zihao Xu*, Guang-Yuan Hao*, Hao He, Hao Wang
Eleventh International Conference on Learning Representations (ICLR), 2023
[Paper] [OpenReview] [PPT] [Talk (Youtube)] [Talk (Bilibili)]

[2] Graph-Relational Domain Adaptation
Zihao Xu, Hao He, Guang-He Lee, Yuyang Wang, Hao Wang
Tenth International Conference on Learning Representations (ICLR), 2022
[Paper] [Code] [Talk] [Slides]

[3] Continuously Indexed Domain Adaptation
Hao Wang*, Hao He*, Dina Katabi
Thirty-Seventh International Conference on Machine Learning (ICML), 2020
[Paper] [Code] [Talk] [Blog] [Slides] [Website]

References

A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm

@inproceedings{UDIL,
  title={A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm},
  author={Shi, Haizhou and Wang, Hao},
  booktitle={Advances in Neural Information Processing Systems},
  year={2023}
}