<p align="center"> <img width="230" height="230" src="docs/_static/logo.png" alt="logo"> </p> <p align="center"> <img alt="GitHub commit activity" src="https://img.shields.io/github/commit-activity/m/aimagelab/mammoth"> <a href="https://aimagelab.github.io/mammoth/index.html"><img alt="Static Badge" src="https://img.shields.io/badge/wiki-gray?style=flat&logo=readthedocs&link=https%3A%2F%2Faimagelab.github.io%2Fmammoth%2Findex.html"></a> <img alt="Discord" src="https://img.shields.io/discord/1164956257392799860"> </p>

# Mammoth - An Extendible (General) Continual Learning Framework for Pytorch
Official repository of:
- Class-Incremental Continual Learning into the eXtended DER-verse
- Dark Experience for General Continual Learning: a Strong, Simple Baseline
- Semantic Residual Prompts for Continual Learning
- CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning
Mammoth is a framework for continual learning research. With more than 40 methods and 20 datasets, it includes the most complete list of competitors and benchmarks for research purposes.
Mammoth is designed to be modular, easy to extend, and - most importantly - easy to debug. Ideally, all the code necessary to run the experiments is included in the repository, with no need to check out other repositories or install additional packages.
With Mammoth, nothing is set in stone. You can easily add new models, datasets, training strategies, or functionalities.
All the models included in Mammoth are verified against the original papers (or subsequent relevant papers) to reproduce their original results.
## Documentation
Check out the official [documentation](https://aimagelab.github.io/mammoth/index.html) for more information on how to use Mammoth!
<p align="center"> <img width="112" height="112" src="docs/_static/seq_mnist.gif" alt="Sequential MNIST"> <img width="112" height="112" src="docs/_static/seq_cifar10.gif" alt="Sequential CIFAR-10"> <img width="112" height="112" src="docs/_static/seq_tinyimg.gif" alt="Sequential TinyImagenet"> <img width="112" height="112" src="docs/_static/perm_mnist.gif" alt="Permuted MNIST"> <img width="112" height="112" src="docs/_static/rot_mnist.gif" alt="Rotated MNIST"> <img width="112" height="112" src="docs/_static/mnist360.gif" alt="MNIST-360"> </p>

## Setup
- Install the dependencies with `pip install -r requirements.txt`. NOTE: PyTorch version >= 2.1.0 is required for `scaled_dot_product_attention` (see: https://github.com/Lightning-AI/litgpt/issues/763). If you cannot meet this requirement, uncomment lines 136-139 under `scaled_dot_product_attention` in `backbone/vit.py`.
- Use `./utils/main.py` to run experiments (an example command is given below).
- New models can be added to the `models/` folder (see the sketch below).
- New datasets can be added to the `datasets/` folder.
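In practice, a new model is a Python file in `models/` that subclasses `ContinualModel` and implements the `observe()` training step. The snippet below is an illustrative sketch only, not code taken from the repository: the attribute names and the `observe` signature follow the current documentation, but you should double-check them against `models/utils/continual_model.py` in your checkout.

```python
# models/my_sgd.py -- a minimal sketch of a custom Mammoth model.
# NOTE: illustrative example; verify the exact base-class interface in
# models/utils/continual_model.py before relying on it.
from models.utils.continual_model import ContinualModel


class MySgd(ContinualModel):
    """Plain fine-tuning on the incoming stream, with no anti-forgetting mechanism."""

    NAME = 'my_sgd'  # name passed to --model on the command line
    COMPATIBILITY = ['class-il', 'domain-il', 'task-il', 'general-continual']

    def observe(self, inputs, labels, not_aug_inputs, epoch=None):
        # One optimization step on the current mini-batch; self.net, self.loss
        # and self.opt are provided by the ContinualModel base class.
        self.opt.zero_grad()
        outputs = self.net(inputs)
        loss = self.loss(outputs, labels)
        loss.backward()
        self.opt.step()
        return loss.item()
```

With the file saved as `models/my_sgd.py`, a run could look like `python utils/main.py --model my_sgd --dataset seq-cifar10 --lr 0.1` (run `python utils/main.py --help` for the full list of arguments). New datasets follow the same pattern by subclassing `ContinualDataset` in `datasets/`; the existing `seq-*` files are good templates.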
## Update roadmap
All the code is under active development. Here are some of the features we are working on:
- Configurations for datasets: Currently, each dataset represents a specific configuration (e.g., number of tasks, data augmentations, backbone, etc.). This makes adding a new setting a bit cumbersome. We are working on a more flexible way to define configurations, while keeping the current system as the default for backward compatibility.
- New models: We are working on adding new models to the repository.
- New training modalities: We will introduce new CL training regimes, such as training with noisy labels, regression, segmentation, detection, etc.
- Openly accessible result dashboard: We are working on a dashboard to visualize the results of all the models in both their respective settings (to prove their reproducibility) and in a general setting (to compare them). This may take some time, since compute is not free.
All the new additions will try to preserve the current structure of the repository, making it easy to add new functionalities with a simple merge.
## Models
Mammoth currently supports more than 50 models, with new releases covering the main competitors in the literature.
- Efficient Lifelong Learning with A-GEM (A-GEM, A-GEM-R - A-GEM with reservoir buffer): `agem`, `agem_r`.
- AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning (AttriCLIP): `attriclip`.
- Bias Correction (BiC): `bic`.
- Continual Contrastive Interpolation Consistency (CCIC) - Requires `pip install kornia`: `ccic`.
- Continual Generative training for Incremental prompt-Learning (CGIL): `cgil`.
- Contrastive Language-Image Pre-Training (CLIP): `clip` (static method with no learning).
- CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning (CODA-Prompt) - Requires `pip install timm==0.9.8`: `coda-prompt`.
- Generating Instance-level Prompts for Rehearsal-free Continual Learning (DAP): `dap`.
- Dark Experience for General Continual Learning: a Strong, Simple Baseline (DER & DER++): `der` and `derpp`.
- DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning (DualPrompt) - Requires `pip install timm==0.9.8`: `dualprompt`.
- Experience Replay (ER): `er`.
- Experience Replay with Asymmetric Cross-Entropy (ER-ACE): `er_ace`.
- May the Forgetting Be with You: Alternate Replay for Learning with Noisy Labels (AER & ABS): `er_ace_aer_abs`.
- Rethinking Experience Replay: a Bag of Tricks for Continual Learning (ER-ACE with tricks): `er_ace_tricks`.
- online Elastic Weight Consolidation (oEWC): `ewc_on`.
- Function Distance Regularization (FDR): `fdr`.
- Greedy Sampler and Dumb Learner (GDumb): `gdumb`.
- Gradient Episodic Memory (GEM) - Unavailable on Windows: `gem`.
- Greedy gradient-based Sample Selection (GSS): `gss`.
- Hindsight Anchor Learning (HAL): `hal`.
- Incremental Classifier and Representation Learning (iCaRL): `icarl`.
- Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS (IDEFICS): `idefics` (static method with no learning).
- Joint training for the General Continual setting: `joint_gcl` (only for General Continual).
- Learning to Prompt (L2P) - Requires `pip install timm==0.9.8`: `l2p`.
- LiDER (on DER++, iCaRL, GDumb, and ER-ACE): `derpp_lider`, `icarl_lider`, `gdumb_lider`, `er_ace_lider`.
- Large Language and Vision Assistant (LLAVA): `llava` (static method with no learning).
- Learning a Unified Classifier Incrementally via Rebalancing (LUCIR): `lucir`.
- Learning without Forgetting (LwF): `lwf`.
- Learning without Forgetting adapted for Multi-Class classification (LwF.MC): `lwf_mc` (from the iCaRL paper).
- Meta-Experience Replay (MER): `mer`.
- Mixture-of-Experts Adapters (MoE Adapters): `moe_adapters`.
- Progressive Neural Networks (PNN): `pnn`.
- Online Continual Learning on a Contaminated Data Stream with Blurry Task Boundaries (PuriDivER): `puridiver`.
- Random Projections and Pre-trained Models for Continual Learning (RanPAC): `ranpac`.
- Regular Polytope Classifier (RPC): `rpc`.
- Synaptic Intelligence (SI): `si`.
- SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model (SLCA) - Requires `pip install timm==0.9.8`: `slca`.
- Semantic Two-level Additive Residual Prompt (STAR-Prompt): `starprompt`. Also includes the first-stage only (`first_stage_starprompt`) and second-stage only (`second_stage_starprompt`) versions.
- Transfer without Forgetting (TwF): `twf`.
- eXtended-DER (X-DER): `xder` (full version), `xder_ce` (X-DER with CE), `xder_rpc` (X-DER with RPC).
## Datasets
NOTE: Datasets are automatically downloaded in `data/`.
- This can be changed by modifying the `base_path` function in `utils/conf.py` or by using the `--base_path` argument (see the example below).
- The `data/` folder should not be tracked by git and is created automatically if missing.
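For example (the path here is only an illustration), `python utils/main.py --model sgd --dataset seq-cifar100 --lr 0.1 --base_path /mnt/storage/mammoth-data` should download and load the benchmark under `/mnt/storage/mammoth-data` instead of the default `data/`.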
Mammoth currently includes 21 datasets, covering toy classification problems (different versions of MNIST), standard domains (CIFAR, ImageNet-R, TinyImageNet, MIT-67), fine-grained classification domains (Cars-196, CUB-200), aerial domains (EuroSAT-RGB, RESISC45), and medical domains (CropDisease, ISIC, ChestX).
- Sequential MNIST (Class-IL / Task-IL): `seq-mnist`.
- Permuted MNIST (Domain-IL): `perm-mnist`.
- Rotated MNIST (Domain-IL): `rot-mnist`.
- MNIST-360 (General Continual Learning): `mnist-360`.
- Sequential CIFAR-10 (Class-IL / Task-IL): `seq-cifar10`.
- Sequential CIFAR-10 resized 224x224 (ViT version) (Class-IL / Task-IL): `seq-cifar10-224`.
- Sequential CIFAR-10 resized 224x224 (ResNet50 version) (Class-IL / Task-IL): `seq-cifar10-224-rs`.
- Sequential Tiny ImageNet (Class-IL / Task-IL): `seq-tinyimg`.
- Sequential Tiny ImageNet resized 32x32 (Class-IL / Task-IL): `seq-tinyimg-r`.
- Sequential CIFAR-100 (Class-IL / Task-IL): `seq-cifar100`.
- Sequential CIFAR-100 resized 224x224 (ViT version) (Class-IL / Task-IL): `seq-cifar100-224`.
- Sequential CIFAR-100 resized 224x224 (ResNet50 version) (Class-IL / Task-IL): `seq-cifar100-224-rs`.
- Sequential CUB-200 (Class-IL / Task-IL): `seq-cub200`.
- Sequential ImageNet-R (Class-IL / Task-IL): `seq-imagenet-r`.
- Sequential Cars-196 (Class-IL / Task-IL): `seq-cars196`.
- Sequential RESISC45 (Class-IL / Task-IL): `seq-resisc45`.
- Sequential EuroSAT-RGB (Class-IL / Task-IL): `seq-eurosat-rgb`.
- Sequential ISIC (Class-IL / Task-IL): `seq-isic`.
- Sequential ChestX (Class-IL / Task-IL): `seq-chestx`.
- Sequential MIT-67 (Class-IL / Task-IL): `seq-mit67`.
- Sequential CropDisease (Class-IL / Task-IL): `seq-cropdisease`.
## Pretrained backbones
- ResNet18 on CIFAR-100
- ResNet18 on TinyImageNet resized (`seq-tinyimg-r`)
- ResNet50 on ImageNet (PyTorch version)
- ResNet18 on SVHN
## Citing these works
```bibtex
@article{boschini2022class,
  title={Class-Incremental Continual Learning into the eXtended DER-verse},
  author={Boschini, Matteo and Bonicelli, Lorenzo and Buzzega, Pietro and Porrello, Angelo and Calderara, Simone},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2022},
  publisher={IEEE}
}

@inproceedings{buzzega2020dark,
  author = {Buzzega, Pietro and Boschini, Matteo and Porrello, Angelo and Abati, Davide and Calderara, Simone},
  booktitle = {Advances in Neural Information Processing Systems},
  editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
  pages = {15920--15930},
  publisher = {Curran Associates, Inc.},
  title = {Dark Experience for General Continual Learning: a Strong, Simple Baseline},
  volume = {33},
  year = {2020}
}
```
## Awesome Papers using Mammoth
### Our Papers
Expand to see the BibTeX!
<ul> <li><details><summary>CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning (<b>BMVC 2024</b>) <a href=https://arxiv.org/abs/2407.15793>paper</a></summary> <pre><code>@inproceedings{heng2022enhancing, title={CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning}, author={Frascaroli, Emanuele and Panariello, Aniello and Buzzega, Pietro and Bonicelli, Lorenzo and Porrello, Angelo and Calderara, Simone}, booktitle={35th British Machine Vision Conference}, year={2024} }</code></pre> </li> <li><details><summary>Semantic Residual Prompts for Continual Learning (<b>ECCV 2024</b>) <a href=https://arxiv.org/abs/2403.06870>paper</a></summary> <pre><code>@inproceedings{menabue2024semantic, title={Semantic Residual Prompts for Continual Learning}, author={Menabue, Martin and Frascaroli, Emanuele and Boschini, Matteo and Sangineto, Enver and Bonicelli, Lorenzo and Porrello, Angelo and Calderara, Simone}, booktitle={18th European Conference on Computer Vision}, year={202}, organization={Springer} }</code></pre> </li> <li><details><summary>Mask and Compress: Efficient Skeleton-based Action Recognition in Continual Learning (<b>ICPR 2024</b>) <a href=https://arxiv.org/pdf/2407.01397>paper</a> <a href=https://github.com/Sperimental3/CHARON>code</a></summary> <pre><code>@inproceedings{mosconi2024mask, title={Mask and Compress: Efficient Skeleton-based Action Recognition in Continual Learning}, author={Mosconi, Matteo and Sorokin, Andriy and Panariello, Aniello and Porrello, Angelo and Bonato, Jacopo and Cotogni, Marco and Sabetta, Luigi and Calderara, Simone and Cucchiara, Rita}, booktitle={International Conference on Pattern Recognition}, year={2024} }</code></pre> </li> <li><details><summary>On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning (<b>NeurIPS 2022</b>) <a href=https://arxiv.org/abs/2210.06443>paper</a> <a href=https://github.com/aimagelab/lider>code</a> (Also available here)</summary> <pre><code>@article{bonicelli2022effectiveness, title={On the effectiveness of lipschitz-driven rehearsal in continual learning}, author={Bonicelli, Lorenzo and Boschini, Matteo and Porrello, Angelo and Spampinato, Concetto and Calderara, Simone}, journal={Advances in Neural Information Processing Systems}, volume={35}, pages={31886--31901}, year={2022} }</code></pre> </li> <li><details><summary>Continual semi-supervised learning through contrastive interpolation consistency (<b>PRL 2022</b>) <a href=https://arxiv.org/abs/2108.06552>paper</a> <a href=https://github.com/aimagelab/CSSL>code</a> (Also available here)</summary> <pre><code>@article{boschini2022continual, title={Continual semi-supervised learning through contrastive interpolation consistency}, author={Boschini, Matteo and Buzzega, Pietro and Bonicelli, Lorenzo and Porrello, Angelo and Calderara, Simone}, journal={Pattern Recognition Letters}, volume={162}, pages={9--14}, year={2022}, publisher={Elsevier} }</code></pre> </li> <li><details><summary>Transfer without Forgetting (<b>ECCV 2022</b>) <a href=https://arxiv.org/abs/2206.00388>paper</a> <a href=https://github.com/mbosc/twf>code</a> (Also available here)</summary> <pre><code>@inproceedings{boschini2022transfer, title={Transfer without forgetting}, author={Boschini, Matteo and Bonicelli, Lorenzo and Porrello, Angelo and Bellitto, Giovanni and Pennisi, Matteo and Palazzo, Simone and Spampinato, Concetto and Calderara, Simone}, booktitle={17th European Conference on Computer Vision}, pages={692--709}, year={2022}, 
organization={Springer} }</code></pre> </li> <li><details><summary>Effects of Auxiliary Knowledge on Continual Learning (<b>ICPR 2022</b>) <a href=https://arxiv.org/abs/2206.02577>paper</a></summary> <pre><code>@inproceedings{bellitto2022effects, title={Effects of auxiliary knowledge on continual learning}, author={Bellitto, Giovanni and Pennisi, Matteo and Palazzo, Simone and Bonicelli, Lorenzo and Boschini, Matteo and Calderara, Simone}, booktitle={26th International Conference on Pattern Recognition}, pages={1357--1363}, year={2022}, organization={IEEE} }</code></pre> </li> <li><details><summary>Class-Incremental Continual Learning into the eXtended DER-verse (<b>TPAMI 2022</b>) <a href=https://arxiv.org/abs/2201.00766>paper</a></summary> <pre><code>@article{boschini2022class, title={Class-Incremental Continual Learning into the eXtended DER-verse}, author={Boschini, Matteo and Bonicelli, Lorenzo and Buzzega, Pietro and Porrello, Angelo and Calderara, Simone}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, year={2022}, publisher={IEEE} }</code></pre> </li> <li><details><summary>Rethinking Experience Replay: a Bag of Tricks for Continual Learning (<b>ICPR 2020</b>) <a href=https://arxiv.org/abs/2010.05595>paper</a> <a href=https://github.com/hastings24/rethinking_er>code</a></summary> <pre><code>@inproceedings{buzzega2021rethinking, title={Rethinking experience replay: a bag of tricks for continual learning}, author={Buzzega, Pietro and Boschini, Matteo and Porrello, Angelo and Calderara, Simone}, booktitle={25th International Conference on Pattern Recognition}, pages={2180--2187}, year={2021}, organization={IEEE} }</code></pre> </li> <li><details><summary>Dark Experience for General Continual Learning: a Strong, Simple Baseline (<b>NeurIPS 2020</b>) <a href=https://arxiv.org/abs/2004.07211>paper</a></summary> <pre><code>@inproceedings{buzzega2020dark, author = {Buzzega, Pietro and Boschini, Matteo and Porrello, Angelo and Abati, Davide and Calderara, Simone}, booktitle = {Advances in Neural Information Processing Systems}, editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin}, pages = {15920--15930}, publisher = {Curran Associates, Inc.}, title = {Dark Experience for General Continual Learning: a Strong, Simple Baseline}, volume = {33}, year = {2020} }</code></pre> </details> </li> </ul>

### Other Awesome CL works using Mammoth
Get in touch if we missed your awesome work!
- Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method (ICML 2024) [paper] [code]
- AGILE - Mitigating Interference in Incremental Learning through Attention-Guided Rehearsal (CoLLAs 2024) [paper] [code]
- Interactive Continual Learning (ICL) (CVPR 2024) [paper] [code]
- Prediction Error-based Classification for Class-Incremental Learning (ICLR 2024) [paper] [code]
- TriRE: A Multi-Mechanism Learning Paradigm for Continual Knowledge Retention and Promotion (NeurIPS 2023) [paper] [code]
- Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation (NeurIPS 2023) [paper] [code]
- A Unified and General Framework for Continual Learning (ICLR 2024) [paper] [code]
- Decoupling Learning and Remembering: a Bilevel Memory Framework with Knowledge Projection for Task-Incremental Learning (CVPR 2023) [paper] [code]
- Regularizing Second-Order Influences for Continual Learning (CVPR 2023) [paper] [code]
- Sparse Coding in a Dual Memory System for Lifelong Learning (CVPR 2023) [paper] [code]
- A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm (CVPR 2023) [paper] [code]
- A Multi-Head Model for Continual Learning via Out-of-Distribution Replay (CVPR 2023) [paper] [code]
- Preserving Linear Separability in Continual Learning by Backward Feature Projection (CVPR 2023) [paper] [code]
- Complementary Calibration: Boosting General Continual Learning With Collaborative Distillation and Self-Supervision (TIP 2023) [paper] [code]
- Continual Learning by Modeling Intra-Class Variation (TMLR 2023) [paper] [code]
- ConSlide: Asynchronous Hierarchical Interaction Transformer with Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis (ICCV 2023) [paper] [code]
- CBA: Improving Online Continual Learning via Continual Bias Adaptor (ICCV 2023) [paper] [code]
- Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal (ICML 2023) [paper] [code]
- Learnability and Algorithm for Continual Learning (ICML 2023) [paper] [code]
- Pretrained Language Model in Continual Learning: a Comparative Study (ICLR 2022) [paper] [code]
- Representational continuity for unsupervised continual learning (ICLR 2022) [paper] [code]
- Continual Normalization: Rethinking Batch Normalization for Online Continual Learning (ICLR 2022) [paper] [code]
- Learning Fast, Learning Slow: A General Continual Learning Method based on Complementary Learning System (ICLR 2022) [paper] [code]
- New Insights on Reducing Abrupt Representation Change in Online Continual Learning (ICLR 2022) [paper] [code]
- Looking Back on Learned Experiences for Class/Task Incremental Learning (ICLR 2022) [paper] [code]
- Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach (CoLLAs 2022) [paper] [code]
- Consistency is the key to further Mitigating Catastrophic Forgetting in Continual Learning (CoLLAs 2022) [paper] [code]
- Self-supervised models are continual learners (CVPR 2022) [paper] [code]
- Learning from Students: Online Contrastive Distillation Network for General Continual Learning (IJCAI 2022) [paper] [code]
## Contributing
Pull requests welcome!
Please use `autopep8` with the following parameters: `--aggressive --max-line-length=200 --ignore=E402`.
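In practice this usually means running something like `autopep8 --in-place --aggressive --max-line-length=200 --ignore=E402 models/my_model.py` on the files you touched before opening the pull request (the `--in-place` flag and the file path are only illustrative).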
## Previous versions
If you're interested in a version of this repo that only includes the original code for Dark Experience for General Continual Learning: a Strong, Simple Baseline or Class-Incremental Continual Learning into the eXtended DER-verse, please use the following tags:
- `neurips2020` for DER (NeurIPS 2020).
- `tpami2023` for X-DER (TPAMI 2022).
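For example, after cloning you can switch to one of these snapshots with `git checkout tags/neurips2020` (or `tags/tpami2023`), or browse the corresponding tag directly on GitHub.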