
Learning where to learn - Gradient sparsity in meta and continual learning

In this paper, we investigate gradient sparsity found by MAML in various continual and few-shot learning scenarios.
Instead of only meta-learning the initialization of the neural network parameters, we additionally meta-learn parameters underneath a step function that stops gradient descent wherever they are smaller than 0.

We term this version sparse-MAML - link to the paper here.
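
Concretely, the inner loop of sparse-MAML gates each parameter's gradient with a binary mask obtained by applying a step function to its meta-learned score. Below is a minimal PyTorch-style sketch of this idea; the function and variable names (`masked_inner_step`, `scores`) are illustrative and not the repository's actual API.

```python
import torch

def masked_inner_step(params, grads, scores, lr=0.01):
    """One inner-loop SGD step of sparse-MAML (conceptual sketch).

    Each parameter's gradient is multiplied by a binary mask given by the
    step function of its meta-learned score: gradient descent is switched
    off (mask = 0) wherever the score is smaller than 0.
    """
    updated = []
    for p, g, s in zip(params, grads, scores):
        mask = (s > 0).to(p.dtype)   # step function on the meta-learned score
        updated.append(p - lr * mask * g)
    return updated
```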

<img src="https://github.com/Johswald/learning_where_to_learn/blob/main/utils/images/sparse_MAML.gif?raw=true" width="75%">

Interestingly, structured sparsity emerges in both the classic 4-layer ConvNet and a ResNet-12 for few-shot learning. This is accompanied by improved robustness and generalisation across a wide range of hyperparameters.

<img src="https://github.com/Johswald/learning_where_to_learn/blob/main/utils/images/image.jpg?raw=true" width="75%">

Note that sparse-MAML is an extremely simple variant of MAML: compared to full gradient modulation, it can only switch learning on or off for individual parameters.
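
Because the step function has zero gradient almost everywhere, the mask scores need a surrogate gradient to be meta-learned in the outer loop. A common choice is a straight-through estimator; the sketch below assumes this approach and is not necessarily the exact estimator used in the paper or this codebase.

```python
import torch

class StepSTE(torch.autograd.Function):
    """Binary step with a straight-through backward pass (illustrative)."""

    @staticmethod
    def forward(ctx, scores):
        # Forward: 1 where the score is positive, 0 otherwise.
        return (scores > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Backward: pass the gradient through unchanged so the scores
        # remain trainable by the outer (meta) optimizer.
        return grad_output

# Example: gate a gradient with the learnable mask scores.
scores = torch.randn(10, requires_grad=True)
grad = torch.randn(10)
masked_grad = StepSTE.apply(scores) * grad
```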

This codebase implements the few-shot learning experiments presented in the paper. To reproduce the results, please follow these instructions:

Installation

#1. Create a conda env:

conda create -n sparse-MAML

#2. Activate the env:

source activate sparse-MAML

#3. Install anaconda:

conda install anaconda

#4. Install extra requirements (make sure you use the correct pip3):

pip3 install -r requirements.txt

#5. Make the run script executable:

chmod u+x run_sparse_MAML.sh

#6. Execute:

./run_sparse_MAML.sh

Results

| MiniImageNet few-shot (accuracy in %) | MAML | ANIL | BOIL | sparse-MAML | sparse-ReLU-MAML |
|---|---|---|---|---|---|
| 5-way 5-shot, ConvNet | 63.15 | 61.50 | 66.45 | 67.03 | 66.80 |
| 5-way 1-shot, ConvNet | 48.07 | 46.70 | 49.61 | 50.35 | 50.15 |
| 5-way 5-shot, ResNet-12 | 69.36 | 70.03 | 70.50 | 70.02 | 73.01 |
| 5-way 1-shot, ResNet-12 | 53.91 | 55.25 | - | 55.02 | 56.39 |

BOIL results are taken from the original paper.


This codebase is heavily built on top of torchmeta.
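
For reference, loading miniImageNet episodes with torchmeta typically looks like the sketch below; the data folder path and loader settings are placeholders, not the values used by this repository.

```python
from torchmeta.datasets.helpers import miniimagenet
from torchmeta.utils.data import BatchMetaDataLoader

# 5-way 5-shot episodes from the meta-training split ("data" is a placeholder path).
dataset = miniimagenet("data", ways=5, shots=5, test_shots=15,
                       meta_train=True, download=True)
dataloader = BatchMetaDataLoader(dataset, batch_size=16, num_workers=4)

for batch in dataloader:
    support_inputs, support_targets = batch["train"]  # adaptation (inner-loop) data
    query_inputs, query_targets = batch["test"]       # evaluation (outer-loop) data
    break
```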