Universal Representation Learning and Task-specific Adaptation for Few-shot Learning

A universal representation learning algorithm that learns a set of well-generalized representations via a single universal network from multiple diverse visual datasets, together with task-specific adaptation techniques for few-shot learning.

<p align="center"> <img src="./figures/fsl.png" style="width:60%"> </p>

Universal Representation Learning from Multiple Domains for Few-shot Classification,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
ICCV 2021 (arXiv 2103.13841)

Cross-domain Few-shot Learning with Task-specific Adapters,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
CVPR 2022 (arXiv 2107.00358)

Universal Representations: A Unified Look at Multiple Task and Domain Learning,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
IJCV 2023 (arXiv 2204.02744)

Updates

Features at a glance

Main results on Meta-Dataset

| Test Datasets | TSA (Ours) | URL (Ours) | MDL | Best SDL | tri-M [8] | FLUTE [7] | URT [6] | SUR [5] | Transductive CNAPS [4] | Simple CNAPS [3] | CNAPS [2] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Avg rank | 1.5 | 2.7 | 7.1 | 6.7 | 5.5 | 5.1 | 6.7 | 6.9 | 5.7 | 7.2 | - |
| Avg Seen | 80.2 | 80.0 | 76.9 | 76.3 | 74.5 | 76.2 | 76.7 | 75.2 | 75.1 | 74.6 | 71.6 |
| Avg Unseen | 77.2 | 69.3 | 61.7 | 61.9 | 69.9 | 69.9 | 62.4 | 63.1 | 66.5 | 65.8 | - |
| Avg All | 79.0 | 75.9 | 71.1 | 70.8 | 72.7 | 73.8 | 71.2 | 70.5 | 71.8 | 71.2 | - |

| Test Datasets | TSA-ResNet34 (Ours) | TSA-ResNet18 (Ours) | CTX-ResNet34 [10] | ProtoNet-ResNet34 [10] | FLUTE [7] | BOHB [9] | ALFA+fo-Proto-MAML [1] | fo-Proto-MAML [1] | ProtoNet [1] | Finetune [1] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Avg rank | 1.5 | 2.8 | 1.8 | 5.5 | 8.9 | 6.0 | 5.3 | 7.0 | 8.3 | 7.9 |
| Avg Seen | 63.7 | 59.5 | 62.8 | 53.7 | 46.9 | 51.9 | 52.8 | 49.5 | 50.5 | 45.8 |
| Avg Unseen | 76.2 | 71.9 | 75.6 | 61.1 | 53.2 | 60.0 | 62.4 | 58.4 | 56.7 | 58.2 |
| Avg All | 74.9 | 70.7 | 74.3 | 60.4 | 52.6 | 59.2 | 61.4 | 57.5 | 56.1 | 57.0 |

Model Zoo

Dependencies

This code requires the following:

Installation

Initialization

  1. Before doing anything, first run the following commands.

    ulimit -n 50000
    export META_DATASET_ROOT=<root directory of the cloned or downloaded Meta-Dataset repository>
    export RECORDS=<the directory where tf-records of MetaDataset are stored>
    

    Note that the above commands need to be run every time you open a new command shell.

  2. Enter the root directory of this project, i.e. the directory where this project was cloned or downloaded.

Universal Representation Learning from Multiple Domains for Few-shot Classification

<p align="center"> <img src="./figures/universal.png" style="width:60%"> </p> <p align="center"> Figure 1. <b>URL - Universal Representation Learning</b>. </p>

Train the Universal Representation Learning Network

  1. The easiest way is to download our pre-trained URL model and evaluate its features with our Pre-classifier Alignment (PA). To download the pre-trained URL model, one can use gdown (installed via pip install gdown) and execute the following command in the root directory of this project:

    gdown https://drive.google.com/uc?id=1Dv8TX6iQ-BE2NMpfd0sQmH2q4mShmo1A && md5sum url.zip && unzip url.zip -d ./saved_results/ && rm url.zip
    
    

    This will download the URL model and place it in the ./saved_results directory. One can evaluate this model with our PA (see the Meta-Testing step below).

  2. Alternatively, one can train the model from scratch: 1) train 8 single domain learning networks; 2) train the universal feature extractor as follows.

Train Single Domain Learning Networks

  1. The easiest way is to download our pre-trained models and use them to obtain a universal set of features directly. To download single domain learning networks, execute the following command in the root directory of this project:

    gdown https://drive.google.com/uc?id=1MvUcvQ8OQtoOk1MIiJmK6_G8p4h8cbY9 && md5sum sdl.zip && unzip sdl.zip -d ./saved_results/ && rm sdl.zip
    

    This will download all single domain learning models and place them in the ./saved_results directory of this project.

  2. Alternatively, instead of using the pretrained models, one can train the models from scratch. To train 8 single domain learning networks, run:

    ./scripts/train_resnet18_sdl.sh
    

Train the Universal Feature Extractor

To learn the universal feature extractor by distilling the knowledge from pre-trained single domain learning networks, run:

./scripts/train_resnet18_url.sh
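
For reference, the command above implements a form of multi-teacher distillation: the universal (student) network is trained so that, after a small per-domain projection, its features match those of the frozen single domain (teacher) networks, while per-domain classifiers keep the features discriminative. The sketch below is a minimal, simplified illustration of this structure, not our exact URL objective (which also aligns predictions and uses a more elaborate feature similarity loss); the names DomainProjector and feature_distill_loss and the plain cosine loss are assumptions for illustration.

```python
# Simplified sketch of multi-teacher feature distillation (illustrative; not the exact URL loss).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainProjector(nn.Module):
    """Hypothetical per-domain linear head mapping student features to a teacher's feature space."""
    def __init__(self, feat_dim):
        super().__init__()
        self.proj = nn.Linear(feat_dim, feat_dim)

    def forward(self, x):
        return self.proj(x)

def feature_distill_loss(student_feat, teacher_feat):
    # Cosine-based feature matching; the real objective uses a more elaborate similarity loss.
    return 1.0 - F.cosine_similarity(student_feat, teacher_feat, dim=-1).mean()

def distill_step(student, teachers, projectors, classifiers, batches, optimizer):
    """One optimization step over a batch from every domain.

    student:     shared backbone being trained (returns feature vectors)
    teachers:    frozen single domain backbones, one per domain
    projectors:  DomainProjector modules, one per domain
    classifiers: per-domain linear classifiers on student features
    batches:     list of (images, labels) pairs, one per domain
    """
    optimizer.zero_grad()
    total_loss = 0.0
    for d, (images, labels) in enumerate(batches):
        feats = student(images)
        logits = classifiers[d](feats)
        with torch.no_grad():
            teacher_feats = teachers[d](images)
        total_loss = (total_loss
                      + F.cross_entropy(logits, labels)
                      + feature_distill_loss(projectors[d](feats), teacher_feats))
    total_loss.backward()
    optimizer.step()
    return float(total_loss)
```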

Meta-Testing with Pre-classifier Alignment (PA)

<p align="center"> <img src="./figures/pa.png" style="width:80%"> </p> <p align="center"> Figure 2. <b>PA - Pre-classifier Alignment</b> for Adapting Features in Meta-test. </p>

This step runs our Pre-classifier Alignment (PA) procedure per task to adapt the features to a discriminative space and builds a Nearest Centroid Classifier (NCC) on the support set to classify query samples. To perform meta-testing with PA, run:

./scripts/test_resnet18_pa.sh
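
To give an idea of what the script does per task: PA keeps the URL backbone frozen and learns a single linear transformation on top of the extracted features, initialized as the identity and optimized on the support set with an NCC-style loss; queries are then assigned to the nearest centroid. The sketch below is a minimal illustration under these assumptions; the function names, optimizer, and hyperparameters (steps, learning rate, temperature) are placeholders rather than the exact values used in our scripts.

```python
# Minimal sketch of Pre-classifier Alignment (PA) for a single task (illustrative).
import torch
import torch.nn.functional as F

def ncc_logits(feats, centroids, temp=0.1):
    # Cosine similarity between normalized features and class centroids, scaled by a temperature.
    feats = F.normalize(feats, dim=-1)
    centroids = F.normalize(centroids, dim=-1)
    return feats @ centroids.t() / temp

def class_centroids(feats, labels):
    n_way = int(labels.max()) + 1
    return torch.stack([feats[labels == c].mean(0) for c in range(n_way)])

def pa_adapt(support_feats, support_labels, steps=40, lr=0.1):
    """Learn a linear map A (initialized to identity) on frozen support features."""
    A = torch.eye(support_feats.size(-1), requires_grad=True)
    optimizer = torch.optim.Adadelta([A], lr=lr)  # optimizer/hyperparameters are placeholders
    for _ in range(steps):
        optimizer.zero_grad()
        z = support_feats @ A
        loss = F.cross_entropy(ncc_logits(z, class_centroids(z, support_labels)), support_labels)
        loss.backward()
        optimizer.step()
    return A.detach()

def pa_predict(A, support_feats, support_labels, query_feats):
    """Classify query features with NCC in the aligned space."""
    z_support, z_query = support_feats @ A, query_feats @ A
    return ncc_logits(z_query, class_centroids(z_support, support_labels)).argmax(dim=-1)
```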

Cross-domain Few-shot Learning with Task-specific Adapters

<p align="center"> <img src="./figures/tsa.png" style="width:100%"> </p> <p align="center"> Figure 3. Cross-domain Few-shot Learning with <b>Task-specific Adapters (TSA)</b>. </p>

We provide code for attaching task-specific adapters (TSA) to a single universal network learned during meta-training and learning the task-specific adapters on the support set. One can download our pre-trained URL model (see here to download the URL or SDL models, or train them from scratch) and evaluate its features, adapted by residual adapters in matrix form and pre-classifier alignment, by running:

./scripts/test_resnet18_tsa.sh

One may want to train the model from scratch, starting from the meta-training step. For the single-domain learning setting, see here to learn a single network from ImageNet with ResNet-18. For the multi-domain learning setting, one can learn a URL model (see here) or a vanilla MDL model (see here). Note that one may need to set --model.name and --model.dir in ./scripts/test_resnet18_tsa.sh to the model learned during meta-training, set --test.mode to sdl if the backbone is learned from ImageNet only, and then run TSA.

We also provide implementations of different options for task-specific adapters, including the connection topology (serial or residual), the parameterization (matrix or channel-wise), and the weight initialization (identity or random); see ./scripts/test_resnet18_tsa.sh for more details and the sketch below for the basic idea. Note that you may obtain slightly different results from the ones reported in Table 3 of our TSA paper, as discussed in https://github.com/google-research/meta-dataset/issues/54. One can set shuffle_buffer_size to 0 in ./data/meta_dataset_reader.py to reproduce the numbers in Table 3, but we strongly suggest re-running the experiments with our up-to-date code (the results with shuffle_buffer_size=1000 differ slightly from those with shuffle_buffer_size=0, while the rankings remain the same).
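
To make the residual, matrix-parameterized option concrete: a matrix adapter is a 1x1 convolution attached in parallel (residually) to a frozen convolutional layer of the backbone, and only the adapter parameters (plus the pre-classifier alignment) are optimized on the support set of each task. The sketch below illustrates this idea only; the class and function names are hypothetical, the zero initialization is an assumption standing in for the identity/random options above, and the optimizer and hyperparameters are placeholders.

```python
# Simplified sketch of a residual, matrix-parameterized task-specific adapter (illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualMatrixAdapter(nn.Module):
    """Wraps a frozen conv layer: y = conv(x) + alpha(x), with alpha a 1x1 convolution."""
    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        for p in self.conv.parameters():
            p.requires_grad = False  # the pre-trained backbone stays frozen
        self.alpha = nn.Conv2d(conv.in_channels, conv.out_channels,
                               kernel_size=1, stride=conv.stride, bias=False)
        nn.init.zeros_(self.alpha.weight)  # assumed init: start from the unadapted layer

    def forward(self, x):
        return self.conv(x) + self.alpha(x)

def adapt_on_support(adapters, embed, support_images, support_labels, steps=40, lr=0.05):
    """Optimize only the adapter parameters with an NCC loss on the support set.

    adapters: ResidualMatrixAdapter modules attached to the backbone
    embed:    the adapted backbone, returning feature vectors
    """
    params = [p for a in adapters for p in a.alpha.parameters()]
    optimizer = torch.optim.Adadelta(params, lr=lr)  # optimizer/hyperparameters are placeholders
    n_way = int(support_labels.max()) + 1
    for _ in range(steps):
        optimizer.zero_grad()
        z = F.normalize(embed(support_images), dim=-1)
        centroids = F.normalize(
            torch.stack([z[support_labels == c].mean(0) for c in range(n_way)]), dim=-1)
        loss = F.cross_entropy(z @ centroids.t() / 0.1, support_labels)  # 0.1: placeholder temperature
        loss.backward()
        optimizer.step()
```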

Expected Results

Below are the results extracted from our papers. The results will vary from run to run by a percent or two, up or down, because the Meta-Dataset reader generates different tasks each run and because of randomness in training the networks and in the TSA and PA optimization. Note that the results are updated with the up-to-date evaluation from Meta-Dataset. Make sure that you use the up-to-date code from the Meta-Dataset repository to convert the dataset and set shuffle_buffer_size=1000, as mentioned in https://github.com/google-research/meta-dataset/issues/54.

Models trained on all datasets

| Test Datasets | TSA (Ours) | URL (Ours) | MDL | Best SDL | tri-M [8] | FLUTE [7] | URT [6] | SUR [5] | Transductive CNAPS [4] | Simple CNAPS [3] | CNAPS [2] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Avg rank | 1.5 | 2.7 | 7.1 | 6.7 | 5.5 | 5.1 | 6.7 | 6.9 | 5.7 | 7.2 | - |
| ImageNet | 57.4±1.1 | 57.5±1.1 | 52.9±1.2 | 54.3±1.1 | 58.6±1.0 | 51.8±1.1 | 55.0±1.1 | 54.5±1.1 | 57.9±1.1 | 56.5±1.1 | 50.8±1.1 |
| Omniglot | 95.0±0.4 | 94.5±0.4 | 93.7±0.5 | 93.8±0.5 | 92.0±0.6 | 93.2±0.5 | 93.3±0.5 | 93.0±0.5 | 94.3±0.4 | 91.9±0.6 | 91.7±0.5 |
| Aircraft | 89.3±0.4 | 88.6±0.5 | 84.9±0.5 | 84.5±0.5 | 82.8±0.7 | 87.2±0.5 | 84.5±0.6 | 84.3±0.5 | 84.7±0.5 | 83.8±0.6 | 83.7±0.6 |
| Birds | 81.4±0.7 | 80.5±0.7 | 79.2±0.8 | 70.6±0.9 | 75.3±0.8 | 79.2±0.8 | 75.8±0.8 | 70.4±1.1 | 78.8±0.7 | 76.1±0.9 | 73.6±0.9 |
| Textures | 76.7±0.7 | 76.2±0.7 | 70.9±0.8 | 72.1±0.7 | 71.2±0.8 | 68.8±0.8 | 70.6±0.7 | 70.5±0.7 | 66.2±0.8 | 70.0±0.8 | 59.5±0.7 |
| Quick Draw | 82.0±0.6 | 81.9±0.6 | 81.7±0.6 | 82.6±0.6 | 77.3±0.7 | 79.5±0.7 | 82.1±0.6 | 81.6±0.6 | 77.9±0.6 | 78.3±0.7 | 74.7±0.8 |
| Fungi | 67.4±1.0 | 68.8±0.9 | 63.2±1.1 | 65.9±1.0 | 48.5±1.0 | 58.1±1.1 | 63.7±1.0 | 65.0±1.0 | 48.9±1.2 | 49.1±1.2 | 50.2±1.1 |
| VGG Flower | 92.2±0.5 | 92.1±0.5 | 88.7±0.6 | 86.7±0.6 | 90.5±0.5 | 91.6±0.6 | 88.3±0.6 | 82.2±0.8 | 92.3±0.4 | 91.3±0.6 | 88.9±0.5 |
| Traffic Sign | 83.5±0.9 | 63.3±1.2 | 49.2±1.0 | 47.1±1.1 | 63.0±1.0 | 58.4±1.1 | 50.1±1.1 | 49.8±1.1 | 59.7±1.1 | 59.2±1.0 | 56.5±1.1 |
| MSCOCO | 55.8±1.1 | 54.0±1.0 | 47.3±1.1 | 49.7±1.0 | 52.8±1.1 | 50.0±1.0 | 48.9±1.1 | 49.4±1.1 | 42.5±1.1 | 42.4±1.1 | 39.4±1.0 |
| MNIST | 96.7±0.4 | 94.5±0.5 | 94.2±0.4 | 91.0±0.5 | 96.2±0.3 | 95.6±0.5 | 90.5±0.4 | 94.9±0.4 | 94.7±0.3 | 94.3±0.4 | - |
| CIFAR-10 | 80.6±0.8 | 71.9±0.7 | 63.2±0.8 | 65.4±0.8 | 75.4±0.8 | 78.6±0.7 | 65.1±0.8 | 64.2±0.9 | 73.6±0.7 | 72.0±0.8 | - |
| CIFAR-100 | 69.6±1.0 | 62.6±1.0 | 54.7±1.1 | 56.2±1.0 | 62.0±1.0 | 67.1±1.0 | 57.2±1.0 | 57.1±1.1 | 61.8±1.0 | 60.9±1.1 | - |

Models trained on ImageNet only (TODO)

<div style="text-align:justify; font-size:80%">
<p> [1] Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle; <a href="https://arxiv.org/abs/1903.03096">Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples</a>; ICLR 2020. </p>
<p> [2] James Requeima, Jonathan Gordon, John Bronskill, Sebastian Nowozin, Richard E. Turner; <a href="https://arxiv.org/abs/1906.07697">Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes</a>; NeurIPS 2019. </p>
<p> [3] Peyman Bateni, Raghav Goyal, Vaden Masrani, Frank Wood, Leonid Sigal; <a href="https://openaccess.thecvf.com/content_CVPR_2020/html/Bateni_Improved_Few-Shot_Visual_Classification_CVPR_2020_paper.html">Improved Few-Shot Visual Classification</a>; CVPR 2020. </p>
<p> [4] Peyman Bateni, Jarred Barber, Jan-Willem van de Meent, Frank Wood; <a href="https://openaccess.thecvf.com/content/WACV2022/papers/Bateni_Enhancing_Few-Shot_Image_Classification_With_Unlabelled_Examples_WACV_2022_paper.pdf">Enhancing Few-Shot Image Classification with Unlabelled Examples</a>; WACV 2022. </p>
<p> [5] Nikita Dvornik, Cordelia Schmid, Julien Mairal; <a href="https://arxiv.org/abs/2003.09338">Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification</a>; ECCV 2020. </p>
<p> [6] Lu Liu, William Hamilton, Guodong Long, Jing Jiang, Hugo Larochelle; <a href="https://arxiv.org/abs/2006.11702">Universal Representation Transformer Layer for Few-Shot Image Classification</a>; ICLR 2021. </p>
<p> [7] Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin; <a href="https://arxiv.org/pdf/2105.07029.pdf">Learning a Universal Template for Few-shot Dataset Generalization</a>; ICML 2021. </p>
<p> [8] Yanbin Liu, Juho Lee, Linchao Zhu, Ling Chen, Humphrey Shi, Yi Yang; <a href="https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_A_Multi-Mode_Modulator_for_Multi-Domain_Few-Shot_Classification_ICCV_2021_paper.pdf">A Multi-Mode Modulator for Multi-Domain Few-Shot Classification</a>; ICCV 2021. </p>
<p> [9] Tonmoy Saikia, Thomas Brox, Cordelia Schmid; <a href="https://arxiv.org/abs/2001.07926">Optimized Generic Feature Learning for Few-shot Classification across Domains</a>; arXiv 2020. </p>
<p> [10] Carl Doersch, Ankush Gupta, Andrew Zisserman; <a href="https://arxiv.org/abs/2007.11498">CrossTransformers: spatially-aware few-shot transfer</a>; NeurIPS 2020. </p>
</div>

Other Usage

Train a Vanilla Multi-domain Learning Network

To train a vanilla multi-domain learning network (MDL) on Meta-Dataset, run:

./scripts/train_resnet18_mdl.sh

Other Classifiers for Meta-Testing (optional)

One can use other classifiers for meta-testing: use --test.loss-opt to select the nearest centroid classifier (ncc, default), a support vector machine (svm), logistic regression (lr), the Mahalanobis distance from Simple CNAPS (scm), or a k-nearest neighbor classifier (knn); use --test.feature-norm to l2-normalize features (l2) or not (none) for svm and lr; and use --test.distance to specify the feature similarity function (l2 or cos) for NCC.

To evaluate the feature extractor with NCC and cosine similarity, run:

python test_extractor.py --test.loss-opt ncc --test.feature-norm none --test.distance cos --model.name=url --model.dir <directory of url> 
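
For reference, the following is a minimal, self-contained sketch of what a nearest centroid classifier with cosine or l2 distance computes; it is purely illustrative and independent of our actual implementation behind --test.loss-opt ncc and --test.distance.

```python
# Minimal sketch of a Nearest Centroid Classifier (NCC) with cosine or l2 distance (illustrative).
import torch
import torch.nn.functional as F

def ncc_classify(support_feats, support_labels, query_feats, distance="cos"):
    n_way = int(support_labels.max()) + 1
    # Class centroids are the mean support feature of each class.
    centroids = torch.stack(
        [support_feats[support_labels == c].mean(0) for c in range(n_way)])
    if distance == "cos":
        scores = F.normalize(query_feats, dim=-1) @ F.normalize(centroids, dim=-1).t()
    else:  # "l2": negate the distance so that higher is better for both options
        scores = -torch.cdist(query_feats, centroids)
    return scores.argmax(dim=-1)
```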

Five-shot and Five-way-one-shot Meta-test (optional)

One can evaluate the feature extractor in meta-testing for five-shot or five-way-one-shot setting by setting --test.type as '5shot' or '1shot', respectively.

To test the feature extractor in the varying-way five-shot setting on the test splits of all datasets, run:

python test_extractor.py --test.type 5shot --test.loss-opt ncc --test.feature-norm none --test.distance cos --model.name=url --model.dir <directory of url>

If one wants to evaluate our proposed URL and TSA methods in the 5-shot or 5-way-1-shot settings, please use test_extractor_pa.py and test_extractor_tsa.py, setting --test.type to '5shot' or '1shot'.
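
For example, to evaluate URL with PA in the varying-way five-shot setting (assuming test_extractor_pa.py accepts the same --model.name and --model.dir arguments as test_extractor.py above), one would run something like:

    python test_extractor_pa.py --test.type 5shot --model.name=url --model.dir <directory of url>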

Acknowledgements

We thank the authors of Meta-Dataset, SUR, and Residual Adapters for their source code.

Contact

For any questions, please contact Wei-Hong Li.

Citation

If you use this code, please cite our papers:

@article{li2023Universal,
    author    = {Li, Wei-Hong and Liu, Xialei and Bilen, Hakan},
    title     = {Universal Representations: A Unified Look at Multiple Task and Domain Learning},
    journal   = {International Journal of Computer Vision},
    pages     = {1--25},
    year      = {2023},
    publisher = {Springer}
}

@inproceedings{li2022TaskSpecificAdapter,
    author    = {Li, Wei-Hong and Liu, Xialei and Bilen, Hakan},
    title     = {Cross-domain Few-shot Learning with Task-specific Adapters},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022}
}

@inproceedings{li2021Universal,
    author    = {Li, Wei-Hong and Liu, Xialei and Bilen, Hakan},
    title     = {Universal Representation Learning From Multiple Domains for Few-Shot Classification},
    booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {9526-9535}
}

@inproceedings{li2020knowledge,
    author    = {Li, Wei-Hong and Bilen, Hakan},
    title     = {Knowledge distillation for multi-task learning},
    booktitle = {European Conference on Computer Vision (ECCV) Workshop},
    year      = {2020}
}