This repository contains the code for the paper "Overcoming Multi-Model Forgetting in One-Shot NAS With Diversity Maximization", accepted at IEEE CVPR 2020.

Requirements

Python == 3.6.2, PyTorch == 1.0.0, cuda-9.0, cudnn7.1-9.0

Please download the CIFAR-100 dataset from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz and save it in the 'data' folder.
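
The download step can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `extract_archive` and `download_cifar100` are hypothetical, and the default `data_dir="data"` assumes the folder is relative to the current working directory. Whether the scripts expect the folder at the repository root or inside CNN is not stated here, so adjust `data_dir` accordingly.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the instructions above.
CIFAR100_URL = "https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz"


def extract_archive(archive_path, data_dir="data"):
    """Unpack a .tar.gz archive into data_dir and return that directory."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        tar.extractall(str(data_dir))
    return data_dir


def download_cifar100(data_dir="data"):
    """Download the CIFAR-100 archive (if missing) and unpack it into data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    archive_path = data_dir / "cifar-100-python.tar.gz"
    if not archive_path.exists():
        # Network access is required for this call.
        urllib.request.urlretrieve(CIFAR100_URL, str(archive_path))
    return extract_archive(archive_path, data_dir)
```

Calling `download_cifar100()` fetches the archive once and unpacks it; later calls reuse the archive already on disk.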

Pretrained models

Test on CIFAR10 with the best reported architecture with NSAS

cd CNN && python test.py --auxiliary --model_path ./trained_models/Random_NSAS_CIFAR10_best.pt --arch Random_NSAS

Expected result: 2.50% test error rate with 3.08M model params.

Test on CIFAR100 with the best reported architecture with NSAS

cd CNN && python test_100.py --auxiliary --model_path ./trained_models/Random_NSAS_CIFAR100.pt --arch Random_NSAS

Expected result: 17.56% top1 test error with 3.13M model params.

Test on CIFAR100 with the best reported architecture with NSAS-C

cd CNN && python test_100.py --auxiliary --model_path ./trained_models/Random_C_CIFAR100/weights.pt --arch Random_NSAS_C

Expected result: 16.69% top1 test error with 3.59M model params.

Test on ImageNet with the best reported architecture with NSAS-C

cd CNN && python test_imagenet.py --auxiliary --model_path ./trained_models/Random_NSAS_C_imagenet/model_best.pth.tar --arch Random_NSAS_C

Expected result: 25.5% top1 test error with 5.4M model params.

Test on ImageNet with NSAS-C under the P-DARTS and PC-DARTS experimental settings

cd CNN && python test_imagenet.py --auxiliary --model_path "./trained_models/Random-NSAS-C with PDARTS setting/model_best.pth.tar" --arch Random_NSAS_C

Expected result: 24.68% top1 test error with 5.4M model params. Note that for the ImageNet architecture evaluation we follow P-DARTS and PC-DARTS and use the linear learning-rate scheduler with warm-up. More details can be found in ./trained_models/Random-NSAS-C with PDARTS setting/TRAIN_IMAGENET_PDARTS.ipynb.
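
To illustrate what a linear schedule with warm-up looks like, here is a minimal sketch. It is not the code from the notebook: the function name `warmup_linear_lr` and the default values (`total_epochs=250`, `base_lr=0.5`, `warmup_epochs=5`) are assumptions chosen for illustration, and the notebook remains the reference for the settings actually used.

```python
def warmup_linear_lr(epoch, total_epochs=250, base_lr=0.5, warmup_epochs=5):
    """Learning rate for a 0-indexed epoch.

    The rate rises linearly to base_lr over the first warmup_epochs epochs,
    then decays linearly toward zero over the remaining epochs.
    All default values are illustrative assumptions.
    """
    if epoch < warmup_epochs:
        # Warm-up phase: base_lr/warmup_epochs, 2*base_lr/warmup_epochs, ...
        return base_lr * (epoch + 1) / warmup_epochs
    # Decay phase: equals base_lr at epoch == warmup_epochs, stays positive
    # through the last epoch (total_epochs - 1).
    return base_lr * (total_epochs - epoch) / (total_epochs - warmup_epochs)
```

With these defaults the rate peaks at `base_lr` at the end of warm-up and decreases every epoch afterwards. In a PyTorch training loop the returned value would typically be written into each `param_group["lr"]` of the optimizer at the start of the epoch.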

Test on PTB

cd RNN && python test.py --model_path ./trained_models/Random_NSAS_PTB.pt

Expected result: 56.84 test perplexity with 23M model params.
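
For readers comparing this figure against a logged loss value: perplexity is the exponential of the mean per-token negative log-likelihood (in nats). The sketch below shows the conversion. The function names are hypothetical, and it is an assumption that the test script reports its loss in this form.

```python
import math


def perplexity(mean_nll):
    """Perplexity from the mean per-token negative log-likelihood in nats."""
    return math.exp(mean_nll)


def nll_from_perplexity(ppl):
    """Inverse conversion: mean per-token NLL in nats for a given perplexity."""
    return math.log(ppl)
```

Under that assumption, a perplexity of 56.84 corresponds to a mean loss of about 4.04 nats per token.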

Architecture search (using small proxy models)

To conduct architecture search, run one of the following:

cd CNN && python GDAS_NSAS_demo.py
cd CNN && python GDAS_NSAS_C_demo.py
cd CNN && python RandomNAS_NSAS_demo.py
cd CNN && python RandomNAS_NSAS_C_demo.py
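
The four commands above are alternatives, each started from the repository root. A small launcher can make the choice explicit. This is a sketch only: the labels `"gdas"`, `"random"`, `"nsas"` and `"nsas_c"` and the function names are hypothetical, while the script names are the ones listed above.

```python
import subprocess
import sys

# (sampler, variant) -> search script inside the CNN folder.
SEARCH_SCRIPTS = {
    ("gdas", "nsas"): "GDAS_NSAS_demo.py",
    ("gdas", "nsas_c"): "GDAS_NSAS_C_demo.py",
    ("random", "nsas"): "RandomNAS_NSAS_demo.py",
    ("random", "nsas_c"): "RandomNAS_NSAS_C_demo.py",
}


def build_search_command(sampler, variant):
    """Return the command list for one of the four search scripts."""
    key = (sampler.lower(), variant.lower())
    if key not in SEARCH_SCRIPTS:
        raise ValueError("unknown search configuration: %r" % (key,))
    return [sys.executable, SEARCH_SCRIPTS[key]]


def run_search(sampler, variant, cnn_dir="CNN"):
    """Run the chosen script with cnn_dir as working directory (like `cd CNN`)."""
    return subprocess.run(build_search_command(sampler, variant),
                          cwd=cnn_dir, check=True)
```

For example, `run_search("random", "nsas_c")` is intended to be equivalent to the last command in the list.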

Architecture evaluation (using full-sized models)

To evaluate our best cells by training from scratch, run

cd CNN && python train.py --auxiliary --cutout        # CIFAR-10
cd CNN && python train_100.py --auxiliary --cutout    # CIFAR-100
cd CNN && python train_imagenet.py --auxiliary        # ImageNet
cd RNN && python train.py                             # PTB

Architecture Visualization

The graphviz package is required to visualize the learned cells. To visualize the best reported architectures in this paper, run

cd CNN && python visualize.py Random_NSAS 
cd CNN && python visualize.py Random_NSAS_C
cd RNN && python visualize.py Random_NSAS
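
Before running the commands above, it can help to confirm that Graphviz is available. The sketch below assumes that the visualization scripts need both the `graphviz` Python package and the Graphviz `dot` executable on the PATH, which is common for this kind of tool but is not stated here. The function name `check_graphviz` is hypothetical.

```python
import importlib.util
import shutil


def check_graphviz():
    """Report whether the graphviz Python package and the dot binary are found."""
    return {
        # True if `import graphviz` would find an installed package.
        "python_package": importlib.util.find_spec("graphviz") is not None,
        # True if the Graphviz `dot` executable is on the PATH.
        "dot_executable": shutil.which("dot") is not None,
    }
```

If either value is `False`, install the missing piece before running `visualize.py`.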

Code and experimental results on NAS-Bench-201

The code and results can be found in the NAS-Bench-201 folder.

Reference

If you use our code in your research, please cite our paper accordingly.

@inproceedings{zhang2020overcoming,
  title={Overcoming Multi-Model Forgetting in One-Shot NAS with Diversity Maximization},
  author={Zhang, Miao and Li, Huiqi and Pan, Shirui and Chang, Xiaojun and Su, Steven},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7809--7818},
  year={2020}
}