PyTorch-BigNAS

<p> <a href="https://img.shields.io/badge/Python-%3E%3D3.7-blue"><img src="https://img.shields.io/badge/Python-%3E%3D3.7-blue"></a> <a href="https://img.shields.io/badge/PyTorch-1.9-informational"><img src="https://img.shields.io/badge/PyTorch-1.9-informational"></a> <a href="https://img.shields.io/badge/License-MIT-brightgreen"><img src="https://img.shields.io/badge/License-MIT-brightgreen"></a> </p>

Unofficial PyTorch implementation of BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models.

Yu J, Jin P, Liu H, et al. Bignas: Scaling up neural architecture search with big single-stage models[C]//European Conference on Computer Vision. Springer, Cham, 2020: 702-717.

Paper link: Arxiv

Requirements

Other requirements are listed in the requirements.txt.

Data Preparation

We highly recommend saving or linking datasets into the $pytorch-BigNAS/data folder so that no additional configuration is required. Alternatively, you can set the dataset path manually by modifying the cfg.LOADER.DATAPATH attribute in the configuration file $pytorch-BigNAS/core/config.py.

BigNAS is trained on ImageNet only, so you can simply link ImageNet to the data/imagenet folder like this:

ln -s /PATH/TO/IMAGENET $pytorch-BigNAS/data/imagenet
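The same link can also be created from Python, which may be convenient in setup scripts. The following is a minimal sketch, not part of this repository: the helper name `link_imagenet` and its arguments are hypothetical, and it only assumes the data/imagenet location described above.

```python
import os
from pathlib import Path


def link_imagenet(imagenet_dir, repo_root):
    """Create <repo_root>/data/imagenet as a symlink to imagenet_dir.

    Hypothetical helper: mirrors `ln -s /PATH/TO/IMAGENET .../data/imagenet`.
    """
    src = Path(imagenet_dir).expanduser().resolve()
    if not src.is_dir():
        raise FileNotFoundError(f"ImageNet directory not found: {src}")

    link = Path(repo_root) / "data" / "imagenet"
    link.parent.mkdir(parents=True, exist_ok=True)

    if link.is_symlink():
        # Replace a stale link, but never delete a real directory.
        link.unlink()
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")

    os.symlink(src, link, target_is_directory=True)
    return link


# Usage (paths are placeholders):
# link_imagenet("/PATH/TO/IMAGENET", "/PATH/TO/pytorch-BigNAS")
```

The helper refuses to overwrite an existing real directory, so a dataset that was copied into data/imagenet is never removed by accident.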

Run example

If you run out of memory (OOM), reduce the batch size and adjust the other parameters accordingly.

Settings in the configuration file can be overridden by adding or modifying parameters on the command line. For example, to run eval.py with a different output directory:

python eval.py --cfg configs/eval.yaml OUT_DIR exp/220621/

Results

We trained the supernet on 8 NVIDIA Tesla V100 GPUs, which took about 16 days in total. The supernet training results are as follows:

| Arch | Accuracy (top-1) | Accuracy (top-5) |
| --- | --- | --- |
| MAX (final) | 79.60 | 93.89 |
| MAX (best) | 79.89 | 94.52 |
| MIN (final) | 74.32 | 91.46 |
| MIN (best) | 74.72 | 91.84 |

Note that these results are about 1% lower than those reported in the original paper, possibly because of differences in the specific settings. We follow most of the configurations from AttentiveNAS. However, because we use fewer GPUs, several hyperparameters differ, and the exact results may vary accordingly. The settings we modified for our run are as follows:

| Modified items | In paper | This code |
| --- | --- | --- |
| #GPUs | 64 | 8 |
| batch_size per GPU | 64 | 80 |
| learning_rate | 0.256 | 0.128 |
| optimizer | RMSProp | SGD |
| weight_decay | 1e-5 for the biggest arch only | 3e-6 for all |
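To put these differences in perspective, the short sketch below computes the global batch size of each setup and the learning rate that the linear scaling heuristic would suggest. This is only a rough comparison and not the procedure used to choose the values above: the function names are hypothetical, the learning rates are assumed to apply to the global batch, and the heuristic is usually discussed for a fixed optimizer, whereas the two setups differ in optimizer as well.

```python
def global_batch_size(num_gpus, batch_per_gpu):
    """Total number of samples per optimization step across all GPUs."""
    return num_gpus * batch_per_gpu


def linearly_scaled_lr(ref_lr, ref_batch, new_batch):
    """Scale a reference learning rate in proportion to the batch size."""
    return ref_lr * new_batch / ref_batch


paper_batch = global_batch_size(64, 64)  # 4096 samples per step
code_batch = global_batch_size(8, 80)    # 640 samples per step

# Heuristic suggestion when moving from 4096 to 640 samples per step.
suggested_lr = linearly_scaled_lr(0.256, paper_batch, code_batch)

print(paper_batch, code_batch)       # 4096 640
print(round(suggested_lr, 4))        # 0.04
print(round(0.128 / suggested_lr, 1))  # 3.2
```

Under these assumptions, the learning rate of 0.128 used here is roughly three times the value the heuristic would give for a global batch of 640, which is one of several settings that could be revisited when tuning.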

The learning curves from TensorBoard are as follows:

[Figure: tensorboard_results]

Note:

We notice that the supernet's learning curves oscillate strongly under this setting, which may cause the performance drop and training instability. However, due to resource constraints (about 125 GPU-days per training run on Tesla V100), we are unable to tune the hyperparameters in detail at this time. If you obtain better results with this code, we would appreciate it if you submitted them to us via a pull request.

Contributing

We welcome contributions to the library along with any potential issues or suggestions.

Also, if you find this code useful, please consider leaving a star🌟.

Reference

This implementation is mainly adapted from the source code of XNAS, AttentiveNAS, and OFA.

About XNAS

XNAS is an effective, modular, and flexible Neural Architecture Search (NAS) repository that aims to provide a common framework and baselines for the NAS community. It was originally designed to decouple the search space, search algorithm, and performance evaluation strategy so that they can be freely combined.