Att-DARTS: Differentiable Neural Architecture Search for Attention

The PyTorch implementation of Att-DARTS: Differentiable Neural Architecture Search for Attention.

The code is based on https://github.com/dragen1860/DARTS-PyTorch.

Requirements

We recommend installing PyTorch from the official website.

If you use pipenv, simply run:

```
pipenv install
```

Or, using pip:

```
pip install -r requirements.txt
```

Datasets

Results

CIFAR

Test error (%):

| Model | CIFAR-10 | CIFAR-100 | Params (M) |
|-----------|-------------|--------------|------------|
| DARTS | 2.76 ± 0.09 | 16.69 ± 0.28 | 3.3 |
| Att-DARTS | 2.54 ± 0.10 | 16.54 ± 0.40 | 3.2 |

ImageNet

Test error (%):

| Model | Top-1 | Top-5 | Params (M) |
|-----------|-------|-------|------------|
| DARTS | 26.7 | 8.7 | 4.7 |
| Att-DARTS | 26.0 | 8.5 | 4.6 |

Usage

Architecture search (using small proxy models)

Our scripts occupy all available GPUs, so please set the environment variable CUDA_VISIBLE_DEVICES accordingly.
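
For example (the GPU id below is only an illustration; adjust it to your machine), you can limit the scripts to a single GPU like this:

```
# Use only GPU 0 for the following commands (example id; change as needed).
export CUDA_VISIBLE_DEVICES=0
```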

To carry out an architecture search using the 2nd-order approximation, run:

```
python train_search.py --unrolled
```

The discovered cell will be saved in genotype.json. Our resulting Att_DARTS cell is provided in genotypes.py.

Inserting an attention module at other locations is supported through the --location flag. The available locations are defined in AttLocation in model_search.py.
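
For instance, a search with a specific attention location might be launched as follows, where ${LOCATION} is a placeholder of ours for one of the names defined in AttLocation:

```
# Search with the attention module placed at a chosen location.
# Replace ${LOCATION} with one of the names defined in AttLocation (model_search.py).
python train_search.py --unrolled --location ${LOCATION}
```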

Architecture evaluation (using full-sized models)

To evaluate our best cells by training from scratch, run:

```
python train_CIFAR10.py --auxiliary --cutout --arch Att_DARTS   # CIFAR-10
python train_CIFAR100.py --auxiliary --cutout --arch Att_DARTS  # CIFAR-100
python train_ImageNet.py --auxiliary --arch Att_DARTS           # ImageNet
```

Customized architectures are supported through the --arch flag once they are specified in genotypes.py.
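
For example, assuming you add a genotype named MY_ARCH to genotypes.py (the name is hypothetical), it can be evaluated with:

```
# Evaluate a custom genotype (MY_ARCH is a hypothetical entry added to genotypes.py).
python train_CIFAR10.py --auxiliary --cutout --arch MY_ARCH
```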

You can also point to a search result saved as .json through the --arch_path flag:

```
python train_CIFAR10.py --auxiliary --cutout --arch_path ${PATH}   # CIFAR-10
python train_CIFAR100.py --auxiliary --cutout --arch_path ${PATH}  # CIFAR-100
python train_ImageNet.py --auxiliary --arch_path ${PATH}           # ImageNet
```

where ${PATH} should be replaced by the path to the .json.

The trained model is saved as trained.pt. After training, the test script runs automatically.

You can also test trained.pt at any time, as described below.
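
For example, a CIFAR-10 run from scratch followed by re-testing the saved checkpoint might look like this (we assume trained.pt ends up in the current working directory; adjust the path if your run writes it elsewhere):

```
# Train the Att_DARTS cell from scratch on CIFAR-10, then re-test the checkpoint.
python train_CIFAR10.py --auxiliary --cutout --arch Att_DARTS
python test_CIFAR10.py --auxiliary --model_path trained.pt --arch Att_DARTS
```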

Test (using full-sized pretrained models)

To test a pretrained model saved as .pt, run:

```
python test_CIFAR10.py --auxiliary --model_path ${PATH} --arch Att_DARTS   # CIFAR-10
python test_CIFAR100.py --auxiliary --model_path ${PATH} --arch Att_DARTS  # CIFAR-100
python test_imagenet.py --auxiliary --model_path ${PATH} --arch Att_DARTS  # ImageNet
```

where ${PATH} should be replaced by the path to .pt.

You can use our pretrained models (cifar10_att.pt, cifar100_att.pt, imagenet_att.pt) or the trained.pt saved during Architecture Evaluation.

We also support customized architectures specified in genotypes.py through the --arch flag, or architectures saved as .json through the --arch_path flag.
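
For example, to test the provided CIFAR-10 weights, or a checkpoint built from a cell saved during search (the genotype.json path below is only an illustration), you might run:

```
# Test the provided CIFAR-10 pretrained model.
python test_CIFAR10.py --auxiliary --model_path cifar10_att.pt --arch Att_DARTS
# Test a checkpoint whose cell was saved as .json during architecture search.
python test_CIFAR10.py --auxiliary --model_path trained.pt --arch_path genotype.json
```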

Visualization

You can visualize the cells specified in genotypes.py. For example, to visualize Att_DARTS, run:

```
python visualize.py Att_DARTS
```

You can also visualize a cell saved as .json:

```
python visualize.py genotype.json
```

Related Work

Attention modules

This repository includes the following attention modules:

Reference

```
@inproceedings{att-darts2020IJCNN,
  author    = {Nakai, Kohei and Matsubara, Takashi and Uehara, Kuniaki},
  booktitle = {The International Joint Conference on Neural Networks (IJCNN)},
  title     = {{Att-DARTS: Differentiable Neural Architecture Search for Attention}},
  year      = {2020}
}
```