DNA

This repository provides the code for our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation.

<img src="https://user-images.githubusercontent.com/61453811/99777992-0d914600-2b4e-11eb-8022-d83a438de6d0.png" width="90%"/>

Illustration of DNA. Each cell of the supernet is trained independently to mimic the behavior of the corresponding teacher block.
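
The per-block objective can be pictured with a small PyTorch-style sketch. This is only an illustration of the idea, not the repository's training code; the names `student_block`, `teacher_prev_feat`, and `teacher_out_feat` are assumptions. The student block takes the teacher's previous-block feature map as input and is optimized to reproduce the teacher's corresponding output feature map.

```python
import torch
import torch.nn.functional as F

def train_one_block(student_block, teacher_prev_feat, teacher_out_feat, optimizer):
    """Train one supernet block to mimic the corresponding teacher block.

    teacher_prev_feat: output of the teacher's previous block, used as the
    student block's input. teacher_out_feat: output of the teacher's
    corresponding block, used as the distillation target. Both are assumed
    to be pre-computed (no_grad) feature maps for the current mini-batch.
    """
    # Assumption: the supernet block samples a random candidate path
    # internally on each forward pass.
    student_out = student_block(teacher_prev_feat)

    # Block-wise knowledge distillation: match the teacher's feature map.
    loss = F.mse_loss(student_out, teacher_out_feat)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```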

<img src="https://user-images.githubusercontent.com/61453811/99778189-4f21f100-2b4e-11eb-8424-df182fb58962.png" width="90%"/>

Comparison of model ranking for DNA vs. DARTS, SPOS, and MnasNet under two different hyper-parameter settings.

Our Trained Models

<img src="https://user-images.githubusercontent.com/61453811/99778983-5eee0500-2b4f-11eb-8c9f-882eb6c70eb1.png" width="90%"/>

Usage

1. Requirements

2. Searching

The code for supernet training, evaluation, and searching is under the `searching` directory.

i) Train and evaluate the block-wise supernet with knowledge distillation.
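
As a rough illustration of the evaluation side of this step, the sketch below rates every candidate sub-path of one block by its distillation loss on held-out batches and dumps the result, which is the kind of evaluated "architecture potential" file referred to in step ii). The function name, the `path=` argument, and the JSON format are assumptions, not the repository's actual interface.

```python
import json
import torch
import torch.nn.functional as F

@torch.no_grad()
def rate_candidates(student_block, candidates, val_batches, out_path):
    """Rate every candidate sub-path of one block by its distillation loss.

    candidates: iterable of path encodings accepted by student_block (assumed).
    val_batches: list of (teacher_prev_feat, teacher_out_feat) pairs.
    """
    potentials = {}
    for path in candidates:
        total, count = 0.0, 0
        for prev_feat, target_feat in val_batches:
            out = student_block(prev_feat, path=path)   # assumed interface
            total += F.mse_loss(out, target_feat).item()
            count += 1
        potentials[str(path)] = total / count           # lower is better
    with open(out_path, "w") as f:
        json.dump(potentials, f, indent=2)
    return potentials
```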

ii) Search for the best architecture under a constraint.

Our traversal search can handle a search space with 6 ops in each layer, 6 layers in each stage, and 6 stages in total. A search like this should finish in about half an hour on a single CPU. To search over a larger space, you can manually divide the search space or use other search algorithms, such as evolutionary algorithms, to process our evaluated architecture potential files.
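
For intuition, combining per-stage candidates under a cost budget could look like the sketch below. It assumes each stage's candidates come with a pre-computed potential (loss) and cost taken from the evaluated potential files; the repository's actual search code is more elaborate (pruning heuristics, multiple cells per block, etc.), so treat this only as an outline of the idea.

```python
def traversal_search(stage_candidates, max_cost):
    """Combine per-stage candidates to minimize total loss under a cost budget.

    stage_candidates: one list per stage of (arch, loss, cost) tuples,
    assumed to come from the evaluated architecture-potential files.
    max_cost: the resource constraint (e.g. FLOPs or parameter count).
    """
    best = (float("inf"), None)  # (total loss, chosen archs)

    def dfs(stage, total_loss, total_cost, chosen):
        nonlocal best
        if total_cost > max_cost or total_loss >= best[0]:
            return  # prune: over budget, or already worse than the best found
        if stage == len(stage_candidates):
            best = (total_loss, list(chosen))
            return
        for arch, loss, cost in stage_candidates[stage]:
            chosen.append(arch)
            dfs(stage + 1, total_loss + loss, total_cost + cost, chosen)
            chosen.pop()

    dfs(0, 0.0, 0.0, [])
    return best
```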

iii) Search with multiple cells in each block.

Please refer to the clarification from @MohanadOdema in this issue.

3. Retraining

The retraining code is simplified from the pytorch-image-models repository and is under the `retraining` directory.