Early-Bird-Tickets

ICLR2020: spotlight License: MIT

This is a PyTorch implementation of Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks

ICLR 2020 spotlight oral paper

Table of Content

<!-- [TOC] --> <div class="toc"> <ul> <li><a href="#early-bird-tickets">Early-Bird-Tickets</a><ul> <li><a href="#table-of-content">Table of Content</a></li> <li><a href="#introduction">Introduction</a></li> <li><a href="#early-bird-tickets_1">Early-Bird Tickets</a><ul> <li><a href="#existence-of-early-bird-tickets">Existence of Early-Bird Tickets</a></li> <li><a href="#identify-early-bird-tickets">Identify Early-Bird Tickets</a></li> <li><a href="#efficient-training-via-early-bird-tickets">Efficient Training via Early-Bird Tickets</a></li> </ul> </li> <li><a href="#basic-usage">Basic Usage</a><ul> <li><a href="#prerequisites">Prerequisites</a></li> <li><a href="#core-training-options">Core Training Options</a></li> <li><a href="#standard-train-for-identifying-early-bird-tickets">Standard Train for Identifying Early-Bird Tickets</a></li> <li><a href="#retrain-to-restore-accuracy">Retrain to Restore Accuracy</a></li> <li><a href="#low-precision-search-and-retrain">Low Precision Search and Retrain</a></li> </ul> </li> <li><a href="#imagenet-experiments">ImageNet Experiments</a><ul> <li><a href="#resnet18-on-imagenet">ResNet18 on ImageNet</a></li> <li><a href="#resnet50-on-imagenet">ResNet50 on ImageNet</a></li> </ul> </li> <li><a href="#citation">Citation</a></li> <li><a href="#acknowledgement">Acknowledgement</a></li> </ul> </li> </ul> </div>

Introduction

Experiments based on various deep networks and datasets validate: 1) the existence of EB tickets, and the effectiveness of mask distance in efficiently identifying them; and 2) that the proposed efficient training via EB tickets can achieve up to 4.7x energy savings while maintaining comparable or even better accuracy, demonstrating a promising and easily adopted method for tackling cost-prohibitive deep network training.

Early-Bird Tickets

Existence of Early-Bird Tickets

To articulate the Early-Bird (EB) tickets phenomenon, namely that winning tickets can be drawn very early in training, we perform ablation simulations using two representative deep models (VGG16 and PreResNet101) on two popular datasets (CIFAR10 and CIFAR100). Specifically, we follow the main idea of (Frankle & Carbin, 2019) but instead prune networks trained at earlier points to see if reliable tickets can be drawn. We adopt the same channel pruning as in (Liu et al., 2017) as the pruning technique for all experiments, since it aligns with our end goal of efficient training. The figure below demonstrates the existence of EB tickets (p = 30% means 30% of the weights are pruned; a hollow star marks the retraining accuracy of the subnetwork drawn from the checkpoint with the best accuracy in the search stage).

Identify Early-Bird Tickets

We visualize how the distance among the tickets drawn at each epoch evolves. The figure below plots the pairwise mask distance matrices (160 x 160) of the VGG16 and PreResNet101 experiments on CIFAR100 at different pruning ratios p, where the (i, j)-th element of a matrix denotes the mask distance between epochs i and j in the corresponding experiment. A lower distance (close to 0) means the two masks differ less and is colored warmer.

<!-- ![](./assets/overlap.png) --> <div align=center> <img src="./assets/overlap.png" width = "800" alt="overlap" /> </div>
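As a rough illustration of how such a pairwise matrix could be assembled, the sketch below computes distances among per-epoch binary masks. It assumes the mask distance is the normalized Hamming distance (the fraction of positions where two masks disagree). The names `hamming_distance` and `pairwise_mask_distances` are hypothetical and are not taken from this repository's scripts.

```python
def hamming_distance(mask_a, mask_b):
    """Fraction of positions where two equal-length binary masks differ."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    return sum(a != b for a, b in zip(mask_a, mask_b)) / len(mask_a)


def pairwise_mask_distances(masks):
    """Symmetric matrix whose (i, j) entry is the distance between masks i and j."""
    n = len(masks)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is symmetric, so fill both triangles at once.
            matrix[i][j] = matrix[j][i] = hamming_distance(masks[i], masks[j])
    return matrix


# Toy example: three "epochs" with four channels each (1 = kept, 0 = pruned).
masks = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
matrix = pairwise_mask_distances(masks)
```

In this toy input the last two masks are identical, so their entry is 0, while the first mask differs from each of them in two of four positions.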

Our observation that the ticket masks quickly become stable and hardly change in the early training stages supports drawing EB tickets. We therefore measure the mask distance between consecutive epochs and draw EB tickets when this distance falls below a threshold. In practice, to improve the reliability of EB tickets, we stop the search and draw the EB ticket once the last five recorded mask distances are all smaller than the given threshold.
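The stopping rule above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: it assumes each epoch yields one importance score per channel (for example, a batch-normalization scaling factor), that the mask keeps the channels with the largest magnitudes, and that the mask distance is the normalized Hamming distance. The class name `EarlyBirdDetector`, the helper names, and the threshold value in the toy run are all hypothetical.

```python
from collections import deque


def channel_mask(scales, prune_ratio):
    """Keep channels with the largest |scale|; prune the smallest prune_ratio fraction."""
    num_pruned = int(len(scales) * prune_ratio)
    order = sorted(range(len(scales)), key=lambda i: abs(scales[i]))
    pruned = set(order[:num_pruned])
    return [i not in pruned for i in range(len(scales))]


def mask_distance(mask_a, mask_b):
    """Normalized Hamming distance between two equal-length masks."""
    return sum(a != b for a, b in zip(mask_a, mask_b)) / len(mask_a)


class EarlyBirdDetector:
    """Signals an EB ticket once the last `window` distances are all below `threshold`."""

    def __init__(self, prune_ratio, threshold, window=5):
        self.prune_ratio = prune_ratio
        self.threshold = threshold
        self.distances = deque(maxlen=window)  # FIFO of the most recent distances
        self.prev_mask = None

    def update(self, scales):
        """Call once per epoch; returns True when the ticket can be drawn."""
        mask = channel_mask(scales, self.prune_ratio)
        if self.prev_mask is not None:
            self.distances.append(mask_distance(mask, self.prev_mask))
        self.prev_mask = mask
        window_full = len(self.distances) == self.distances.maxlen
        return window_full and all(d < self.threshold for d in self.distances)


# Toy run: the channel ranking changes once, then stays fixed.
detector = EarlyBirdDetector(prune_ratio=0.5, threshold=0.1)  # threshold is a placeholder
history = [[0.9, 0.1, 0.2, 0.8]] + [[0.1, 0.9, 0.2, 0.8]] * 6
drawn_at = next(
    (epoch for epoch, scales in enumerate(history) if detector.update(scales)),
    None,
)
```

Using a bounded `deque` means the one large distance recorded right after the ranking change has to fall out of the window before the detector fires, which is the reliability effect the five-distance rule is meant to provide.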

Efficient Training via Early-Bird Tickets

The conventional routine has three steps: 1) training a dense model, 2) pruning it, and 3) retraining the pruned model to restore performance (these three steps can be iterated). Instead, we leverage the existence of EB tickets to develop the EB Train scheme, which replaces the aforementioned steps 1 and 2 with a lower-cost step of detecting the EB tickets.

<!-- ![](./assets/eb-train.png) --> <div align=center> <img src="./assets/eb-train.png" width = "700" alt="eb-train" /> </div> <br>
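The control flow of the scheme described above can be outlined as follows. This is only a schematic sketch: the function `eb_train` and its callable parameters (`train_one_epoch`, `ticket_is_stable`, `prune`, `retrain`) are hypothetical placeholders for whatever training, detection, pruning, and retraining routines are actually used, and the epoch budget in the example is arbitrary.

```python
def eb_train(train_one_epoch, ticket_is_stable, prune, retrain, max_search_epochs):
    """Search for an EB ticket during early dense training, then prune once and retrain."""
    epochs_used = 0
    for epoch in range(max_search_epochs):
        train_one_epoch(epoch)        # ordinary dense training for one epoch
        epochs_used = epoch + 1
        if ticket_is_stable(epoch):   # low-cost EB ticket detection
            break                     # stop dense training early
    ticket = prune()                  # draw the ticket from the current weights
    return retrain(ticket), epochs_used


# Stub callables that only record what would happen.
log = []
result, searched = eb_train(
    train_one_epoch=lambda e: log.append(f"train {e}"),
    ticket_is_stable=lambda e: e >= 2,   # pretend the mask is stable at epoch 2
    prune=lambda: "ticket",
    retrain=lambda t: f"retrained {t}",
    max_search_epochs=10,
)
```

The point of the sketch is that dense training ends as soon as the detection step fires, so the full dense-training and pruning steps of the conventional routine are never run to completion.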

Basic Usage

Prerequisites

The code has the following dependencies:

Core Training Options

Standard Train for Identifying Early-Bird Tickets

Example: Reproduce early-bird (EB) tickets on CIFAR-100

bash ./scripts/standard-train/search.sh
bash ./scripts/standard-train/prune.sh
bash ./scripts/standard-train/mask_distance.sh

After the mask distance matrix is calculated (it is automatically saved as overlap-0.5.npy), you can call plot_overlap.py to draw the figures.

Retrain to Restore Accuracy

Example: Retrain drawn EB tickets (e.g., VGG16 for CIFAR-100) to restore accuracy

bash ./scripts/standard-train/retrain_continue.sh
bash ./scripts/standard-train/retrain_scratch.sh

Low Precision Search and Retrain

We apply the low-precision method SWALP to both the search and retrain stages (see EB Train LL in Sec. 4.3 of the paper). The commands below use VGG16 on CIFAR-10 as an example:

bash ./scripts/low-precision/search.sh
bash ./scripts/low-precision/prune.sh
bash ./scripts/low-precision/retrain_continue.sh
<div align=left> <img src="./assets/vgg-result.png" width = "800" alt="eb-train" /> </div>

ImageNet Experiments

All pretrained checkpoints for the different pruning ratios have been collected in Google Drive. To evaluate inference accuracy on the test set, we provide evaluation scripts (EVAL_ResNet18_ImageNet.py and EVAL_ResNet50_ImageNet.py); the corresponding commands are shown below for your convenience.

bash ./scripts/resnet18-imagenet/evaluation.sh
bash ./scripts/resnet50-imagenet/evaluation.sh

ResNet18 on ImageNet

bash ./scripts/resnet18-imagenet/search.sh
bash ./scripts/resnet18-imagenet/prune.sh
bash ./scripts/resnet18-imagenet/retrain_continue.sh
<div align=left> <img src="./assets/resnet18-result.png" width = "700" alt="eb-train" /> </div>

ResNet50 on ImageNet

bash ./scripts/resnet50-imagenet/search.sh
bash ./scripts/resnet50-imagenet/prune.sh
bash ./scripts/resnet50-imagenet/retrain_continue.sh

Citation

If you find this code useful for your research, please cite:

@inproceedings{
you2020drawing,
title={Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks},
author={Haoran You and Chaojian Li and Pengfei Xu and Yonggan Fu and Yue Wang and Xiaohan Chen and Yingyan Lin and Zhangyang Wang and Richard G. Baraniuk},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=BJxsrgStvr}
}

Acknowledgement