Learning to Weight Samples for Dynamic Early-exiting Networks (ECCV 2022)

Yizeng Han*, Yifan Pu*, Zihang Lai, Chaofei Wang, Shiji Song, Junfeng Cao, Wenhui Huang, Chao Deng, Gao Huang.

*: Equal contribution.

Introduction

This repository contains the implementation of the paper Learning to Weight Samples for Dynamic Early-exiting Networks (ECCV 2022). The proposed method adopts a weight prediction network to weight the training loss of each sample for dynamic early-exiting networks such as MSDNet and RANet, improving their performance in the dynamic early-exiting scenario.
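For intuition, the weight prediction network (the "meta net" configured by `--meta_net_hidden_size`, `--meta_net_num_layers`, and `--meta_net_input_type loss` in the scripts below) can be thought of as a small MLP that maps each sample's per-exit losses to per-exit loss weights. The following is a minimal sketch under that assumption, not the exact module of this repository; the name `MetaWeightNet` is ours.

```python
import torch
import torch.nn as nn

class MetaWeightNet(nn.Module):
    """Hypothetical sketch: map per-exit sample losses to per-exit loss weights."""

    def __init__(self, num_exits: int, hidden_size: int = 500, num_layers: int = 1):
        super().__init__()
        layers, in_dim = [], num_exits
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_size), nn.ReLU(inplace=True)]
            in_dim = hidden_size
        layers += [nn.Linear(in_dim, num_exits), nn.Sigmoid()]  # weights in (0, 1)
        self.net = nn.Sequential(*layers)

    def forward(self, per_exit_losses: torch.Tensor) -> torch.Tensor:
        # per_exit_losses: (batch, num_exits), detached per-sample losses
        return self.net(per_exit_losses)

# Usage sketch: reweight the multi-exit training objective sample by sample.
# losses: (batch, num_exits) per-sample cross-entropy at each exit; the meta
# net sees detached losses, so this loss trains only the main network.
# meta_net = MetaWeightNet(num_exits=5)
# weighted_loss = (meta_net(losses.detach()) * losses).mean()
```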

Overall idea

<img src="./figs/fig1.jpg" alt="fig1" style="zoom:60%;" />

Training pipeline

![fig2](./figs/fig2.jpg)

Gradient flow of the meta-learning algorithm

![fig3](./figs/fig3.jpg)
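
In code, this gradient flow corresponds to a bi-level (meta-learning) update: take a virtual SGD step on the weighted training loss, then backpropagate a meta loss, computed on a held-out batch with the virtually updated parameters, into the meta net. Below is a minimal single-exit sketch assuming PyTorch >= 2.0 (`torch.func.functional_call`); the repository's actual loop additionally handles multiple exits and its own update schedule (cf. `--meta_interval`).

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch >= 2.0

def meta_update(model, meta_net, meta_optimizer,
                x_train, y_train, x_meta, y_meta, inner_lr=0.1):
    """One bi-level update of the meta net (single exit, for brevity)."""
    # 1) Per-sample training losses, weighted by the meta net.
    per_sample = F.cross_entropy(model(x_train), y_train, reduction='none')
    weights = meta_net(per_sample.detach().unsqueeze(1)).squeeze(1)
    weighted_loss = (weights * per_sample).mean()

    # 2) Virtual SGD step on the main model. create_graph=True keeps the
    #    updated parameters differentiable w.r.t. the meta net.
    names, params = zip(*model.named_parameters())
    grads = torch.autograd.grad(weighted_loss, params, create_graph=True)
    virtual = {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}

    # 3) Meta loss on a held-out batch under the virtual parameters;
    #    its gradient flows back through step 2 into the meta net.
    meta_loss = F.cross_entropy(functional_call(model, virtual, (x_meta,)), y_meta)
    meta_optimizer.zero_grad()
    meta_loss.backward()
    meta_optimizer.step()
    model.zero_grad(set_to_none=True)  # discard stale grads on the main model
    return meta_loss.item()
```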

Usage

Dependencies

Scripts

Training on ImageNet (MSDNet with five exits, `step=4`, 8 GPUs with DistributedDataParallel):

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python tools/main_imagenet_DDP.py \
--train_url YOUR_SAVE_PATH \
--data_url YOUR_DATA_PATH --data ImageNet --workers 64 --seed 0 \
--arch msdnet --nBlocks 5 --stepmode even --step 4 --base 4 --nChannels 32 --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
--meta_net_hidden_size 500 --meta_net_num_layers 1 --meta_interval 100 --meta_lr 1e-4 --meta_weight_decay 1e-4 \
--epsilon 0.3 --target_p_index 15 --meta_net_input_type loss --constraint_dimension mat \
--epochs 100 --batch-size 4096 --lr 0.8 --lr-type cosine --print-freq 10
```

The same training job launched on an hfai cluster:

```bash
hfai python tools/main_imagenet_DDP_HF.py \
--train_url YOUR_SAVE_PATH \
--data_url YOUR_DATA_PATH --data ImageNet --workers 64 --seed 0 \
--arch msdnet --nBlocks 5 --stepmode even --step 4 --base 4 --nChannels 32 --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
--meta_net_hidden_size 500 --meta_net_num_layers 1 --meta_interval 100 --meta_lr 1e-4 --meta_weight_decay 1e-4 \
--epsilon 0.3 --target_p_index 15 --meta_net_input_type loss --constraint_dimension mat \
--epochs 100 --batch-size 4096 --lr 0.8 --lr-type cosine --print-freq 10 \
-- --nodes=1 --name=YOUR_EXPERIMENT_NAME
```

Anytime-prediction evaluation on ImageNet (reports the accuracy of every exit):

```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval_imagenet.py \
--data ImageNet --batch-size 512 --workers 8 --seed 0 --print-freq 10 --evalmode anytime \
--arch msdnet --nBlocks 5 --stepmode even --step 4 --base 4 --nChannels 32 --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
--data_url YOUR_DATA_PATH \
--train_url YOUR_SAVE_PATH \
--evaluate_from YOUR_CKPT_PATH
```

Budgeted (dynamic early-exiting) evaluation on ImageNet:

```bash
CUDA_VISIBLE_DEVICES=0 python tools/eval_imagenet.py \
--data ImageNet --batch-size 512 --workers 2 --seed 0 --print-freq 10 --evalmode dynamic \
--arch msdnet --nBlocks 5 --stepmode even --step 4 --base 4 --nChannels 32 --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
--data_url YOUR_DATA_PATH \
--train_url YOUR_SAVE_PATH \
--evaluate_from YOUR_CKPT_PATH
```
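
In `dynamic` mode, evaluation simulates budgeted inference: a sample leaves the network at the first exit whose prediction confidence clears a threshold, with thresholds typically calibrated on held-out data to meet a target computational budget. A minimal sketch of this decision rule, assuming the model returns one logits tensor per exit (the function and names below are illustrative, not this repository's API):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def dynamic_predict(exit_logits, thresholds):
    """Confidence-based early exiting over a batch (illustrative sketch).

    exit_logits: list of (batch, num_classes) tensors, one per exit.
    thresholds:  per-exit confidence thresholds; the last exit always fires.
    Returns predicted labels and the index of the exit used per sample.
    """
    batch, last = exit_logits[0].size(0), len(exit_logits) - 1
    preds = torch.zeros(batch, dtype=torch.long)
    exits = torch.full((batch,), last, dtype=torch.long)
    done = torch.zeros(batch, dtype=torch.bool)
    for k, logits in enumerate(exit_logits):
        conf, label = F.softmax(logits, dim=1).max(dim=1)
        # Exit here if not already exited and confident enough (or at the last exit).
        take = ~done if k == last else ~done & (conf >= thresholds[k])
        preds[take] = label[take]
        exits[take] = k
        done |= take
    return preds, exits
```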

Training on CIFAR-100:

```bash
CUDA_VISIBLE_DEVICES=0 python tools/main_cifar_DDP.py \
--train_url YOUR_SAVE_PATH \
--data_url YOUR_DATA_PATH --data cifar100 --workers 1 --seed 1 \
--arch msdnet --nBlocks 5 --stepmode lin_grow --step 1 --base 1 --nChannels 16 \
--meta_net_hidden_size 500 --meta_net_num_layers 1 --meta_interval 1 --meta_lr 1e-4 --meta_weight_decay 1e-4 \
--epsilon 0.8 --target_p_index 15 --meta_net_input_type loss --constraint_dimension col \
--epochs 300 --batch-size 1024 --lr 0.8 --lr-type cosine --print-freq 10
```

Results

![result_cifar](./figs/result_cifar.jpg)

![result_IN](./figs/result_IN.jpg)

Pre-trained Models on ImageNet

| model config | epochs | label smooth | acc_exit1 | acc_exit2 | acc_exit3 | acc_exit4 | acc_exit5 | checkpoint link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| step=4 | 100 | N/A | 59.54 | 67.22 | 71.03 | 72.33 | 73.93 | Tsinghua Cloud / Google Drive |
| step=6 | 100 | N/A | 60.05 | 69.13 | 73.33 | 75.19 | 76.30 | Tsinghua Cloud / Google Drive |
| step=7 | 100 | N/A | 59.24 | 69.65 | 73.94 | 75.66 | 76.72 | Tsinghua Cloud / Google Drive |
| step=4 | 300 | 0.1 | 61.64 | 67.89 | 71.61 | 73.82 | 75.03 | Tsinghua Cloud / Google Drive |
| step=6 | 300 | 0.1 | 61.41 | 70.70 | 74.38 | 75.80 | 76.66 | Tsinghua Cloud / Google Drive |
| step=7 | 300 | 0.1 | 60.94 | 71.88 | 75.13 | 76.03 | 76.82 | Tsinghua Cloud / Google Drive |

Contact

If you have any questions, please feel free to contact the authors.

Yizeng Han: hanyz18@mails.tsinghua.edu.cn, yizeng38@gmail.com.

Yifan Pu: pyf20@mails.tsinghua.edu.cn, yifanpu98@126.com.

Acknowledgements

We use the PyTorch implementations MSDNet-PyTorch, RANet-PyTorch, and IMTA in our experiments.