MixUp

This is an implementation of, and an improvement on, mixup: Beyond Empirical Risk Minimization (https://arxiv.org/abs/1710.09412).
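For orientation, the mixing rule from the paper can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in this repository: the helper name `mixup_batch`, the use of a single mixing coefficient per batch, and pairing each sample with a shuffled copy of the same batch are all assumptions made for the example.

```python
import numpy as np

def mixup_batch(x, y, num_classes, alpha=0.2, rng=None):
    """Mix a batch with a shuffled copy of itself (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)              # mixing coefficient in [0, 1]
    perm = rng.permutation(x.shape[0])        # partner index for each sample
    y_onehot = np.eye(num_classes)[y]         # integer labels -> one-hot rows
    x_mix = lam * x + (1.0 - lam) * x[perm]   # convex combination of inputs
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]  # same for labels
    return x_mix, y_mix, lam, perm
```

Each row of `y_mix` is a soft label that still sums to 1, so it can be fed to a cross-entropy loss that accepts probability targets.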

The improvements:

  1. Add a backward pass
  2. Add a mix rate
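A backward pass matters once mixing is applied to feature maps rather than raw inputs, because gradients then have to flow through the mixing step to the layers before it. The sketch below shows one way such a gradient could be computed, assuming the layer mixes each sample with a partner chosen by a permutation of the batch. The function names are hypothetical and the actual symbols/mixup.py may work differently.

```python
import numpy as np

def mixup_forward(x, lam, perm):
    # out[i] = lam * x[i] + (1 - lam) * x[perm[i]]
    return lam * x + (1.0 - lam) * x[perm]

def mixup_backward(grad_out, lam, perm):
    # x[i] contributes to out[i] with weight lam and to out[k] with
    # weight (1 - lam), where perm[k] == i, i.e. k = inv_perm[i].
    inv_perm = np.argsort(perm)
    return lam * grad_out + (1.0 - lam) * grad_out[inv_perm]
```

Because the forward step is linear in `x`, the backward step is its transpose: the gradient of each output is routed back to both samples that were combined to produce it.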

Two scenarios:

image

The detailed design of the MixUp layer:

image

The results:

The resnet50 symbol is the one written for MXNet (https://github.com/apache/incubator-mxnet/tree/master/example/image-classification/symbols); many versions exist there. I have not done any optimization on it. All the results are based on this baseline.

| cifar10 | epochs | alpha | mix_rate | test Acc | initial learning rate | batch size |
|---|---|---|---|---|---|---|
| (ERM) resnet50 | 90 | - | - | 0.87900390625 | 0.05 | 256 |
| (ERM) resnet50 | 200 | - | - | 0.89365234375 | 0.05 | 256 |
| (ERM) resnet50 | 300 | - | - | 0.8931640625 | 0.05 | 256 |
| (mixup) resnet50 | 90 | 0.2 | 0.7 | 0.8609375 | 0.7 | 256 |
| (mixup) resnet50 | 200 | 0.2 | 0.7 | 0.91611328125 | 0.7 | 256 |
| (mixup) resnet50 | 300 | 0.2 | 0.7 | 0.9224609375 | 0.7 | 256 |
| mixup in feature maps (resnet50 head conv) | 90 | 0.2 | 0.7 | 0.8544921875 | 0.7 | 256 |
| mixup in feature maps (resnet50 head conv) | 200 | 0.2 | 0.7 | 0.91796875 | 0.7 | 256 |
| mixup in feature maps (resnet50 head conv) | 300 | 0.2 | 0.7 | 0.91845703125 | 0.7 | 256 |

MixUp

image

Mixup in feature map (resnet50 head conv)

image

ERM

image

Usage

Install MXNet 0.12. The MixUp operator is in symbols/mixup.py; you can use it in your code like this:

data, label = mx.sym.Custom(data=data, label=label, alpha=0.2, num_classes=num_classes, batch_size=batch_size, mix_rate=0.7, op_type='MixUp')

label is a vector of integer class labels, e.g. [4, 8, ..., 9].

Download the dataset:

http://data.mxnet.io/data/cifar10/cifar10_val.rec

http://data.mxnet.io/data/cifar10/cifar10_train.rec

Train and test:

./train.sh
./test.sh

Reference

Zhang H, Cisse M, Dauphin Y N, et al. mixup: Beyond Empirical Risk Minimization[J]. arXiv preprint arXiv:1710.09412, 2017.