Awesome
This repository contains a PyTorch implementation code for reproducing the results in our paper:
Generalization in Machine Learning via Analytical Learning Theory
Kenji Kawaguchi, Yoshua Bengio, Vikas Verma, and Leslie Pack Kaelbling
Test error (%) with WideResNet28_10 and different regularization methods
Regularization Method | CIFAR-10 | CIFAR-100 | SVHN |
---|---|---|---|
Standard | 3.79 ± 0.07 | 19.85 ± 0.14 | 2.47 ± 0.04 |
Single-cutout | 3.19 ± 0.09 | 18.13 ± 0.28 | 2.23 ± 0.03 |
Dual-cutout | 2.61 ± 0.04 | 17.54 ± 0.09 | 2.06 ± 0.06 |
- Dual-cutout is proposed in our paper based on a new learning theory.
How to run DualCutout
python cifar10/resnext/main.py --dualcutout --dataset cifar10 --arch wrn28_10 \
--epochs 300 --batch_size 64 --learning_rate 0.1 --data_aug 1 --decay 0.0005 --schedule 150 225 \
--gamma 0.1 0.1 --alpha 0.1 --cutsize 16
Add the --temp_dir and --home_dir as appropriate in the above commands. For Cifar10 and Cifar100, we used --cutsize 16, and for SVHN, we used --cutsize 20.
How to run Single Cutout
python cifar10/resnext/main.py --singlecutout --dataset cifar10 --arch wrn28_10 \
--epochs 300 --batch_size 64 --learning_rate 0.1 --data_aug 1 --decay 0.0005 --schedule 150 225 \
--gamma 0.1 0.1 --alpha 0.1 --cutsize 16
How to run baseline
python cifar10/resnext/main.py --dataset cifar10 --arch wrn28_10 \
--epochs 300 --batch_size 64 --learning_rate 0.1 --data_aug 1 --decay 0.0005 --schedule 150 225 \
--gamma 0.1 0.1
This code has been tested with
python 2.7.9
torch 0.3.1
torchvision 0.2.0