chainerkfac

A Chainer extension for training deep neural networks with Kronecker-Factored Approximate Curvature (K-FAC).

This repository is the implementation for the following paper:

Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, and Satoshi Matsuoka. 
Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks. 
CVPR, 2019.

[paper] [poster]
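For readers new to K-FAC: the method approximates each layer's block of the Fisher information matrix by a Kronecker product of two small matrices, so the block never has to be inverted at full size. The NumPy sketch below illustrates that idea on a toy fully connected layer. It is not chainerkfac's implementation; the layer sizes, the random data, and the `damping` value are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, batch = 3, 2, 64

# Toy per-example layer inputs (a) and back-propagated output gradients (g).
a = rng.standard_normal((batch, n_in))
g = rng.standard_normal((batch, n_out))

# Kronecker factors: second moments of the inputs and of the output gradients.
damping = 1e-3  # hypothetical value, added so both factors are invertible
A = a.T @ a / batch + damping * np.eye(n_in)
G = g.T @ g / batch + damping * np.eye(n_out)

# Gradient of the loss with respect to the weight matrix W (n_out x n_in).
grad_W = g.T @ a / batch

# Kronecker-factored step: invert the two small factors separately.
kfac_step = np.linalg.solve(G, grad_W) @ np.linalg.inv(A)

# Reference: build the full (n_out*n_in) x (n_out*n_in) block and solve with it.
full_block = np.kron(G, A)
reference = np.linalg.solve(full_block, grad_W.reshape(-1)).reshape(n_out, n_in)

# Both routes give the same preconditioned gradient.
print(np.allclose(kfac_step, reference))
```

The factored route solves with an `n_out x n_out` and an `n_in x n_in` matrix, while the reference route solves with an `(n_out * n_in) x (n_out * n_in)` matrix, which is what makes the approximation practical for large layers.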

Installation

Clone the code from GitHub.

$ git clone https://github.com/tyohei/chainerkfac.git chainerkfac

Change the directory and install.

$ cd chainerkfac
$ python setup.py install

Before installing chainerkfac, install the additional libraries required for your running environment:

| Running environment | Additional required libraries |
|---|---|
| Single GPU | CuPy |
| Multiple GPUs | CuPy with NCCL, MPI4py |
| Multiple GPUs for ImageNet script | CuPy with NCCL, MPI4py, Pillow |

See CuPy installation guide and ChainerMN installation guide for details.
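As a quick sanity check before running the examples, the following sketch reports which of the libraries listed above are missing from the current Python environment. It is not part of chainerkfac; the helper name `missing_libraries` is hypothetical. It only checks that each package can be found under its import name (`cupy`, `mpi4py`, and `PIL` for Pillow) and does not verify that CuPy was built with NCCL support.

```python
import importlib.util

# Import names of the additional libraries for each running environment.
# Pillow is imported as "PIL"; MPI4py is imported as "mpi4py".
REQUIREMENTS = {
    "Single GPU": ["cupy"],
    "Multiple GPUs": ["cupy", "mpi4py"],
    "Multiple GPUs for ImageNet script": ["cupy", "mpi4py", "PIL"],
}


def missing_libraries(environment):
    """Return the import names that cannot be found for the given environment."""
    return [
        name
        for name in REQUIREMENTS[environment]
        if importlib.util.find_spec(name) is None
    ]


for environment in REQUIREMENTS:
    missing = missing_libraries(environment)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{environment}: {status}")
```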

Examples

MNIST (code) / CIFAR-10 (code)

Training with a single CPU

$ python train.py --no_cuda

Training with a single GPU

$ python train.py

Training with multiple GPUs (4 GPUs)

$ mpirun -np 4 python train.py --distributed

ImageNet (code)

Training with multiple GPUs (4 GPUs)

$ mpirun -np 4 python train.py \
    <path/to/train.txt> <path/to/val.txt> \
    --train_root <path/to/train_root> \
    --val_root <path/to/val_root> \
    --mean ./mean.npy \
    --config <path/to/config_file>

Training ResNet-50 on ImageNet with large mini-batches

| Mini-batch size | Config file | Epochs | Iterations | Top-1 accuracy |
|---|---|---|---|---|
| 4,096 | configs/bs4k.resnet50.128gpu.json | 35 | 10,948 | 75.9 % |
| 8,192 | configs/bs8k.resnet50.256gpu.json | 35 | 5,478 | 76.4 % |
| 16,384 | configs/bs16k.resnet50.512gpu.json | 35 | 2,737 | 76.6 % |
| 32,768 | configs/bs32k.resnet50.1024gpu.json | 45 | 1,760 | 76.9 % |
| 65,536 | configs/bs64k.resnet50.2048gpu.json | 60 | 1,173 | 76.3 % |
| 131,072 | configs/bs128k.resnet50.4096gpu.json | 80 | 782 | 75.0 % |
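The iteration counts above are roughly what the usual relation gives: iterations ≈ epochs × (training images) / (mini-batch size). The sketch below checks this under two assumptions that are not stated in this README: that the training set is the ImageNet-1k (ILSVRC-2012) training set with 1,281,167 images, and that the number before `gpu` in each config file name is the number of GPUs used. The function name `estimate_iterations` is hypothetical.

```python
import math

# Assumption: size of the ImageNet-1k (ILSVRC-2012) training set.
NUM_TRAIN_IMAGES = 1_281_167

# (mini-batch size, GPUs taken from the config file name, epochs, reported iterations)
RUNS = [
    (4_096, 128, 35, 10_948),
    (8_192, 256, 35, 5_478),
    (16_384, 512, 35, 2_737),
    (32_768, 1_024, 45, 1_760),
    (65_536, 2_048, 60, 1_173),
    (131_072, 4_096, 80, 782),
]


def estimate_iterations(batch_size, epochs, num_images=NUM_TRAIN_IMAGES):
    """Estimate the number of iterations needed to cover the given epochs."""
    return math.ceil(epochs * num_images / batch_size)


for batch_size, gpus, epochs, reported in RUNS:
    estimated = estimate_iterations(batch_size, epochs)
    per_gpu = batch_size // gpus  # mini-batch size handled by each GPU
    print(
        f"batch={batch_size:>7,} gpus={gpus:>5,} per_gpu={per_gpu} "
        f"estimated={estimated:>6,} reported={reported:>6,}"
    )
```

Under these assumptions the estimates agree with the reported counts to within a few iterations, and every configuration works out to the same per-GPU mini-batch size, so the larger mini-batches come from adding GPUs rather than from giving each GPU more work.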

Authors

Yohei Tsuji (@tyohei), Kazuki Osawa (@kazukiosawa), Yuichiro Ueno (@y1r) and Akira Naruse (@anaruse)