Model

This is an implementation of the deep residual network used for Mini-Places2, as described in He et al., "Deep Residual Learning for Image Recognition". The model is a very deep network with skip connections, so that the convolutional layers learn residual functions relative to their inputs rather than unreferenced mappings. The training protocol uses minimal pre-processing (mean subtraction) and very simple data augmentation (shuffling, flipping, and cropping). All model parameters, including the batch norm parameters, are updated using plain stochastic gradient descent with weight decay. The learning rate is dropped only twice (at 90 and 135 epochs in the paper).
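The update rule and step schedule described above can be sketched in a few lines of plain Python. This is only an illustration, not the code in miniplaces_msra.py: the function names, the base learning rate of 0.1, the drop factor of 0.1, and the weight decay of 1e-4 are assumptions chosen for the example, and momentum is left out to keep the sketch minimal. Only the drop epochs (90 and 135) come from the text.

```python
def step_learning_rate(epoch, base_lr=0.1, drop_epochs=(90, 135), factor=0.1):
    """Learning rate for `epoch` under a step schedule.

    base_lr and factor are assumed values; drop_epochs follows the text.
    """
    # Count how many drop points have already been reached.
    drops = sum(1 for e in drop_epochs if epoch >= e)
    return base_lr * (factor ** drops)


def sgd_weight_decay_step(params, grads, lr, weight_decay=1e-4):
    """One plain SGD step with L2 weight decay on a flat list of parameters.

    The same rule is applied to every parameter, which mirrors the statement
    that batch norm parameters are updated like all the others.
    """
    return [p - lr * (g + weight_decay * p) for p, g in zip(params, grads)]


if __name__ == "__main__":
    for epoch in (0, 89, 90, 134, 135):
        print(epoch, step_learning_rate(epoch))
    print(sgd_weight_decay_step([1.0, -2.0], [0.5, 0.25], lr=0.1))
```

With these assumed defaults the rate stays at its base value through epoch 89, is scaled by the factor once from epoch 90, and twice from epoch 135.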

Acknowledgments

Many thanks to Dr. He and his team at MSRA for their helpful input in replicating the model as described in their paper.

Model script

The model training script is included (miniplaces_msra.py).

Trained weights

The trained weights file can be downloaded from AWS (miniplaces_msra_e66.pkl).

Performance

Training this model with the options described below should achieve roughly 17.5% top-5 error using only mean subtraction, random cropping, and random flips. With multiscale evaluation (see the evaluation script), the model should achieve roughly 14.6% top-5 error.
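For readers unfamiliar with the metric, top-5 error is the fraction of images whose true label is not among the five highest-scoring classes. The sketch below shows that computation in plain Python; it is not taken from the evaluation script, and the function name and the list-of-lists input layout are assumptions made for illustration.

```python
def top5_error(scores, labels, k=5):
    """Fraction of examples whose true label is outside the k best-scoring classes.

    scores: one list of per-class scores for each example (assumed layout).
    labels: the true class index for each example.
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest score to lowest, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


if __name__ == "__main__":
    row = [0.1, 0.2, 0.3, 0.4, 0.5, 0.0]  # six classes; class 5 scores lowest
    print(top5_error([row, row], [5, 0]))
```

In the small example, the first label (class 5) falls outside the top five and the second (class 0) falls inside, so half of the examples count as errors.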

Instructions

This script was tested with neon version 1.2. Make sure that your local repo is synced to this commit and run the installation procedure before proceeding. The commit SHA for v1.2 is 385483881ee1fe1f0445fc78d7edf5b8ddc5c8c5.

This example uses the ImageLoader module to load the images for consumption while applying random cropping, flipping, and shuffling. Before training, you need to write out the padded Mini-Places2 images into a macrobatch repository. See miniplaces_batchwriter.sh.
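To make the augmentation concrete, here is a minimal sketch of mean subtraction followed by a random crop and a random horizontal flip on a single-channel image stored as nested lists. It does not reproduce the ImageLoader module or its options: the function names, the scalar mean, and the 0.5 flip probability are assumptions for illustration only.

```python
import random


def subtract_mean(image, mean):
    """Subtract a scalar mean from every pixel (assumes a single channel)."""
    return [[pixel - mean for pixel in row] for row in image]


def random_crop_flip(image, crop_size, rng=random):
    """Take a random crop_size x crop_size window, then flip it half the time."""
    height, width = len(image), len(image[0])
    # randint is inclusive on both ends, so the window always fits.
    top = rng.randint(0, height - crop_size)
    left = rng.randint(0, width - crop_size)
    crop = [row[left:left + crop_size] for row in image[top:top + crop_size]]
    if rng.random() < 0.5:
        # Horizontal flip: reverse each row.
        crop = [row[::-1] for row in crop]
    return crop


if __name__ == "__main__":
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    centered = subtract_mean(image, mean=7.5)
    print(random_crop_flip(centered, crop_size=2, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance keeps the sketch reproducible; a real loader would apply a fresh random crop and flip to each image every epoch.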

Note that it is good practice to choose a data_dir that is local to your machine, so that the ImageLoader module does not perform reads over the network.

Once the batches have been written out, you may initiate training:

miniplaces_msra.py -r 0 -vv \
    --log <logfile> \
    --epochs 80 \
    --save_path <model-save-path> \
    --eval_freq 1 \
    --backend gpu \
    --data_dir <path-to-saved-batches>

If you just want to run evaluation, you can use the much simpler script that loads the serialized model and evaluates it on the validation set:

miniplaces_eval.py -vv --model_file <model-save-path>