Caffe Windows

Based on @terrychenism's caffe-windows-cudnn with the following major changes.

Linux: Have a look at @Senecaur's version here.

Note: This implementation is for my project in Kaggle's National Data Science Bowl, so some choices in the code may be specific to that problem rather than general settings, e.g., the stochastic prediction mentioned below.

I have put one of my models for Kaggle's National Data Science Bowl in /examples/kaggle-bowl.

COMPACT_DATA layer to hold varying size images

This is modified from Princeton's GoogLeNet patch.

Usage

To use this layer, first build /bin/convert_imageset_compact.exe and use it to convert your images into the compact leveldb format (its usage is the same as /bin/convert_imageset.exe).
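For example, a conversion run might look like the sketch below. This is an assumption based on the standard /bin/convert_imageset.exe arguments of the time (image root folder, a text file listing "image-path label" pairs, the output leveldb name, and an optional shuffle flag); the paths are placeholders.

# convert images listed in a "relative/path/to/image.jpg label" file into a compact leveldb
./bin/convert_imageset_compact.exe \
  path-to-image-root-folder/ \
  path-to-image-list.txt \
  path-to-output-compact-leveldb \
  1  # 1 = shuffle the image list, 0 = keep the original order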

Since the images can be of varying sizes, computing the mean image for this layer can be a problem. I use the following method for this issue and it works reasonably well.

Note

In this code, I turn off the iscolor flag in the call to cvDecodeImage at this line and this line. As a result, this layer converts every image to grayscale. If you want color images, set iscolor to 1.

Realtime data augmentation

Realtime data augmentation is implemented within the COMPACT_DATA layer. It offers random mirroring, multi-scale cropping, rotation, shearing, perspective warping, smoothing, JPEG compression, and contrast adjustment, controlled by the transform_param options shown in the example below.

Usage

To use it, specify the augmentation options in transform_param, for example:

## Training set
layers {
  name: "Image"
  type: COMPACT_DATA
  top: "data"
  top: "label"
  data_param {
    source: "path-to-training-compact-leveldb"
    batch_size: 100
  }
  transform_param {
    mean_file: "path-to-image-mean"
    mirror: true
    crop_size: 32
    multiscale: true
    debug_display: false  
    smooth_filtering: false
    jpeg_compression: false
    contrast_adjustment: false
    min_scaling_factor: 0.8
    max_scaling_factor: 1.2
    angle_interval: 45
    max_shearing_ratio: 0.1
    max_perspective_ratio: 0.1
    warp_fillval: 255
  }
  include: { phase: TRAIN }
}
## Validation set
layers {
  name: "Image"
  type: COMPACT_DATA
  top: "data"
  top: "label"
  data_param {
    source: "path-to-validation-compact-leveldb"
    batch_size: 100
  }
  transform_param {
    mean_file: "path-to-image-mean"
    mirror: true
    crop_size: 32
    multiscale: true
    debug_display: false	
    smooth_filtering: false
    jpeg_compression: false
    contrast_adjustment: false
    min_scaling_factor: 0.8
    max_scaling_factor: 1.2
    angle_interval: 45
    max_shearing_ratio: 0.1
    max_perspective_ratio: 0.1
    warp_fillval: 255
  }
  include: { phase: TEST }
}

There is an example using realtime data augmentation for Kaggle's National Data Science Bowl in /examples/kaggle-bowl.

Parameter

The transform_param message accepts the parameters shown in the example above; their definitions are in /src/caffe/proto/caffe.proto.

Here is a concrete example of the geometric transformation. With the above prototxt config, suppose the net encounters an image of original size 48x60 (h x w), and the scaling factors for the h(eight) and w(idth) directions are randomly sampled as 0.8 and 1.2, which corresponds to an ROI of size 60x50 (h: 48/0.8 = 60, w: 60/1.2 = 50). In this case, for the h direction we randomly pad 12 additional pixels split between the two sides (these pixels are set to warp_fillval); for the w direction we randomly crop out the extra 10 pixels from the two sides. With the resulting 60x50 ROI, we perform random rotation/shearing/perspective warping in combination using the function warpPerspectiveOneGo in /src/caffe/util/opencv_util.cpp. The output is a transformed image of size 32x32 (the crop_size), which is what we feed to the net.

For a better understanding of the transformation augmentation and the above params, please see /src/caffe/data_transformer.cpp (the transformation is implemented here) and /src/caffe/proto/caffe.proto.

For transformation augmentation for image classification, I would like to recommend this paper: Transformation Pursuit for Image Classification. The authors have a project page for it.

Note

In this implementation, realtime augmentation is always on in both the TRAIN and TEST phases (even the mirror operation, which the original Caffe disables during the TEST phase). This suits the need for ensembling: you can run the trained model on the same input image a few times and average the predictions (they won't be identical due to the random distortions) to get the final one.

If you want deterministic prediction, you can hack the code or use something like:

## Validation set
layers {
  name: "Image"
  type: COMPACT_DATA
  top: "data"
  top: "label"
  data_param {
    source: "path-to-validation-compact-leveldb"
    batch_size: 100
  }
  transform_param {
    mean_file: "path-to-image-mean"
    mirror: false
    crop_size: 32
    multiscale: true
    debug_display: false  
    smooth_filtering: false
    jpeg_compression: false
    contrast_adjustment: false
    min_scaling_factor: 1
    max_scaling_factor: 1
    angle_interval: 360
    max_shearing_ratio: 0
    max_perspective_ratio: 0
    warp_fillval: 255
  }
  include: { phase: TEST }
}

Note the random mirroring is still on ;)

Prediction module to get probabilities

It is part of the same /bin/caffe.exe interface, and the usage is as follows:

# make prediction
./bin/caffe.exe predict \
  --model=path-to-model-prototxt \
  --weights=path-to-trained-model \
  --outfile=path-to-output-prediction \
  --label_number=number-of-label \
  --iterations=iteration-to-run \
  --score_index=which-score-to-output \
  --gpu=gpu-id \
  --random_seed=random-seed \
  --phase=TRAIN-or-TEST
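To produce the stochastic-prediction ensemble mentioned in the augmentation section, one possible (hypothetical) recipe is to run predict several times with different random seeds and average the resulting probability files offline; the paths, seed values, and output names below are placeholders.

# run the same model several times; the random augmentation makes each output differ slightly
for seed in 1 2 3; do
  ./bin/caffe.exe predict \
    --model=path-to-model-prototxt \
    --weights=path-to-trained-model \
    --outfile=prediction_seed_${seed}.txt \
    --label_number=number-of-label \
    --iterations=iteration-to-run \
    --gpu=0 \
    --random_seed=${seed} \
    --phase=TEST
done
# average the prediction_seed_*.txt files (e.g., with a small script) to get the final prediction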

Batch Normalization layer

Batch Normalization is from here.

This implementation has been adopted in this PR to Caffe (with improvements such as per mini-batch shuffling).

Usage

Two additional blobs (besides those for the learnable parameters) are used to store the moving-average mean and variance, so set the corresponding blobs_lr and weight_decay entries to 0, as follows:

## BN
layers {
  bottom: "conv1"
  top: "conv1_bn"
  name: "conv1_bn"
  type: BN
  blobs_lr: 1
  blobs_lr: 1
  blobs_lr: 0
  blobs_lr: 0
  weight_decay: 0
  weight_decay: 0
  weight_decay: 0
  weight_decay: 0
  bn_param {
    scale_filler {
      type: "constant"
      value: 1
    }
    shift_filler {
      type: "constant"
      value: 0
    }
    var_eps: 1e-10
    moving_average: true
    decay: 0.95    
  }
}

There is an example for BN using mnist in /examples/mnist.

Parameter

The bn_param message accepts the parameters shown in the example above (scale_filler, shift_filler, var_eps, moving_average, decay); see /src/caffe/proto/caffe.proto for details.

If you want minibatch statistics for inference, set moving_average to false.
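For example, an inference-time BN layer using minibatch statistics could look like the following sketch (only moving_average differs from the training example above; the blobs_lr/weight_decay lines are omitted here for brevity):

## BN (inference with minibatch statistics)
layers {
  bottom: "conv1"
  top: "conv1_bn"
  name: "conv1_bn"
  type: BN
  bn_param {
    scale_filler {
      type: "constant"
      value: 1
    }
    shift_filler {
      type: "constant"
      value: 0
    }
    var_eps: 1e-10
    moving_average: false  # use per-minibatch mean/variance instead of the stored moving averages
    decay: 0.95
  }
}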

PReLU layer and MSRA filler

The PReLU layer and the MSRA filler are adopted from this PR to Caffe.
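A hypothetical usage sketch follows. It assumes the names from that PR (a PRELU layer type, a prelu_param message with channel_shared and filler, and an "msra" weight filler type); check /src/caffe/proto/caffe.proto in this fork for the exact names.

## convolution initialized with the MSRA filler, followed by PReLU
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    weight_filler {
      type: "msra"  # MSRA (He) initialization
    }
  }
}
layers {
  name: "relu1"
  type: PRELU
  bottom: "conv1"
  top: "conv1"
  prelu_param {
    channel_shared: false  # one learnable slope per channel
    filler {
      type: "constant"
      value: 0.25          # initial negative slope
    }
  }
}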

AdaDelta solver

AdaDelta is based on this PR to Caffe, with a modification that allows the usual learning rate policy to be applied.
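A hypothetical solver.prototxt sketch follows. The solver type and parameter names (solver_type: ADADELTA, momentum as the decay rate rho, delta as the epsilon term) are assumptions based on how AdaDelta appears in mainline Caffe and may differ in this fork; the learning-rate fields are the usual ones that the modification re-enables.

# AdaDelta solver sketch (names are assumptions; paths are placeholders)
net: "path-to-net-prototxt"
solver_type: ADADELTA
momentum: 0.95       # AdaDelta decay rate (rho)
delta: 1e-6          # numerical stability term (epsilon)
base_lr: 1.0         # learning rate policy works as usual in this fork
lr_policy: "fixed"
max_iter: 10000
display: 100
snapshot: 5000
snapshot_prefix: "path-to-snapshot-prefix"
solver_mode: GPU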

Accumulated gradient method for SGD & Nesterov solver

Adopted from Princeton's GoogLeNet patch.