P-DIFF: Learning Classifier with Noisy Labels based on Probability Difference Distributions

ICPR2020 Paper link

P-DIFF+: Improving Learning Classifier with Noisy Labels by Noisy Negative Learning Loss

Neural Networks Paper link

Contents

  1. Requirements
  2. Dataset
  3. Folders
  4. How-to-train
  5. Usage-of-P-DIFF-layer
  6. Complexity-of-P-DIFF
  7. Experiment-environment
  8. Experiment-settings
  9. Model-list

Requirements

  1. Python
  2. Caffe

Dataset

Training and Testing dataset:

  1. mnist
  2. cifar10
  3. cifar100
  4. miniimage (available from the author on request)
  5. Cloth1M

Folders

The structure of code folders:

| Folder/File | Description |
| ----------- | ----------- |
| README.md | The detailed instructions for reproducing P-DIFF. |
| train.sh | The training entry script of P-DIFF. |
| test.sh | The testing entry script of P-DIFF. |
| caffe | The compiled official caffe repo. |
| code | Data downloading and processing scripts. |
| data | The training and testing datasets used in the paper. |
| layer | The implementation of the P-DIFF layer in caffe. |
| log | The folder used to save training logs. |
| models | The folder used to save trained models. |
| prototxt | The prototxt files used to train or test models on different datasets. |

How-to-train

We demonstrate the training process on the cifar10 dataset with 50% symmetric noise.

Pipeline:

Step 1. Clone the caffe repo into the ./caffe folder and compile it after installing its requirements.
        cd caffe
        mkdir build
        cd build
        cmake ..
        make -j8

Step 2. Add P-DIFF layer to caffe layers and recompile caffe project.

Step 3. Download the mnist, cifar-10, cifar-100 and cloth1m datasets. (Contact the author to obtain miniimage.)
        python ./code/download.py --dataset=cifar10

Step 4. Corrupt the labels of the training dataset with the ./code/corrupt.py script.
        python ./code/corrupt.py --dataset=cifar10 --noise_type=SYMMETRY --noise_rate=0.50
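
If you want to sanity-check this step, symmetric label noise can be sketched in a few lines of NumPy. The function name and arguments below are illustrative, not the actual interface of corrupt.py:

```python
import numpy as np

def corrupt_symmetric(labels, noise_rate, num_classes, seed=0):
    """Flip a noise_rate fraction of labels uniformly to a *different* class."""
    rng = np.random.RandomState(seed)
    labels = np.asarray(labels).copy()
    n = len(labels)
    # choose which samples to corrupt, without replacement
    flip_idx = rng.choice(n, size=int(noise_rate * n), replace=False)
    for i in flip_idx:
        # shift by a random non-zero offset so the new label always differs
        offset = rng.randint(1, num_classes)
        labels[i] = (labels[i] + offset) % num_classes
    return labels
```

For a dataset of n samples, exactly int(noise_rate * n) labels end up changed, which matches the definition of symmetric noise at a fixed rate.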

Step 5. Generate the corresponding lmdb files with caffe's converting tool (multi-label support is required).
        bash ./code/convert.sh cifar10 SYMMETRY 50

Step 6. Configure the training dataset path in the train_val.prototxt file.
        Edit the ${noise_type}, ${noise_rate} and p_diff_layer parameters in ./prototxt/train_val.prototxt.cifar10

Step 7. Train the dataset by using command of caffe.
        bash ./train.sh cifar10

Step 8. Test the dataset by using command of caffe.
        bash ./test.sh cifar10 SYMMETRY 50

Usage-of-P-DIFF-layer

The usage of P-DIFF layer in train_val.prototxt is described below:

layer {
  name: "fix_prob"
  type: "PDIFF"

  # input of this layer
  # bottom[0] is used for the forward of network
  bottom: "prob"

  # bottom[1] is used for computing sample weight, which could be different from bottom[0]
  bottom: "prob"

  # bottom[2] is the class label, which indicates the sample's category
  bottom: "label000"

  # bottom[3] (optional) is the noise label, which indicates whether the sample is noisy;
  # it is only used for drawing pdf_clean and pdf_noise, not for training.
  # This second label is generated by the multi-label lmdb converter.
  # We can discard this input in general.
  #bottom: "label001"

  # output of this layer
  top: "fix_prob"

  # parameters of this layer
  p_diff_param {
    # We use a queue to maintain the delta distribution; the size of the queue is slide_batch_num x batch_size
    slide_batch_num: 100

    # The number of iterations per epoch.
    # Its value equals the total number of training samples divided by the batch size.
    # Here: 50,000 / 128 ≈ 390
    epoch_iters: 390

    # This switch controls whether the automatic noise-rate estimation is used:
    # true means this layer computes the noise rate automatically,
    # false means this layer uses the specific noise rate given below.
    use_auto_noise_ratio: false

    # If use_auto_noise_ratio is false, we need to set the specific noise rate.
    noise_ratio: 0.50

    # Print training information such as the noise rate, the threshold zeta, the pcf and the weights,
    # which are used for debugging and drawing figures.
    # This switch is usually turned off.
    #debug: true
    # This is the prefix of the log file name that stores the information in debug mode.
    #debug_prefix: "cifar10_noise_symmetric_50"
  }
}
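
To make the layer's role concrete, the weighting idea can be sketched in NumPy: compute each sample's probability difference delta, keep a sliding window of slide_batch_num x batch_size deltas, histogram them into bin_size bins, and zero-weight the lowest noise_ratio fraction. This is a simplified sketch under those assumptions, not the caffe layer's actual code; the class and variable names here are made up:

```python
import numpy as np
from collections import deque

class PDiff:
    """Illustrative sketch of P-DIFF sample weighting (not the caffe layer)."""

    def __init__(self, slide_batch_num, batch_size, noise_ratio, bin_size=200):
        # sliding window holding the most recent slide_batch_num * batch_size deltas
        self.queue = deque(maxlen=slide_batch_num * batch_size)
        self.noise_ratio = noise_ratio
        self.bin_size = bin_size

    def forward(self, prob, labels):
        # delta = p_y - max_{c != y} p_c, so delta lies in [-1, 1]
        n = len(labels)
        p_y = prob[np.arange(n), labels]
        masked = prob.copy()
        masked[np.arange(n), labels] = -np.inf
        delta = p_y - masked.max(axis=1)
        self.queue.extend(delta)
        # histogram the sliding window; the threshold zeta cuts off the
        # lowest noise_ratio fraction of deltas
        hist, edges = np.histogram(list(self.queue),
                                   bins=self.bin_size, range=(-1.0, 1.0))
        cdf = np.cumsum(hist) / max(len(self.queue), 1)
        zeta_bin = np.searchsorted(cdf, self.noise_ratio)
        zeta = edges[min(zeta_bin + 1, self.bin_size)]
        # samples whose delta falls below zeta are treated as noisy (weight 0)
        return (delta >= zeta).astype(np.float32)
```

In the real layer these weights scale the loss gradients of the individual samples; here they are simply returned so the idea is visible in isolation.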

Complexity-of-P-DIFF

Given the batch size, slide batch num and bin size (200 in our training):

The time complexity of P-DIFF per iteration is

O(batch_size) x [O(bin_size) + O(slide_batch_num) + O(k)]

where k is a constant.

The space complexity of P-DIFF per iteration is

O(slide_batch_num) x O(batch_size) + k x O(bin_size)

where k is a constant.
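
Plugging in the cifar10 settings used above (batch_size = 128, slide_batch_num = 100, bin_size = 200) gives a back-of-envelope sense of the cost; this is purely illustrative arithmetic with the constant k dropped:

```python
batch_size, slide_batch_num, bin_size = 128, 100, 200

# sliding-window queue length: how many deltas are retained at once
queue_len = slide_batch_num * batch_size

# dominant per-iteration work, following the time formula above
ops_per_iter = batch_size * (bin_size + slide_batch_num)

# dominant memory: the delta queue plus the histogram bins
mem_units = queue_len + bin_size

print(queue_len, ops_per_iter, mem_units)
```

So the per-iteration overhead is on the order of tens of thousands of elementary operations, negligible next to a forward/backward pass of the network itself.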

Experiment-environment

Hardware:
Operating System:

Experiment-settings

Dataset preprocess:
Training parameters:
Backbone:

Citation: