Awesome

Collaborative Learning for Weakly Supervised Object Detection

If you use this code in your research, please cite

@inproceedings{ijcai2018-135,
  title     = {Collaborative Learning for Weakly Supervised Object Detection},
  author    = {Jiajie Wang and Jiangchao Yao and Ya Zhang and Rui Zhang},
  booktitle = {Proceedings of the Twenty-Seventh International Joint Conference on
               Artificial Intelligence, {IJCAI-18}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},             
  pages     = {971--977},
  year      = {2018},
  month     = {7},
  doi       = {10.24963/ijcai.2018/135},
  url       = {https://doi.org/10.24963/ijcai.2018/135},
}

Prerequisites

A basic pytorch installation. The version is 0.2. If you are using the old version 0.1.12, you can checkout 0.1.12 branch.
Python packages you might not have: cffi, opencv-python, easydict (similar to py-faster-rcnn). For easydict make sure you have the right version. Xinlei uses 1.6.
tensorboard-pytorch to visualize the training and validation curve. Please build from source to use the latest tensorflow-tensorboard.

Installation

Clone the repository

git clone https://github.com/ruotianluo/pytorch-faster-rcnn.git

Choose your -arch option to match your GPU for step 3 and 4.

GPU model	Architecture
TitanX (Maxwell/Pascal)	sm_52
GTX 960M	sm_50
GTX 1080 (Ti)	sm_61
Grid K520 (AWS g2.2xlarge)	sm_30
Tesla K80 (AWS p2.xlarge)	sm_37

Note: You are welcome to contribute the settings on your end if you have made the code work properly on other GPUs.

Build RoiPooling module

cd pytorch-faster-rcnn/lib/layer_utils/roi_pooling/src/cuda
echo "Compiling roi_pooling kernels by nvcc..."
nvcc -c -o roi_pooling_kernel.cu.o roi_pooling_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
cd ../../
python build.py
cd ../../../

Build NMS

cd lib/nms/src/cuda
echo "Compiling nms kernels by nvcc..."
nvcc -c -o nms_kernel.cu.o nms_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
cd ../../
python build.py
cd ../../

Setup data

Please follow the instructions of py-faster-rcnn here to setup VOC. The steps involve downloading data and optionally creating soft links in the data folder. Since faster RCNN does not rely on pre-computed proposals, it is safe to ignore the steps that setup proposals.

If you find it useful, the data/cache folder created on Xinlei's side is also shared here.

Train your own model

Download pre-trained models and weights. For the pretrained wsddn model, you can find the download link here. For other pre-trained models like VGG16 and Resnet V1 models, they are provided by pytorch-vgg and pytorch-resnet (the ones with caffe in the name). You can download them in the data/imagenet_weights folder. For example for VGG16 model, you can set up like:

mkdir -p data/imagenet_weights
cd data/imagenet_weights
python # open python in terminal and run the following Python code

import torch
from torch.utils.model_zoo import load_url
from torchvision import models

sd = load_url("https://s3-us-west-2.amazonaws.com/jcjohns-models/vgg16-00b39a1b.pth")
sd['classifier.0.weight'] = sd['classifier.1.weight']
sd['classifier.0.bias'] = sd['classifier.1.bias']
del sd['classifier.1.weight']
del sd['classifier.1.bias']

sd['classifier.3.weight'] = sd['classifier.4.weight']
sd['classifier.3.bias'] = sd['classifier.4.bias']
del sd['classifier.4.weight']
del sd['classifier.4.bias']

torch.save(sd, "vgg16.pth")

cd ../..

For Resnet101, you can set up like:

mkdir -p data/imagenet_weights
cd data/imagenet_weights
# download from my gdrive (link in pytorch-resnet)
mv resnet101-caffe.pth res101.pth
cd ../..

Train (and test, evaluation)

./experiments/scripts/train.sh [GPU_ID] [DATASET] [NET] [WSDDN_PRETRAINED]
# Examples:
./experiments/scripts/train.sh 0 pascal_voc vgg16 path_to_wsddn_pretrained_model

Visualization with Tensorboard

tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &

Test and evaluate

./experiments/scripts/test.sh [GPU_ID] [DATASET] [NET] [WSDDN_PRETRAINED]
# Examples:
./experiments/scripts/test.sh 0 pascal_voc vgg16 path_to_wsddn_pretrained_model

By default, trained networks are saved under:

output/[NET]/[DATASET]/default/

Test outputs are saved under:

output/[NET]/[DATASET]/default/[SNAPSHOT]/

Tensorboard information for train and validation is saved under:

tensorboard/[NET]/[DATASET]/default/
tensorboard/[NET]/[DATASET]/default_val/