
LDDP: Learning Detection with Diverse Proposals

By Samaneh Azadi, Jiashi Feng, Trevor Darrell at UC Berkeley.

Introduction:

LDDP predicts a set of diverse and informative proposals with enriched representations, and can be used to augment object detection architectures. It considers both label-level contextual information and spatial layout relationships between object proposals without increasing the number of network parameters, and thereby substantially improves the location and category specifications of the final detected bounding boxes during both training and inference. This implementation is built on the Faster R-CNN framework but can be adapted to other detection architectures. For more information on LDDP, please refer to the arXiv preprint, which will be published at CVPR 2017.

License

LDDP is licensed for open non-commercial distribution under the UC Regents license; see LICENSE. Its dependencies, such as Caffe and Faster R-CNN, are subject to their own respective licenses.

Citing LDDP

If you find LDDP useful in your research, please cite:

@article{azadi2017learning,
  title={Learning Detection with Diverse Proposals},
  author={Azadi, Samaneh and Feng, Jiashi and Darrell, Trevor},
  journal={arXiv preprint arXiv:1704.03533},
  year={2017}
}

Requirements and installation instructions are similar to those of Faster R-CNN; they are repeated here for your convenience.

Requirements: software

Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download my Makefile.config for reference.
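
Before building, it can help to confirm that both flags are really uncommented. The following is a minimal sketch, not part of LDDP or Caffe: the function name `enabled_flags` is hypothetical, and it assumes the flags are written as simple `NAME := 1` assignments as shown above.

```python
import re


def enabled_flags(makefile_text):
    """Return the names of variables set to 1 on uncommented lines."""
    flags = set()
    for line in makefile_text.splitlines():
        # Drop comments; a line that starts with '#' becomes empty.
        code = line.split("#", 1)[0].strip()
        match = re.match(r"^(\w+)\s*:?=\s*1$", code)
        if match:
            flags.add(match.group(1))
    return flags


if __name__ == "__main__":
    sample = "# WITH_PYTHON_LAYER := 1\nUSE_CUDNN := 1\n"
    # WITH_PYTHON_LAYER is still commented out in this sample.
    print(sorted(enabled_flags(sample)))  # ['USE_CUDNN']
```

To check a real file, read the contents of your Makefile.config and verify that `WITH_PYTHON_LAYER` is in the returned set.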

Python packages you might not have: cython, python-opencv, easydict
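
A quick way to see which of these packages are missing is to try importing them. This is a hedged sketch: `missing_packages` is a hypothetical helper, and the mapping from package name to import name (`Cython`, `cv2`, `easydict`) is an assumption about how these packages are usually installed.

```python
import importlib

# Assumed mapping: package name as listed above -> importable module name.
REQUIRED = {"cython": "Cython", "python-opencv": "cv2", "easydict": "easydict"}


def missing_packages(required):
    """Return the package names whose modules cannot be imported."""
    missing = []
    for package, module in sorted(required.items()):
        try:
            importlib.import_module(module)
        except ImportError:
            missing.append(package)
    return missing


if __name__ == "__main__":
    # An empty list means all three packages were importable.
    print(missing_packages(REQUIRED))
```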

Requirements: hardware

Hardware requirements are similar to those for running Faster R-CNN.

Installation

Clone the LDDP repository

# Make sure to clone with --recursive
git clone --recursive https://github.com/azadis/LDDP.git

We'll call the directory that you cloned LDDP into LDDP_ROOT.

Build the Cython modules
```
cd $LDDP_ROOT/py-faster-rcnn/lib
make
```

Build Caffe and pycaffe

cd $LDDP_ROOT/py-faster-rcnn/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
#   http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe

Installation for training and testing models

Download the training, validation, and test data, along with the VOCdevkit

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

Extract all of these tars into one directory named VOCdevkit

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar

It should have this basic structure

$VOCdevkit/                           # development kit
$VOCdevkit/VOCcode/                   # VOC utility code
$VOCdevkit/VOC2007                    # image sets, annotations, etc.
# ... and several other directories ...
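
The layout above can be checked with a short script before creating the symlinks. This is a minimal sketch: `missing_voc_dirs` is a hypothetical helper, and it only checks the two subdirectories named in the listing, not the full contents of the dataset.

```python
import os
import tempfile

# Subdirectories named in the layout above.
EXPECTED_SUBDIRS = ("VOCcode", "VOC2007")


def missing_voc_dirs(devkit_root, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that are absent under devkit_root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(devkit_root, name))]


if __name__ == "__main__":
    # Demonstration on a throwaway directory with only VOCcode present.
    demo_root = tempfile.mkdtemp()
    os.mkdir(os.path.join(demo_root, "VOCcode"))
    print(missing_voc_dirs(demo_root))  # ['VOC2007']
```

Pass the path of your extracted VOCdevkit directory; an empty list means both subdirectories were found.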

Create symlinks for the PASCAL VOC dataset
```
cd $LDDP_ROOT/py-faster-rcnn/data
ln -s $VOCdevkit VOCdevkit2007
```
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
[Optional] Follow similar steps to get PASCAL VOC 2010 and 2012.
[Optional] If you want to use COCO, please see the notes here.
Follow the next sections to download pre-trained ImageNet models.

Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the networks described in the paper, including ZF and VGG16.

cd $LDDP_ROOT/py-faster-rcnn
./data/scripts/fetch_imagenet_models.sh

Usage

To train and test the LDDP end-to-end detection framework:

cd $LDDP_ROOT/py-faster-rcnn
./experiments/scripts/LDDP_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701 TRAIN.SCALES [400,500,600,700]
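
To make the `--set` syntax concrete, the sketch below shows one plausible reading of the KEY VALUE pairs: dotted keys address nested options, and values are parsed as Python literals where possible. It is a simplified illustration with a hypothetical function name, not the parser in fast_rcnn.config, which may also check keys and types against its defaults.

```python
from ast import literal_eval


def parse_set_args(pairs):
    """Turn ['KEY', 'value', 'A.B', 'value', ...] into a nested dict."""
    if len(pairs) % 2 != 0:
        raise ValueError("--set expects KEY VALUE pairs")
    config = {}
    for key, raw in zip(pairs[0::2], pairs[1::2]):
        try:
            value = literal_eval(raw)  # numbers, lists, tuples, ...
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings such as directory names
        node = config
        parts = key.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config


if __name__ == "__main__":
    args = ["EXP_DIR", "seed_rng1701", "RNG_SEED", "1701",
            "TRAIN.SCALES", "[400,500,600,700]"]
    print(parse_set_args(args))
```

Under this reading, `RNG_SEED` becomes the integer 1701 and `TRAIN.SCALES` becomes a list of four integers, while `EXP_DIR` stays a string.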

Trained LDDP networks are saved under:

output/<experiment directory>/<dataset name>/

Test outputs are saved under:

output/<experiment directory>/<dataset name>/<network snapshot name>/

Semantic similarity matrices used in the paper are stored as pickle files at:

$LDDP_ROOT/data
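
A pickle file from that directory could be inspected along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, the file names are not given here so the path is left to the caller, and the squareness check assumes each file holds a single class-by-class matrix. The `latin1` fallback is a common workaround for pickles written by Python 2 and read by Python 3.

```python
import os
import pickle
import tempfile


def load_similarity(path):
    """Load a pickled object, retrying with latin1 for Python 2 pickles."""
    with open(path, "rb") as handle:
        try:
            return pickle.load(handle)
        except UnicodeDecodeError:
            handle.seek(0)
            return pickle.load(handle, encoding="latin1")


def is_square(matrix):
    """Check that a 2-D sequence has as many columns as rows."""
    rows = len(matrix)
    return rows > 0 and all(len(row) == rows for row in matrix)


if __name__ == "__main__":
    # Demonstration with a made-up 2x2 matrix, not data from the paper.
    fd, demo_path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    with open(demo_path, "wb") as handle:
        pickle.dump([[1.0, 0.2], [0.2, 1.0]], handle)
    print(is_square(load_similarity(demo_path)))  # True
    os.remove(demo_path)
```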

An example IPython notebook for generating semantic similarity matrices for the PASCAL VOC and COCO datasets is located at:

$LDDP_ROOT/tools/Semantic_Similarity.ipynb