Update: A Detectron2 version of MatrixNets is coming out soon.

MatrixNets

MatrixNet is a scale- and aspect-ratio-aware deep learning architecture for object detection. We implemented MatrixNets with both anchors (centers) and corners. For more details, please refer to the papers linked below.

<p align="center"> <img src="https://github.com/arashwan/matrixnet/blob/master/images/figure5.png" height="250px"> </p>

We have two implementations, based on Corners and Anchors (Centers):

<p align="center"> <img src="https://github.com/arashwan/matrixnet/blob/master/images/figure6.png" height="200px" caption= "Corners"> <img src="https://github.com/arashwan/matrixnet/blob/master/images/figure_centers.png" height="200px"> </p>

Training and Evaluation Code

Code for reproducing the results in the following papers:

Matrix Nets (ICCV'19) (short paper)
Matrix Nets (long paper)

Selecting Layers in MatrixNets

MatrixNets lets you choose which layers to use for training and inference. Although we used a 19-layer MatrixNet in the paper, the implementation here accepts any MatrixNet design, specified by setting the layer_range variable in the config file. The layer_range is defined as a 3D matrix where the outer matrix is 5x5, and each entry is either a 1D array [y_min, y_max, x_min, x_max] or -1 if the layer should not be included.

Example 1:

In the paper, we use a 19-layer MatrixNet by ignoring the left top and bottom right corners of the 5x5 matrix. The range for the base layer (top left) is [24,48,24,48].

The corresponding layer range would look like:

[[[0,48,0,48],[48,96,0,48],[96,192,0,48], -1, -1],
 [[0,48,48,96],[48,96,48,96],[96,192,48,96],[192,384,0,96], -1],
 [[0,48,96,192],[48,96,96,192],[96,192,96,192],[192,384,96,192],[384,2000,96,192]],
 [-1, [0,96,192,384],[96,192,192,384],[192,384,192,384],[384,2000,192,384]],
 [-1, -1, [0,192,384,2000],[192,384,384,2000],[384,2000,384,2000]]]

Note that we extended the range for the layers on the boundary to include any objects that are out of range.
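As a rough illustration of how such a matrix can be read, the Python sketch below counts the active layers in the Example 1 matrix and looks up which layer covers an object of a given size. The helper names `active_layers` and `layers_for_object` are hypothetical and not part of the repository. The sketch also assumes that y bounds the object height, x bounds the object width, and ranges are half-open; the actual matching rule in the code may differ.

```python
# Layer ranges copied from Example 1; -1 marks a layer that is left out.
LAYERS_RANGE = [
    [[0, 48, 0, 48], [48, 96, 0, 48], [96, 192, 0, 48], -1, -1],
    [[0, 48, 48, 96], [48, 96, 48, 96], [96, 192, 48, 96], [192, 384, 0, 96], -1],
    [[0, 48, 96, 192], [48, 96, 96, 192], [96, 192, 96, 192],
     [192, 384, 96, 192], [384, 2000, 96, 192]],
    [-1, [0, 96, 192, 384], [96, 192, 192, 384],
     [192, 384, 192, 384], [384, 2000, 192, 384]],
    [-1, -1, [0, 192, 384, 2000], [192, 384, 384, 2000], [384, 2000, 384, 2000]],
]


def active_layers(layers_range):
    """Return the (row, col) index of every entry that is not -1."""
    return [
        (i, j)
        for i, row in enumerate(layers_range)
        for j, entry in enumerate(row)
        if entry != -1
    ]


def layers_for_object(layers_range, height, width):
    """Return the indices of layers whose range contains the object size.

    Assumption (not confirmed by the README): y bounds the height,
    x bounds the width, and ranges are half-open (min <= value < max).
    """
    matches = []
    for i, j in active_layers(layers_range):
        y_min, y_max, x_min, x_max = layers_range[i][j]
        if y_min <= height < y_max and x_min <= width < x_max:
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    print(len(active_layers(LAYERS_RANGE)))          # number of layers in use
    print(layers_for_object(LAYERS_RANGE, 30, 30))   # small square object
```

Counting the non -1 entries is a quick sanity check that a custom layer_range describes the design you intended before starting a long training run.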

Performance

The following table gives the AP for Centers (Anchors) and Corners with different backbones (from the paper):

| Backbone | Centers | Corners |
| --- | --- | --- |
| Resnet-50-X | 41.0 | 41.3 |
| Resnet-101-X | 42.3 | 42.3 |
| Resnet-152-X | 43.6 | 44.7 |

Note that these numbers are reported from the validation set, whereas the final numbers in the paper are reported on the test set.

Sample Images with Detections for both Architectures

(Image: sample detections for both architectures.)

Getting Started

Installing Packages

Using Conda

Please first install Anaconda and create an Anaconda environment using the provided package list.

conda create --name matrixnets --file packagelist_conda.txt

After one creates the environment, activate it.

source activate matrixnets

Using Pip

Alternatively, one can use pip to install all packages from the requirements file. Note that we use Python 3.6+, Torch 1.2.0, and torchvision 0.4.0.

pip install -r requirements.txt

Our current implementation only supports GPU, so one needs a GPU with CUDA (9+) installed on the machine.

Compiling NMS

You also need to compile the NMS code (originally from Faster R-CNN and Soft-NMS).

cd <Matrixnet dir>/external
make

Downloading MS COCO Data

Training and Evaluation

To train and evaluate a network, one needs to create a configuration file, which defines the hyperparameters, and a model file, which defines the network architecture. The configuration file should be in JSON format and placed in config/. Each configuration file should have a corresponding model file in models/ (specified by model_name in the config file).
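The relationship between a configuration file and its model file can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, model_name is assumed to be a top-level key of the JSON file, and model files are assumed to end in .py. The repository's actual loader may organize the config differently.

```python
import json
import os


def config_path(config_name, root="."):
    """Path of the JSON config for a name such as the one passed to train.py."""
    return os.path.join(root, "config", config_name + ".json")


def model_path(config, root="."):
    """Path of the model file named by the config.

    Assumptions: model_name is a top-level key and model files end in .py.
    """
    return os.path.join(root, "models", config["model_name"] + ".py")


def load_config(config_name, root="."):
    """Read and parse the JSON config file for config_name."""
    with open(config_path(config_name, root)) as f:
        return json.load(f)


if __name__ == "__main__":
    # A stand-in config parsed from a string instead of a file on disk.
    cfg = json.loads('{"model_name": "MatrixNetsCorners"}')
    print(config_path("MatrixNetsCornersResnet50"))
    print(model_path(cfg))
```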

To train a model:

python train.py <config_file>

We provide four configuration files under the config directory for training both MatrixNetCorners and MatrixNetCenters.

To train MatrixNets:

python train.py MatrixNetsCornersResnet50

To evaluate the trained model:

python test.py <config_file> --testiter <iter> --split validation

The --debug flag can be used to save the first 200 images with detections under the results directory.

Pretrained Models

We provide pre-trained models for Resnet-50 and Resnet-152 for both Anchors and Corners.

Resnet-50+Corners
Resnet-152+Corners
Resnet-50+Anchors
Resnet-152+Anchors

Please copy the pre-trained models into the following directory under matrixnets.

'matrixnets/<cache_dir>/nnet/<model_name>/<name>'

Here cache_dir is the name of the directory specified in config.json, and name should be in the format <model_iters.pkl>.
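For illustration, the path format above could be assembled in Python as shown below. The function name `pretrained_model_path` is hypothetical, and the sketch assumes that <model_iters.pkl> means the model name, an underscore, and the iteration count; check the file names of the downloaded models before relying on this.

```python
import os


def pretrained_model_path(cache_dir, model_name, iters, root="matrixnets"):
    """Build matrixnets/<cache_dir>/nnet/<model_name>/<name>.

    Assumption: <name> follows the pattern '<model_name>_<iters>.pkl'.
    """
    name = "{}_{}.pkl".format(model_name, iters)
    return os.path.join(root, cache_dir, "nnet", model_name, name)


if __name__ == "__main__":
    # Placeholder values; use the cache_dir and model_name from your config.
    print(pretrained_model_path("cache", "MatrixNetsCorners", 1000))
```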

Note that the results might differ slightly from the paper (+/- 0.2 mAP) since we reproduced all experiments using only 4 GPUs. We could not fit the batch size of 23 for the anchors experiments, so we ran them for more iterations to compensate for the smaller batch size.

List of available configuration options:

| Option | Meaning | Allowed Values |
| --- | --- | --- |
| dataset | Specify standard data | MSCOCO |
| batch_size | Specify batch size | At least 1 |
| chunk_sizes | Size of chunk as an array of dim #GPU that sums to batch_size | |
| model_name | Specify model (also picks the sampling function with the same name) | MatrixNetsCorners, MatrixNetAnchors |
| train_split | Specify train set | |
| val_split | Specify validation set | |
| opt_algo | Specify optimization algorithm | adam |
| learning_rate | Specify learning rate | |
| decay_rate | Specify learning rate decay | |
| max_iter | Maximum number of iterations | |
| stepsize | Number of iterations for each learning rate decay | |
| snapshot | Snapshot interval | |
| cache_dir | Directory to store snapshots | |
| data_dir | Directory where data is stored | |
| rand_scale_min | Random scaling minimum limit | |
| rand_scale_max | Random scaling maximum limit | |
| rand_scale_step | Random scaling steps | |
| rand_scales | Random scaling | |
| rand_crop | Random cropping | |
| rand_color | Random colouring | |
| gaussian_bump | Gaussian bump | |
| gaussian_iou | IOU | |
| input_size | Training image size | 1D list that looks like [width, height] |
| output_kernel_size | Helps smooth the heatmaps to get the max detections | |
| base_layer_range | The input size for images in matrixnetanchors | [y_min, y_max, x_min, x_max] |
| layers_range | 3D matrix of layer ranges, with -1 indicating which layers to ignore | |
| test_image_max_dim | Max dim of input image | |
| test_scales | Test scales (if you want to test using multiple scales) | List of scales |
| test_flip_images | Flip flag | True, False |
| cutout | Cutout flag | True, False |
| top_k | Number of top k detections per layer | Integer |
| categories | Number of classes | |
| matching_threshold | Matching threshold | |
| nms_threshold | NMS threshold | |
| max_per_image | Max detections per image | |
| merge_bbox | Merge bbox flag | True, False |
| weight_exp | Exponential weighting specification | |
| backbone | Backbone for Matrix Nets | resnet50, resnet100, resnet152, resnext101 |
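The constraint on chunk_sizes (one entry per GPU, summing to batch_size) is easy to get wrong when changing the number of GPUs. The Python sketch below shows one way to check it and to split a batch accordingly. The function names are hypothetical and this is not the repository's own code.

```python
def validate_chunk_sizes(batch_size, chunk_sizes):
    """Raise ValueError unless chunk_sizes is consistent with batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if any(size < 0 for size in chunk_sizes):
        raise ValueError("chunk sizes must be non-negative")
    if sum(chunk_sizes) != batch_size:
        raise ValueError(
            "chunk_sizes sum to {} but batch_size is {}".format(
                sum(chunk_sizes), batch_size
            )
        )


def split_batch(items, chunk_sizes):
    """Split a list of batch items into consecutive chunks, one per GPU."""
    validate_chunk_sizes(len(items), chunk_sizes)
    chunks, start = [], 0
    for size in chunk_sizes:
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Five items split across two GPUs as 2 + 3.
    print(split_batch(list(range(5)), [2, 3]))
```

Running the check when the config is loaded surfaces a mismatch immediately instead of partway through the first training iteration.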

Contributions

Contributions to this project are welcome. Please make a pull request and we will attend to it as soon as possible.

Also, if you extend this model to other datasets or build cool projects using it, we'd love to hear from you.

Acknowledgements

Our code is based on CornerNet.