Optimal Naive Bayes Nearest Neighbors (oNBNN)

Optimal NBNN (oNBNN) is a C++ library for classifying objects that are represented as sets of multi-dimensional features, such as images.

To understand how oNBNN works, please refer to the following paper: "Towards Optimal Naive Bayes Nearest Neighbors", Behmo, Marcombes, Dalalyan, Prinet, ECCV 2010, available [here](http://www.minutebutterfly.de/pro).

Executive summary

mkdir build && cd build
cmake ..
make
../examples/example1 ../data/c101_airplanes/ ../data/c101_car_side/ sift

Package organisation

This package is organised in three main parts:

The source code of optimal-nbnn is itself split in two main parts:

If you intend to use optimal-nbnn in your own work, simply link against the dynamic onbnn library (libonbnn.so) and #include <onbnn/onbnn.h>.
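Since the package already uses CMake, linking your own project against the installed library can be sketched with a fragment like the following (project name, source file, and minimum version are placeholders; adjust paths if libonbnn.so and onbnn/onbnn.h were installed to a non-standard prefix):

```cmake
# Hypothetical CMakeLists.txt fragment for a project using onbnn.
cmake_minimum_required(VERSION 2.8)
project(my_classifier)

add_executable(my_classifier main.cpp)

# Link against the dynamic onbnn library (libonbnn.so).
target_link_libraries(my_classifier onbnn)
```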

Dependencies

The onbnn library requires the following dependencies for compilation:

- CMake and a C++ toolchain
- GLPK (GNU Linear Programming Kit)
- Boost.Filesystem (only required to build the example scripts)
- GSL (only required for the multi-probe LSH nearest neighbor search)

To install them under Ubuntu:

sudo apt-get install cmake build-essential libglpk-dev libboost-filesystem-dev libgsl0-dev

Note that if you do not intend to use the multi-probe LSH nearest neighbor search provided in this package, you do not need the GSL library. Moreover, the boost::filesystem library is only required to build the example scripts.

Build

oNBNN uses a cmake-based build system: you run cmake first, and it generates a Makefile suited to your environment.

mkdir build && cd build/
cmake ..
make
sudo make install

You can then run the example1 script, which classifies airplane images vs. car images using both optimal NBNN and standard NBNN (see the Examples section below).

Usage

Datasets

For our examples, we included the first 20 images of each class from the "Fifteen Scene Categories" dataset (available [here](http://www-cvr.ai.uiuc.edu/ponce_grp/data/)).

We resized these images so that they all have the same maximum dimension (400 pixels). We sampled SIFT features from each image using van de Sande's binary utility (available [here](http://staff.science.uva.nl/~ksande/research/colordescriptors/)).

Once you have downloaded van de Sande's binary utility, you can obtain the same text files containing the SIFT (or other) features of each image by running the Ruby scripts located in the data/ folder:

ruby sample_features.rb folder1/ sift
ruby sample_features.rb folder2/ sift

These two commands produce text files containing the image features, with filenames of the form imagename___sift.txt.

Examples

Example 1: binary, multi-channel classification

Usage:

example1 ../data/c101_airplanes/ ../data/c101_car_side/ sift

To best understand how oNBNN is used in practice, it is recommended to take a look at the file example1-binary-classification.cpp.

Before we go any further, let us mention that it is easy to run the example script on your own image data by following these steps:

Once you have gathered the images and feature files, you are ready to start the example script.

All the resources of the library are gathered in the onbnn namespace. onbnn::BinaryClassifier is what you will use to predict the labels of test data. Both training and testing data take the form of onbnn::Object instances.
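To give a feel for how these classes fit together, here is a rough sketch. Apart from onbnn::BinaryClassifier and onbnn::Object, which are named above, every method name below is an illustrative guess, not the library's actual interface; consult onbnn/onbnn.h and example1-binary-classification.cpp for the real calls.

```cpp
// Hypothetical sketch only -- method names are guesses, not the real API.
#include <onbnn/onbnn.h>

int main() {
    // Training and test data are onbnn::Object instances (per the text above).
    onbnn::Object positive, negative, test;
    // ... fill each Object with per-channel (e.g. sift) nearest-neighbor
    // distances of its features, as example1 does ...

    onbnn::BinaryClassifier classifier;
    classifier.add_data(positive, +1);        // hypothetical call
    classifier.add_data(negative, -1);        // hypothetical call
    classifier.train();                       // hypothetical call

    double score = classifier.classify(test); // hypothetical call
    return score > 0 ? 0 : 1;
}
```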

Help

For further help and information, please contact Régis Behmo (onbnn@behmo.com), who is the main maintainer of this project.