Leveraging Unlabeled Data for Crowd Counting by Learning to Rank

The paper will appear in CVPR 2018. An arXiv pre-print version is available.

The updated version has been accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence. An arXiv pre-print version is also available.

Citation

Please cite our paper if you are inspired by the idea.

@inproceedings{xialei2018crowd,
title={Leveraging Unlabeled Data for Crowd Counting by Learning to Rank},
author={Liu, Xialei and van de Weijer, Joost and Bagdanov, Andrew D},
booktitle={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2018},
url = {https://github.com/xialeiliu/CrowdCountingCVPR18}
}

and

@ARTICLE{8642842, 
author={X. {Liu} and J. {Van De Weijer} and A. D. {Bagdanov}}, 
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
title={Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank}, 
year={2019}, 
pages={1-1}, 
doi={10.1109/TPAMI.2019.2899857}, 
ISSN={0162-8828}, }

Authors

Xialei Liu, Joost van de Weijer and Andrew D. Bagdanov

Institutions

Computer Vision Center, Barcelona, Spain

Media Integration and Communication Center, University of Florence, Florence, Italy

Abstract

We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results.
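The sub-image observation can be illustrated with a minimal sketch, which is not the code in this repository. The function name `nested_crops`, the centered placement, and the `scale` factor are assumptions made for illustration. Each crop lies inside the previous one, so its person count can never be larger.

```python
import numpy as np

def nested_crops(image, num_crops=5, scale=0.75):
    """Return crops ordered largest first; each one lies inside the previous.

    Hypothetical helper: centered crops and a fixed scale factor are
    illustrative assumptions, not the repository's sampling scheme.
    """
    h, w = image.shape[:2]
    top, left = 0, 0
    crops = []
    for _ in range(num_crops):
        crops.append(image[top:top + h, left:left + w])
        new_h, new_w = max(1, int(h * scale)), max(1, int(w * scale))
        # Center the next crop inside the current one so it is a true sub-image.
        top += (h - new_h) // 2
        left += (w - new_w) // 2
        h, w = new_h, new_w
    return crops

# Toy "dot map": 1 marks a person, 0 marks background.
rng = np.random.default_rng(0)
dots = (rng.random((64, 96)) > 0.95).astype(int)
crops = nested_crops(dots, num_crops=4, scale=0.5)
counts = [int(c.sum()) for c in crops]  # non-increasing by construction
```

On unlabeled images the true counts are unknown, but the ordering of the crops is still known, and that ordering is the supervisory signal.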

Framework

The main idea of our approach is to address the limited size of crowd counting datasets by leveraging abundantly available unlabeled crowd imagery in a learning-to-rank framework.
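As a rough sketch of how a ranking constraint can become a training signal, the snippet below uses a standard pairwise ranking hinge loss in NumPy. The function name `ranking_hinge_loss` and the `margin` parameter are illustrative assumptions; the released models are trained in Caffe, and this is not their loss layer.

```python
import numpy as np

def ranking_hinge_loss(count_large, count_small, margin=0.0):
    """Penalize pairs where the sub-image is predicted to hold more people.

    count_large: predicted counts for the larger images.
    count_small: predicted counts for sub-images of those images.
    margin: hypothetical slack term; 0.0 only enforces the ordering.
    """
    count_large = np.asarray(count_large, dtype=float)
    count_small = np.asarray(count_small, dtype=float)
    # Zero loss whenever count_small + margin <= count_large.
    violations = np.maximum(0.0, count_small - count_large + margin)
    return float(np.mean(violations))

ok = ranking_hinge_loss([10.0], [7.0])              # ordering respected
bad = ranking_hinge_loss([5.0], [8.0])              # ordering violated by 3
mixed = ranking_hinge_loss([10.0, 5.0], [7.0, 8.0])
```

In a multi-task setup, a term like this would be added to the density-map regression loss computed on the labeled images.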

Models

Requirements

All training and testing are done in the Caffe framework.

  1. Install the requirements for Caffe and pycaffe (see the Caffe installation instructions). Caffe must be built with support for Python layers!
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1

  2. Download the pre-trained VGG-16 ImageNet model for fine-tuning.

Pre-trained models

The pre-trained models are available to download.

Useful tools

We use the code from here to download and prepare the datasets, generate the density maps, and evaluate the models.