Slicing Convolutional Neural Network for Crowd Video Understanding

This is the source code for "Slicing Convolutional Neural Network for Crowd Video Understanding". It aims to learn generic spatio-temporal features from crowd videos, with an emphasis on long-term temporal learning (i.e. across 100 frames).

Overview

Three-branch Slicing CNN model (i.e. xy-, xt-, and yt-branch)

Crowd attribute recognition (i.e. 94 crowd-related attributes)
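
As an illustration of what the three branches operate on, the sketch below shows how xy-, xt-, and yt-slices can be read from a single video volume by swapping its dimensions. It is a minimal NumPy sketch, not the repository's Caffe layer: the function name `slice_views`, the (T, H, W) axis order, and the toy volume size are all assumptions made for the example.

```python
import numpy as np


def slice_views(volume):
    """Return xy-, xt-, and yt-oriented views of a (T, H, W) video volume.

    Hypothetical helper for illustration only. The leading axis of each
    result indexes the slices of that orientation.
    """
    xy = volume                            # (T, H, W): one H x W slice per frame t
    xt = np.transpose(volume, (1, 0, 2))   # (H, T, W): one T x W slice per row y
    yt = np.transpose(volume, (2, 0, 1))   # (W, T, H): one T x H slice per column x
    return xy, xt, yt


# Toy volume: 100 frames of 4 x 6 pixels (sizes chosen arbitrarily).
video = np.arange(100 * 4 * 6).reshape(100, 4, 6)
xy, xt, yt = slice_views(video)
print(xy.shape, xt.shape, yt.shape)
```

Each view shares memory with the original volume, so swapping dimensions this way costs no copy; a 2D filter applied to the trailing two axes of `xt` or `yt` then mixes one spatial axis with time.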

Project Site

Caffe

A fork of the well-known Caffe framework that adds multi-GPU training and a Dimension Swap layer.

Apart from the official installation prerequisites, we have several other dependencies:

  1. Install OpenMPI to enable multi-GPU training
  2. Install the required Python packages (e.g. numpy, scipy, scikit-image, etc.)
  3. Add export PYTHONPATH="[path_python_layer]:$PYTHONPATH" to ~/.bashrc and restart the terminal. Here [path_python_layer] is the absolute path of the directory that contains the Python script py_dim_swap_layer.py.
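
To confirm that the PYTHONPATH entry from step 3 took effect, a quick check along these lines may help. It is only a sketch: the helper name `is_importable` is made up for this example, and it relies on the standard-library `importlib.util.find_spec`, which looks a module up without importing it.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate a top-level module with this name."""
    # find_spec returns None when the module is not found on sys.path.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Module name taken from the script name in step 3 (without ".py").
    name = "py_dim_swap_layer"
    if is_importable(name):
        print(name + " is on the Python path")
    else:
        print(name + " was not found; check PYTHONPATH in ~/.bashrc")
```

Run it in a fresh terminal so that the updated ~/.bashrc has been read.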

Get the Caffe code

git clone --recursive https://github.com/amandajshao/Slicing-CNN.git

Files

Related Projects

Deeply Learned Attributes for Crowd Scene Understanding

Thanks

Citation

J. Shao, C. C. Loy, K. Kang, and X. Wang. Slicing Convolutional Neural Network for Crowd Video Understanding. Computer Vision and Pattern Recognition (CVPR), 2016.

@inproceedings{shao2016scnn,
  title={Slicing Convolutional Neural Network for Crowd Video Understanding},
  author={Shao, Jing and Loy, Chen Change and Kang, Kai and Wang, Xiaogang},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2016}
}