Information

This package contains a MATLAB implementation of the Pose-based CNN (P-CNN) algorithm described in [1]. It includes the pre-trained VGG-F appearance CNN model [2], a MATLAB version of the flow model of [3], and the optical flow implementation of [4]. The CNN implementation uses the MatConvNet library [5]. The project webpage is http://www.di.ens.fr/willow/research/p-cnn/ .

To run this package:

demo.m

An example of P-CNN computation is given in this package. It computes P-CNN features for a few videos of the JHMDB dataset [6] (for 2 different splits) using ground-truth pose annotations. The reproduce_ICCV15_results command reproduces the P-CNN results reported in [1]. Because we wanted to provide a full MATLAB implementation, we converted all the code to MATLAB. This yields a slightly different result (-0.9% accuracy) from the published version, due to the switch of CNN package and retraining.

The provided algorithm takes as input the frames of a video and their corresponding pose joints (from ground-truth annotation or from your favorite pose detector). The package includes a demo.m file that you should be able to run directly.
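To make the "frames plus pose joints" input more concrete, here is a minimal sketch of cropping a square patch around each joint of one frame. The data layout (a 2 x J joint matrix), the patch size, and all variable names are assumptions for illustration only; they are not the package's actual format or cropping code.

```matlab
% Hypothetical input layout (an assumption, not the package's actual format):
%   frame  : H x W x 3 uint8 image
%   joints : 2 x J matrix, row 1 = x (column), row 2 = y (row), in pixels
frame  = uint8(randi([0 255], 240, 320, 3));   % synthetic stand-in for a video frame
joints = [160 20; 120 230];                    % two example joints, given as (x; y)

side = 40;                                     % hypothetical patch side, in pixels
half = floor(side / 2);

patches = cell(1, size(joints, 2));
for j = 1:size(joints, 2)
    cx = round(joints(1, j));
    cy = round(joints(2, j));
    % Clamp the top-left corner so the window stays inside the frame.
    x1 = min(max(cx - half, 1), size(frame, 2) - side + 1);
    y1 = min(max(cy - half, 1), size(frame, 1) - side + 1);
    patches{j} = frame(y1:y1 + side - 1, x1:x1 + side - 1, :);
end
```

This sketch shifts the window when a joint lies near the image border. A real implementation might instead pad the frame, so treat the border handling as one possible design choice.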

Datasets

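Two datasets were used in our ICCV'15 paper: JHMDB [6] and MPII Cooking Activities [7]. Dataset-specific settings are defined through the param structure: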

param.lhandposition=11;
param.rhandposition=6;
param.upbodypositions=1:13;
param.lside = 120 ;

and in compute_pcnn_features.m:

param.partids = [1 2 3 4] ; % don't use full body part
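As an illustration of how such settings might be consumed, the sketch below assumes the position parameters are column indices into a 2 x J matrix of joint coordinates. That interpretation, the synthetic pose, and the variable names other than the param fields are assumptions, not taken from the package.

```matlab
% Parameter values copied from the settings above.
param.lhandposition   = 11;
param.rhandposition   = 6;
param.upbodypositions = 1:13;

% Hypothetical pose for one frame: 2 x J matrix of (x; y) joint coordinates.
J = 13;
joints = [linspace(50, 250, J); linspace(40, 200, J)];

% Assumed interpretation: the parameters index columns of the joint matrix.
lhand  = joints(:, param.lhandposition);
rhand  = joints(:, param.rhandposition);
upbody = joints(:, param.upbodypositions);

% Axis-aligned bounding box [xmin ymin xmax ymax] enclosing the upper-body joints.
bbox = [min(upbody, [], 2)' max(upbody, [], 2)'];
```

With a different pose format, only the index values would need to change, which is presumably why they are exposed as parameters.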

Cite

If you use this package, please cite:

@inproceedings{cheronICCV15,
  TITLE = {{P-CNN: Pose-based CNN Features for Action Recognition}},
  AUTHOR = {Ch{\'e}ron, Guilhem and Laptev, Ivan and Schmid, Cordelia},
  BOOKTITLE = {ICCV},
  YEAR = {2015},
}

References

[1] G. Chéron, I. Laptev, C. Schmid. P-CNN: Pose-based CNN Features for Action Recognition. ICCV 2015.

[2] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. BMVC 2014.

[3] G. Gkioxari and J. Malik. Finding action tubes. CVPR 2015.

[4] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert. High accuracy optical flow estimation based on a theory for warping. ECCV 2004.

[5] A. Vedaldi and K. Lenc. MatConvNet - Convolutional Neural Networks for MATLAB. ACM Multimedia 2015.

[6] H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black. Towards understanding action recognition. ICCV 2013.

[7] M. Rohrbach, S. Amin, M. Andriluka and B. Schiele. A Database for Fine Grained Activity Detection of Cooking Activities. CVPR 2012.

Acknowledgements

We gratefully thank the authors of the previous code releases and video benchmarks for making them publicly available.