Pytorch Realtime Multi-Person Pose Estimation

This is a PyTorch version of Realtime Multi-Person Pose Estimation. The original code is at https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation

Introduction

Code for reproducing the CVPR 2017 oral paper in PyTorch.

Results

<div align='center'> <img src="https://github.com/last-one/pytorch_realtime_multi-person_pose_estimation/blob/master/testing/ski.jpg" width="300" height="300"> &nbsp; <img src="https://github.com/last-one/pytorch_realtime_multi-person_pose_estimation/blob/master/testing/result.png" width="300" height="300"> </div>

The result was generated by a model trained for 30 epochs.

Contents

1. preprocessing: scripts for preprocessing the data.

2. training: scripts for training the networks.

3. testing: the test script and an example.

4. caffe2pytorch: the script for converting the Caffe model to PyTorch.

5. caffe_model: the Caffe model.

Requirements

PyTorch: 0.2.0_3

Caffe: needed only if you want to convert the caffemodel yourself.

Instructions

Mytransforms.py: the data transforms.

Each transform is applied to the image, mask, keypoints, and center points together, so they stay aligned.
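As an illustration of transforming everything together, here is a minimal sketch of a joint crop. The function name `crop_together`, its arguments, and the array layouts are hypothetical and are not taken from Mytransforms.py; it only assumes keypoints and center points are stored as (x, y) pairs.

```python
import numpy as np

def crop_together(image, mask, keypoints, centers, x0, y0, width, height):
    """Crop image and mask to one window and shift the coordinates to match."""
    # Slice the same window out of both arrays (rows are y, columns are x).
    image_c = image[y0:y0 + height, x0:x0 + width]
    mask_c = mask[y0:y0 + height, x0:x0 + width]
    # Keypoints and centers are (x, y) pairs, so subtract the window origin.
    offset = np.array([x0, y0], dtype=np.float32)
    keypoints_c = np.asarray(keypoints, dtype=np.float32) - offset
    centers_c = np.asarray(centers, dtype=np.float32) - offset
    return image_c, mask_c, keypoints_c, centers_c

# Hypothetical sample data: a blank 120x100 image with two keypoints and one center.
image = np.zeros((100, 120, 3), dtype=np.uint8)
mask = np.ones((100, 120), dtype=np.uint8)
keypoints = [[60.0, 50.0], [70.0, 40.0]]
centers = [[65.0, 45.0]]
img_c, mask_c, kpts_c, ctr_c = crop_together(image, mask, keypoints, centers, 20, 10, 64, 64)
print(img_c.shape, mask_c.shape, kpts_c.tolist(), ctr_c.tolist())
```

If only the image were cropped, the keypoints would point at the wrong pixels, which is why the four inputs move as one unit.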

CocoFolder.py: reads the data for the network.

It generates the PAF vectors and the heatmaps when an image is loaded.
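A keypoint heatmap of this kind is commonly built by placing a 2D Gaussian at the keypoint location. The sketch below shows that idea only; the function name `gaussian_heatmap`, the 46x46 map size, and the `sigma` value are assumptions, not values read from CocoFolder.py.

```python
import numpy as np

def gaussian_heatmap(height, width, center_x, center_y, sigma=7.0):
    """Return a (height, width) map that peaks at 1.0 on the keypoint."""
    # Pixel coordinate grids: ys holds row indices, xs holds column indices.
    ys, xs = np.mgrid[0:height, 0:width]
    # Squared distance from every pixel to the keypoint.
    d2 = (xs - center_x) ** 2 + (ys - center_y) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2)).astype(np.float32)

# Hypothetical keypoint at x=20, y=12 on an assumed 46x46 output grid.
heatmap = gaussian_heatmap(46, 46, center_x=20, center_y=12)
print(heatmap.shape, float(heatmap[12, 20]))
```

When several people are in the image, the per-person maps for one body part are typically merged into a single channel, for example by taking the pixel-wise maximum.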

The PAF vectors use the following format:

POSE_COCO_PAIRS = {
	{3,  4},
	{4,  5},
	{6,  7},
	{7,  8},
	{9,  10},
	{10, 11},
	{12, 13},
	{13, 14},
	{1,  2},
	{2,  9},
	{2,  12},
	{2,  3},
	{2,  6},
	{3,  17},
	{6,  18},
	{1,  16},
	{1,  15},
	{16, 17},
	{15, 18},
}

Each index is the key of the corresponding part in POSE_COCO_BODY_PARTS.
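To make the pair list concrete, the sketch below restates it as Python tuples and fills the two PAF channels (x and y components) for one limb with the unit vector pointing from the first part to the second. With 19 pairs this gives 2 x 19 = 38 PAF channels in total. The function `limb_paf`, the `limb_width` threshold, the 46x46 grid, and the sample keypoints are hypothetical; the repo's own generation code may differ.

```python
import numpy as np

# The pair list from above, restated as Python tuples.
POSE_COCO_PAIRS = [
    (3, 4), (4, 5), (6, 7), (7, 8), (9, 10), (10, 11), (12, 13), (13, 14),
    (1, 2), (2, 9), (2, 12), (2, 3), (2, 6), (3, 17), (6, 18), (1, 16),
    (1, 15), (16, 17), (15, 18),
]

def limb_paf(height, width, start, end, limb_width=1.0):
    """Return a (2, height, width) field holding the limb's unit vector near the limb."""
    start = np.asarray(start, dtype=np.float32)
    end = np.asarray(end, dtype=np.float32)
    length = np.linalg.norm(end - start)
    paf = np.zeros((2, height, width), dtype=np.float32)
    if length == 0:
        return paf  # Degenerate limb: both parts at the same pixel.
    unit = (end - start) / length
    ys, xs = np.mgrid[0:height, 0:width]
    dx, dy = xs - start[0], ys - start[1]
    # Projection along the limb and perpendicular distance from it.
    along = dx * unit[0] + dy * unit[1]
    across = np.abs(dx * unit[1] - dy * unit[0])
    on_limb = (along >= 0) & (along <= length) & (across <= limb_width)
    paf[0][on_limb] = unit[0]
    paf[1][on_limb] = unit[1]
    return paf

# Hypothetical keypoints, keyed by the part indices used in the pair list.
keypoints = {3: (10.0, 10.0), 4: (30.0, 10.0)}
first, second = POSE_COCO_PAIRS[0]
paf = limb_paf(46, 46, keypoints[first], keypoints[second])
print(len(POSE_COCO_PAIRS), paf.shape)
```

Keying the sample keypoints by the indices exactly as they appear in the list avoids assuming any particular index base for POSE_COCO_BODY_PARTS.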

utils.py: common functions, such as adjusting the learning rate and reading the configuration.
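For readers unfamiliar with learning-rate adjustment, here is a minimal sketch of one common policy, step decay. The policy choice, the function names, and the numbers are assumptions and may not match utils.py. The `param_groups` list mirrors the list of dicts that a PyTorch optimizer exposes as `optimizer.param_groups`.

```python
def step_decay_lr(base_lr, iteration, step_size, gamma):
    """Multiply the base rate by gamma once every step_size iterations."""
    return base_lr * (gamma ** (iteration // step_size))

def apply_lr(param_groups, lr):
    """Write the new rate into every parameter group."""
    for group in param_groups:
        group['lr'] = lr

# Stand-in for optimizer.param_groups so the sketch runs without PyTorch.
param_groups = [{'lr': 0.1}]
new_lr = step_decay_lr(base_lr=0.1, iteration=25, step_size=10, gamma=0.5)
apply_lr(param_groups, new_lr)
print(param_groups)
```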

visualize_input.ipynb: a notebook for verifying that the preprocessing and the generated heatmaps and vectors are valid. It shows some examples.

pose_estimation.py: the network structure.

The first 10 layers are the same as in VGG-19, so if pretrained is set to True, they are initialized from VGG-19. The network has 6 stages. The first stage has 5 layers (three 3x3 conv + two 1x1 conv) and each remaining stage has 7 layers (five 3x3 conv + two 1x1 conv).

TODO: make the number of stages adjustable.
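The stage layout described above can be written down as a small layer plan. This is a sketch in plain Python rather than the repo's model code: `stage_layers` and `NUM_STAGES` are hypothetical names, the padding values are an assumption (chosen so a convolution would keep the spatial size), and channel widths are left out because the description does not give them.

```python
NUM_STAGES = 6  # Number of stages stated in the description.

def stage_layers(stage_index):
    """Return (kernel_size, padding) for each conv layer of one stage (1-based)."""
    # Stage 1 uses three 3x3 convs; every later stage uses five.
    n_3x3 = 3 if stage_index == 1 else 5
    # Assumed padding: 1 for 3x3 and 0 for 1x1, which preserves height and width.
    return [(3, 1)] * n_3x3 + [(1, 0)] * 2

plan = {stage: stage_layers(stage) for stage in range(1, NUM_STAGES + 1)}
for stage, layers in plan.items():
    print(stage, len(layers), layers)
```

Because the plan is generated from `NUM_STAGES`, changing that one constant is enough to describe a network with a different number of stages.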

Training steps

Notice

Citation

Please cite the paper in your publications if it helps your research:

@InProceedings{cao2017realtime,
	title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
	author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
	booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
	year = {2017}
	}

License

The repo is freely available for non-commercial use. Please see the license for further details.