Awesome

xR-EgoPose

The xR-EgoPose Dataset has been introduced in the paper "xR-EgoPose: Egocentric 3D Human Pose from an HMD Camera" (ICCV 2019, oral). It is a dataset of ~380 thousand photo-realistic egocentric camera images in a variety of indoor and outdoor spaces.

The code contained in this repository is a PyTorch implementation of the data loader with additional evaluation functions for comparison.

Citation

@inproceedings{tome2019xr,
  title={xR-EgoPose: Egocentric 3D Human Pose from an HMD Camera},
  author={Tome, Denis and Peluse, Patrick and Agapito, Lourdes and Badino, Hernan},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={7728--7738},
  year={2019}
}
@ARTICLE{tome2020self,
  author={D. {Tome} and T. {Alldieck} and P. {Peluse} and G. {Pons-Moll} and L. {Agapito} and H. {Badino} and F. {De la Torre}},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera},
  year={2020},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2020.3029700}
}

The license agreement for the data usage implies citation of the paper. Please notice that citing the dataset URL instead of the publication would not be compliant with this license agreement.

Download on Mac OS and Linux

Make sure pigz and wget are installed:

# on Mac OS
brew install wget pigz
# on Ubuntu
sudo apt-get install pigz

To download and decompress the dataset use the download.sh script:

./download.sh

which will dowload and set-up the dataset folder for training and testing the model. Make sure to have ~1TB free space for storing the data. After that, run demo.py. This shows how to load and evaluate the model.

xR-EgoPose Dataset

Character names in the dataset follow the convention gender_id_body-type_height

gender: male/female
id: integer
body-type: a/f (average/full)
height: a/s (average/short)

Train-set	Test-set	Val-set
female_001_a_a	female_004_a_a	male_008_a_a
female_002_a_a	female_008_a_a
female_002_f_s	female_010_a_a
female_003_a_a	female_012_a_a
female_005_a_a	female_012_f_s
female_006_a_a	male_001_a_a
female_007_a_a	male_002_a_a
female_009_a_a	male_004_f_s
female_011_a_a	male_006_a_a
female_014_a_a	male_007_f_s
female_015_a_a	male_010_a_a
male_003_f_s	male_014_f_s
male_004_a_a
male_005_a_a
male_006_f_s
male_007_a_a
male_008_f_s
male_009_a_a
male_010_f_s
male_011_f_s
male_014_a_a

Structure

For each set and for each character the structure is identical, and structured as follows

TrainSet
├── female_001_a_a
│   ├── env 01
│   │   └── cam_down
│   │   	├── depth
│   │   	├── json
│   │   	├── objectId
│   │   	├── rgba
│   │   	├── rot
│   │   	└── worldp
│   ├── ...
│   └── env 03
└── ...

Frame information is organized in different folders, each containing one file per frame

depth: 8-bit png per frame
json: json file with camera and pose information
objectId: semantic segmentation
rgba: 8-bit png per frame
rot: json file with joint rotations
worldp: world position per pixel

Actions

A set of nine broad action categories have been included in the dataset

Action Name
Gaming
Gesticulating
Greeting
Lower Stretching
Patting
Reacting
Talking
Upper Stretching
Walking

where each of those categories is the collection of many different and specific actions.

E.g. Gaming includes Boxing, Shooting Gun, Playing Golf, Playing Baseball just to cite a few.

Results

Action	Martinez [1]	Ours - single branch	Ours - dual branch
Gaming	109.6	138.3	56.0
Gesticulating	105.4	108.5	50.2
Greeting	119.3	100.3	44.6
Lower Stretching	125.8	133.3	51.1
Patting	93.0	117.8	59.4
Reacting	119.7	175.6	60.8
Talking	111.1	93.5	43.9
Upper Stretching	124.5	129.0	53.9
Walking	130.5	131.9	57.7
All (mm)	122.1	130.4	58.2

[1] Julieta Martinez, Rayat Hossain, Javier Romero, and James JLittle. A simple yet effective baseline for 3d human pose estimation. In Proceedings of the International Conference on Computer Vision (ICCV), 2017

License

See the LICENSE file for details.