<b>GTA-IM Dataset</b> [Website]

<div align=center> <img src="assets/sample1.gif" width=32%> <img src="assets/sample2.gif" width=32%> <img src="assets/sample3.gif" width=32%> </div> <br>

Long-term Human Motion Prediction with Scene Context, ECCV 2020 (Oral) PDF <br> Zhe Cao, Hang Gao, Karttikeya Mangalam, Qi-Zhi Cai, Minh Vo, Jitendra Malik. <br>

This repository maintains our GTA Indoor Motion dataset (GTA-IM), which emphasizes human-scene interactions in indoor environments. We collect HD RGB-D image sequences of 3D human motion from a realistic game engine. The dataset has clean 3D human pose and camera pose annotations, and large diversity in human appearances, indoor environments, camera views, and human activities.

Table of contents<br>

  1. A demo for playing with our dataset.<br>
  2. Instructions to request our full dataset.<br>
  3. Documentation on our dataset structure and contents.<br>

Demo

(0) Getting Started

Clone this repository, and create local environment: conda env create -f environment.yml.

For your convenience, we provide a fragment of our data in the demo directory. In this section, you will be able to play with different parts of our data using the maintained tool scripts.

(1) 3D skeleton & point cloud

$ python vis_skeleton_pcd.py -h
usage: vis_skeleton_pcd.py [-h] [-pa PATH] [-f FRAME] [-fw FUSION_WINDOW]

# now visualize demo 3d skeleton and point cloud!
$ python vis_skeleton_pcd.py -pa demo -f 2720 -fw 80

You should be able to see an Open3D viewer with our 3D skeleton and point cloud data; press 'h' in the viewer to see how to control the viewpoint: <img src="assets/vis_skeleton_pcd.gif" width=100%>

Note that we use open3d == 0.7.0; the visualization code is not compatible with newer versions of open3d.
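For intuition, the point cloud shown in the viewer comes from back-projecting depth pixels through the camera intrinsics. Below is a minimal numpy sketch of that back-projection; the focal lengths and principal point are placeholder values, not the dataset's actual camera parameters:

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map (meters) to an N x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    x = (u.reshape(-1) - cx) * z / fx
    y = (v.reshape(-1) - cy) * z / fy
    pts = np.stack([x, y, z], axis=1)
    return pts[z > 0]  # drop invalid (zero-depth) pixels

# toy 2x2 depth map with hypothetical intrinsics
depth = np.array([[1.0, 2.0],
                  [0.0, 4.0]])
pts = depth_to_pointcloud(depth, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
print(pts.shape)  # one row per valid pixel: (3, 3)
```

The visualization script fuses such per-frame clouds over a window of frames (the `-fw FUSION_WINDOW` flag) using the camera pose annotations.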

(2) 2D skeleton & depth map

$ python vis_2d_pose_depth.py -h
usage: vis_2d_pose_depth.py [-h] [-pa PATH]

# now visualize 2d skeleton and depth map!
$ python vis_2d_pose_depth.py -pa demo

You should be able to find a created demo/vis/ directory with *_vis.jpg files that render into a movie strip like this: <img src="assets/vis_2d_pose_depth.gif" width=80%>
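The 2D skeleton overlays are simply the 3D joints projected onto the image plane with a pinhole model. A minimal sketch of that projection follows; the intrinsics are placeholders for illustration, not the dataset's actual values:

```python
import numpy as np

def project_joints(joints_3d, fx, fy, cx, cy):
    """Project N x 3 camera-space joints to N x 2 pixel coordinates."""
    x, y, z = joints_3d[:, 0], joints_3d[:, 1], joints_3d[:, 2]
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1)

joints = np.array([[0.0, 0.0, 2.0],    # joint on the optical axis
                   [0.5, -0.2, 2.0]])  # joint offset to the upper right
uv = project_joints(joints, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(uv)  # [[320. 240.] [445. 190.]]
```

A joint on the optical axis projects to the principal point; off-axis joints shift by focal length times the ratio of lateral offset to depth.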

(3) RGB video

$ python vis_video.py -h
usage: vis_video.py [-h] [-pa PATH] [-s SCALE] [-fr FRAME_RATE]

# now visualize demo video!
$ python vis_video.py -pa demo -fr 15

You should be able to find a created demo/vis/ directory with a video.mp4:

Requesting Dataset

To obtain the Dataset, please send an email to Zhe Cao (with the title "GTA-IM Dataset Download") stating:

We will promptly reply with the download link.

Dataset Contents

After you download the data from our link and unzip it, each sequence folder will contain the following files:


Joint Types

The human skeleton connectivity and joint index names:

LIMBS = [
    (0, 1),  # head_center -> neck
    (1, 2),  # neck -> right_clavicle
    (2, 3),  # right_clavicle -> right_shoulder
    (3, 4),  # right_shoulder -> right_elbow
    (4, 5),  # right_elbow -> right_wrist
    (1, 6),  # neck -> left_clavicle
    (6, 7),  # left_clavicle -> left_shoulder
    (7, 8),  # left_shoulder -> left_elbow
    (8, 9),  # left_elbow -> left_wrist
    (1, 10),  # neck -> spine0
    (10, 11),  # spine0 -> spine1
    (11, 12),  # spine1 -> spine2
    (12, 13),  # spine2 -> spine3
    (13, 14),  # spine3 -> spine4
    (14, 15),  # spine4 -> right_hip
    (15, 16),  # right_hip -> right_knee
    (16, 17),  # right_knee -> right_ankle
    (14, 18),  # spine4 -> left_hip
    (18, 19),  # left_hip -> left_knee
    (19, 20)  # left_knee -> left_ankle
]
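As an illustration of the indexing, the connectivity list above can be used to compute per-bone lengths from a 21 x 3 joint array. The pose below is synthetic, purely to demonstrate the lookup:

```python
import numpy as np

# same limb connectivity as listed above
LIMBS = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5),
    (1, 6), (6, 7), (7, 8), (8, 9),
    (1, 10), (10, 11), (11, 12), (12, 13), (13, 14),
    (14, 15), (15, 16), (16, 17),
    (14, 18), (18, 19), (19, 20),
]

def bone_lengths(joints_3d):
    """Euclidean length of each limb for a 21 x 3 joint array."""
    a = joints_3d[[i for i, _ in LIMBS]]
    b = joints_3d[[j for _, j in LIMBS]]
    return np.linalg.norm(a - b, axis=1)

# synthetic pose: joint k sits at y = k, so limb (i, j) has length |i - j|
pose = np.zeros((21, 3))
pose[:, 1] = np.arange(21)
print(bone_lengths(pose)[:3])  # [1. 1. 1.]
```

Consecutive joint indices (head_center -> neck, etc.) give length 1 under this synthetic pose, while limbs that skip indices, such as neck -> spine0, come out longer.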

Important Note

This dataset is for non-commercial research purposes only. Due to public interest, I decided to reimplement the data generation pipeline from scratch to collect the GTA-IM dataset again. I do not use Facebook resources to reproduce the data.

Citation

We believe in open research and we will be happy if you find this data useful. If you use it, please consider citing our work.

@incollection{caoHMP2020,
  author = {Zhe Cao and
    Hang Gao and
    Karttikeya Mangalam and
    Qizhi Cai and
    Minh Vo and
    Jitendra Malik},
  title = {Long-term human motion prediction with scene context},
  booktitle = {ECCV},
  year = {2020},
}

Acknowledgement

Our data collection pipeline was built upon this plugin and this tool.

LICENSE

Our project is released under CC-BY-NC 4.0.