
<br/> <div align="center"> <img src="resources/logo.jpg" width="600"/> </div> <br/>

Human-Art

This repository contains the implementation of the following paper:

Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes [Project Page] [Paper] [Code] [Data] [Video] <br> Xuan Ju<sup>∗12</sup>, Ailing Zeng<sup>∗1</sup>, Jianan Wang<sup>1</sup>, Qiang Xu<sup>2</sup>, Lei Zhang<sup>1</sup><br> <sup>∗</sup> Equal contribution <sup>1</sup>International Digital Economy Academy <sup>2</sup>The Chinese University of Hong Kong

Table of Contents

- General Description
- Dataset Download
- Human Pose Estimation
- Human Detection
- Citing Human-Art

General Description

<div align="center"> <img src="resources/dataset_overview.png" width="90%"> </div>

This paper proposes a large-scale dataset, Human-Art, that targets multi-scenario human-centric tasks to bridge the gap between natural and artificial scenes. It covers twenty high-quality human scenarios, spanning natural and artificial humans in both 2D representations (yellow dashed boxes) and 3D representations (blue solid boxes).

Contents of Human-Art:

- 50,000 high-quality images from twenty human scenarios, organized into the real human, 2D virtual human, and 3D virtual human categories
- human bounding box annotations
- 21 2D human keypoints per person (extending the 17 COCO keypoints)
- human self-contact keypoints
- text descriptions

Tasks that Human-Art targets:

- multi-scenario human detection (see Human Detection below)
- 2D human pose estimation (see Human Pose Estimation below)
- other human-centric tasks such as 3D human mesh recovery and human image generation

Dataset Download

Human-Art is available for download under a CC license. Fill out this form to request authorization to use Human-Art for non-commercial purposes. After you submit the form, an email containing the dataset download will be sent to you automatically. Please do not share or redistribute the data privately.

For ease of use, Human-Art is processed into the same format as MSCOCO. After downloading, please arrange the dataset in the following file structure (we also include the file structure of COCO because we use it for joint training of COCO and Human-Art):

|-- data
    |-- HumanArt
        |-- annotations 
            |-- training_coco.json
            |-- training_humanart.json
            |-- training_humanart_coco.json
            |-- training_humanart_cartoon.json
            |-- ...
            |-- validation_coco.json
            |-- validation_humanart.json
            |-- validation_humanart_coco.json
            |-- validation_humanart_cartoon.json
            |-- ...
        |-- images
            |-- 2D_virtual_human
                |-- ...
            |-- 3D_virtual_human
                |-- ...
            |-- real_human
                |-- ...
    |-- coco
        |-- annotations 
        |-- train2017 
        |-- val2017 
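Because the format matches MSCOCO, these files can be read with any COCO-style loader. A minimal sketch using pycocotools (an assumption on our part; any COCO reader works), with paths following the file tree above:

```python
from pycocotools.coco import COCO

# Annotation path follows the file tree above.
coco = COCO("data/HumanArt/annotations/training_humanart.json")

# As in COCO, every annotated instance is a person (category_id = 1).
person_ids = coco.getCatIds(catNms=["person"])
img_ids = coco.getImgIds(catIds=person_ids)

# Inspect the first image and its person annotations.
img_info = coco.loadImgs(img_ids[0])[0]
ann_ids = coco.getAnnIds(imgIds=img_info["id"], catIds=person_ids)
anns = coco.loadAnns(ann_ids)
print(img_info["file_name"], "-", len(anns), "person instance(s)")
```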

Note that we provide several different JSON settings:

- `training/validation_coco.json`: the COCO annotations, repackaged here for joint training with Human-Art
- `training/validation_humanart.json`: the annotations for all Human-Art scenarios
- `training/validation_humanart_coco.json`: the merged annotations of Human-Art and COCO
- `training/validation_humanart_[category].json` (e.g. `training_humanart_cartoon.json`): the annotations for a single Human-Art scenario

The annotation JSON files of Human-Art are structured as follows:

{
    "info": {xxx}, # some basic information of Human-Art
    "images": [
        {
            "file_name": "xxx", # the path of the image (same definition as COCO)
            "height": xxx, # the image height (same definition as COCO)
            "width": xxx, # the image width (same definition as COCO)
            "id": xxx, # the image id (same definition as COCO)
            "page_url": "xxx", # the web link of the page containing the image
            "image_url": "xxx", # the web link of the image
            "picture_name": "xxx", # the name of the image
            "author": "xxx", # the author of the image
            "description": "xxx", # the text description of the image
            "category": "xxx" # the scenario of the image (e.g. cartoon)
        },
        ...
    ],
    "annotations": [
        {
            "keypoints": [xxx], # positions of the 17 COCO keypoints (same definition as COCO)
            "keypoints_21": [xxx], # positions of the 21 Human-Art keypoints
            "self_contact": [xxx], # self-contact keypoints, stored as x1, y1, x2, y2, ...
            "num_keypoints": xxx, # number of annotated (non-invisible) keypoints in the 17-keypoint COCO format (same definition as COCO)
            "num_keypoints_21": xxx, # number of annotated (non-invisible) keypoints in the 21-keypoint Human-Art format
            "iscrowd": xxx, # whether the instance is a crowd region (same definition as COCO)
            "image_id": xxx, # the image id (same definition as COCO)
            "area": xxx, # the human area (same definition as COCO)
            "bbox": [xxx], # the human bounding box (same definition as COCO)
            "category_id": 1, # category id=1 means the person category (same definition as COCO)
            "id": xxx, # annotation id (same definition as COCO)
            "annotator": xxx # annotator id
        }
    ],
    "categories": [] # category information (same definition as COCO)
}
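The Human-Art-specific fields (`keypoints_21`, `self_contact`, `description`, etc.) are not exposed by the stock COCO API, but the file is plain JSON. A minimal sketch of reading them, assuming `keypoints_21` follows the same (x, y, v) triplet layout as COCO's `keypoints` field:

```python
import json

# Load one annotation file directly; path follows the file tree above.
with open("data/HumanArt/annotations/validation_humanart.json") as f:
    data = json.load(f)

ann = data["annotations"][0]

# Assumption: keypoints_21 uses COCO-style (x, y, v) triplets, so 21
# keypoints give a flat list of 63 numbers.
kpts_21 = ann["keypoints_21"]
triplets = [kpts_21[i:i + 3] for i in range(0, len(kpts_21), 3)]

# self_contact is documented above as x1, y1, x2, y2, ... pairs.
contact = ann["self_contact"]
contact_points = [(contact[i], contact[i + 1])
                  for i in range(0, len(contact), 2)]

print(ann["num_keypoints_21"], "labeled keypoints,",
      len(contact_points), "self-contact points")
```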

Human Pose Estimation

Human pose estimators trained on Human-Art are now supported in MMPose in this PR. The detailed usage and Model Zoo can be found in MMPose's documentation: (1) ViTPose, (2) HRNet, and (3) RTMPose.

To train and evaluate human pose estimators, please refer to MMPose. Because MMPose is updated frequently, we do not maintain a codebase in this repo. Since Human-Art is compatible with MSCOCO, you can train and evaluate any model in MMPose using its dataloader, as in the config sketch below.
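A minimal sketch of pointing MMPose's stock COCO dataloader at Human-Art (MMPose 1.x config style). The paths follow the file tree in Dataset Download; the batch sizes and empty pipelines are placeholders to be taken from whichever COCO base config you start from:

```python
# Fragment of an MMPose 1.x config: only the dataset paths differ from
# a standard COCO config, since Human-Art reuses the COCO format.
data_root = 'data/'
data_mode = 'topdown'

train_dataloader = dict(
    batch_size=64,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='CocoDataset',        # Human-Art is COCO-compatible
        data_root=data_root,
        data_mode=data_mode,
        ann_file='HumanArt/annotations/training_humanart_coco.json',
        data_prefix=dict(img=''),  # file_name already contains the path
        pipeline=[],               # reuse the train pipeline of the base config
    ))

val_dataloader = dict(
    batch_size=32,
    sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
    dataset=dict(
        type='CocoDataset',
        data_root=data_root,
        data_mode=data_mode,
        ann_file='HumanArt/annotations/validation_humanart.json',
        data_prefix=dict(img=''),
        test_mode=True,
        pipeline=[],               # reuse the val pipeline of the base config
    ))
```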

The supported models include the following (xx-coco means trained only on MSCOCO; xx-humanart-coco means trained on both Human-Art and MSCOCO; an inference sketch follows the tables):

Results of ViTPose on the Human-Art validation set with ground-truth bounding boxes

With classic decoder

| Arch | Input Size | AP | AP<sup>50</sup> | AP<sup>75</sup> | AR | AR<sup>50</sup> | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ViTPose-S-coco | 256x192 | 0.507 | 0.758 | 0.531 | 0.551 | 0.780 | ckpt | log |
| ViTPose-S-humanart-coco | 256x192 | 0.738 | 0.905 | 0.802 | 0.768 | 0.911 | ckpt | log |
| ViTPose-B-coco | 256x192 | 0.555 | 0.782 | 0.590 | 0.599 | 0.809 | ckpt | log |
| ViTPose-B-humanart-coco | 256x192 | 0.759 | 0.905 | 0.823 | 0.790 | 0.917 | ckpt | log |
| ViTPose-L-coco | 256x192 | 0.637 | 0.838 | 0.689 | 0.677 | 0.859 | ckpt | log |
| ViTPose-L-humanart-coco | 256x192 | 0.789 | 0.916 | 0.845 | 0.819 | 0.929 | ckpt | log |
| ViTPose-H-coco | 256x192 | 0.665 | 0.860 | 0.715 | 0.701 | 0.871 | ckpt | log |
| ViTPose-H-humanart-coco | 256x192 | 0.800 | 0.926 | 0.855 | 0.828 | 0.933 | ckpt | log |

Results of HRNet on the Human-Art validation set with ground-truth bounding boxes

With classic decoder

| Arch | Input Size | AP | AP<sup>50</sup> | AP<sup>75</sup> | AR | AR<sup>50</sup> | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| pose_hrnet_w32-coco | 256x192 | 0.533 | 0.771 | 0.562 | 0.574 | 0.792 | ckpt | log |
| pose_hrnet_w32-humanart-coco | 256x192 | 0.754 | 0.906 | 0.812 | 0.783 | 0.916 | ckpt | log |
| pose_hrnet_w48-coco | 256x192 | 0.557 | 0.782 | 0.593 | 0.595 | 0.804 | ckpt | log |
| pose_hrnet_w48-humanart-coco | 256x192 | 0.769 | 0.906 | 0.825 | 0.796 | 0.919 | ckpt | log |

Results of RTMPose on the Human-Art validation set with ground-truth bounding boxes

| Arch | Input Size | AP | AP<sup>50</sup> | AP<sup>75</sup> | AR | AR<sup>50</sup> | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| rtmpose-t-coco | 256x192 | 0.444 | 0.725 | 0.453 | 0.488 | 0.750 | ckpt | log |
| rtmpose-t-humanart-coco | 256x192 | 0.655 | 0.872 | 0.720 | 0.693 | 0.890 | ckpt | log |
| rtmpose-s-coco | 256x192 | 0.480 | 0.739 | 0.498 | 0.521 | 0.763 | ckpt | log |
| rtmpose-s-humanart-coco | 256x192 | 0.698 | 0.893 | 0.768 | 0.732 | 0.903 | ckpt | log |
| rtmpose-m-coco | 256x192 | 0.532 | 0.765 | 0.563 | 0.571 | 0.789 | ckpt | log |
| rtmpose-m-humanart-coco | 256x192 | 0.728 | 0.895 | 0.791 | 0.759 | 0.906 | ckpt | log |
| rtmpose-l-coco | 256x192 | 0.564 | 0.789 | 0.602 | 0.599 | 0.808 | ckpt | log |
| rtmpose-l-humanart-coco | 256x192 | 0.753 | 0.905 | 0.812 | 0.783 | 0.915 | ckpt | log |
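As a usage sketch, any of the checkpoints above can be run through MMPose's high-level inferencer; the config and checkpoint paths below are placeholders for the real ones linked in the Model Zoo:

```python
from mmpose.apis import MMPoseInferencer

# Placeholder paths: substitute a config/ckpt pair from the tables above.
inferencer = MMPoseInferencer(
    pose2d='path/to/rtmpose-l_humanart-coco_config.py',
    pose2d_weights='path/to/rtmpose-l_humanart-coco.pth',
)

# Human-Art-trained models handle artificial scenes (e.g. cartoons)
# as well as natural photos.
result = next(inferencer('demo_cartoon.jpg', show=False))
print(result['predictions'])
```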

Human Detection

Human detectors trained on Human-Art are now supported in MMPose in this PR. The detailed usage and Model Zoo can be found here.

To train and evaluate human detectors, please refer to MMDetection, an open-source object detection toolbox based on PyTorch that supports diverse detection frameworks with high efficiency and accuracy. Because MMDetection is updated frequently, we do not maintain a codebase in this repo. Since Human-Art is compatible with MSCOCO, you can train and evaluate any model in MMDetection using its dataloader.

The supported models include the following (a usage sketch follows the table):

| Detection Config | Model AP | Download |
| :--- | :---: | :---: |
| RTMDet-tiny | 46.6 | Det Model |
| RTMDet-s | 50.6 | Det Model |
| YOLOX-nano | 38.9 | Det Model |
| YOLOX-tiny | 47.7 | Det Model |
| YOLOX-s | 54.6 | Det Model |
| YOLOX-m | 59.1 | Det Model |
| YOLOX-l | 60.2 | Det Model |
| YOLOX-x | 61.3 | Det Model |
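A hedged sketch of running one of these detectors with MMDetection's standard inference API; the config and checkpoint paths are placeholders for the "Det Model" downloads above:

```python
from mmdet.apis import inference_detector, init_detector

# Placeholder paths: substitute a config/ckpt pair from the table above.
model = init_detector(
    'path/to/yolox_l_humanart_config.py',
    'path/to/yolox_l_humanart.pth',
    device='cuda:0',
)

result = inference_detector(model, 'demo_cartoon.jpg')
# MMDetection 3.x returns a DetDataSample; detected human boxes, scores,
# and labels live in result.pred_instances.
print(result.pred_instances.bboxes)
```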

Citing Human-Art

If you find this repository useful for your work, please consider citing it as follows:

@inproceedings{ju2023human,
    title={Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes},
    author={Ju, Xuan and Zeng, Ailing and Wang, Jianan and Xu, Qiang and Zhang, Lei},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023},
}