Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection?
Overview
We investigate the effectiveness of synthetic data in enhancing egocentric hand-object interaction (HOI) detection. Through extensive experiments and comparative analyses on three egocentric datasets, VISOR, EgoHOS, and ENIGMA-51, our findings reveal how to exploit synthetic data for the HOI detection task when real labeled data are scarce or unavailable. Specifically, by leveraging only 10% of the real labeled data, we achieve the following improvements in Overall AP over baselines trained exclusively on real data: +5.67% on EPIC-KITCHENS VISOR, +8.24% on EgoHOS, and +11.69% on ENIGMA-51. Our analysis is supported by a novel data generation pipeline and the newly introduced HOI-Synth benchmark, which augments existing datasets with synthetic images of hand-object interactions, automatically labeled with hand-object contact states, bounding boxes, and pixel-wise segmentation masks.
Updates
- 01/07/2024: Accepted at European Conference on Computer Vision (ECCV) 2024! <br>
Citation
If you use our HOI-Synth benchmark, data generation pipeline, or this code for your research, please cite our paper:
@inproceedings{leonardi2025synthetic,
  title={Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection?},
  author={Leonardi, Rosario and Furnari, Antonino and Ragusa, Francesco and Farinella, Giovanni Maria},
  booktitle={European Conference on Computer Vision},
  pages={36--54},
  year={2025},
  organization={Springer}
}
Table of Contents
- HOI-Synth benchmark
- Download
- Data Generation Pipeline
- Baselines
- License
- Acknowledgements
- References
HOI-Synth benchmark
The HOI-Synth benchmark extends three egocentric datasets designed to study hand-object interaction detection, EPIC-KITCHENS VISOR [1], EgoHOS [2], and ENIGMA-51 [3], with automatically labeled synthetic data obtained through the proposed HOI generation pipeline.
Download
Synthetic-Data
You can download the synthetic data at the following links:
The format follows the HOS standard introduced in the VISOR-HOS GitHub repository; please refer to that repository for more information.
After downloading, place the images and annotations in their respective folders.
You will find several annotation files available:
- `train.json`: Contains the complete train annotations.
- `val.json`: Contains the complete val annotations.
- `train_x.json`: Contains annotations for specific percentages of the data. For example, `train_10.json` contains annotations for 10% of the data.
Additionally, you will find combined annotations (e.g., Synthetic + VISOR). In such cases, move the images from the corresponding real dataset into the appropriate "images" folder.
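To verify a download, the annotation files can be inspected directly. Below is a minimal Python sketch, assuming the HOS annotations follow the COCO-style JSON layout used by VISOR-HOS (top-level `images`, `annotations`, and `categories` keys) and that the file sits in a hypothetical `annotations/` folder; adjust paths and keys to the actual schema documented in the VISOR-HOS repository:

```python
import json
from collections import Counter

# Hypothetical path: adjust to wherever you placed the annotation files.
with open("annotations/train_10.json") as f:
    data = json.load(f)

# Assumes the COCO-style layout used by VISOR-HOS:
# "images", "annotations", and "categories" at the top level.
print(f"{len(data['images'])} images, {len(data['annotations'])} annotations")

# Count annotations per category (e.g., hand vs. object classes).
id_to_name = {c["id"]: c["name"] for c in data["categories"]}
counts = Counter(id_to_name[a["category_id"]] for a in data["annotations"])
for name, n in counts.most_common():
    print(f"  {name}: {n}")
```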
For the ENIGMA-51 synthetic images (`enigma-51_synth`), there are three folders containing the different synthetic data used in the experiments (check the paper for more information):
- In-domain
- Out-domain
- Out-domain with FOV of the target dataset
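For reference, here is a minimal sketch of how the three variants could be selected when assembling an experiment. The folder names used here (`in_domain`, `out_domain`, `out_domain_fov`) are hypothetical placeholders; match them to the actual directory names inside `enigma-51_synth`:

```python
from pathlib import Path

# Hypothetical folder names: replace with the actual directories
# found inside the downloaded enigma-51_synth archive.
VARIANTS = {
    "in-domain": "in_domain",
    "out-domain": "out_domain",
    "out-domain-fov": "out_domain_fov",
}

def variant_dir(root: str, variant: str) -> Path:
    """Resolve the data folder for a given synthetic-data variant."""
    return Path(root) / "enigma-51_synth" / VARIANTS[variant]

print(variant_dir("data", "in-domain"))  # data/enigma-51_synth/in_domain
```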
EPIC-KITCHENS VISOR
To download the data and the corresponding annotations for EPIC-KITCHENS VISOR, follow this link: EPIC-KITCHENS VISOR Data Preparation.
EgoHOS
To download the images of EgoHOS, follow this link: EgoHOS.
We have converted the annotations into the HOS format, which can be downloaded at the following link: EgoHOS Annotations.
ENIGMA-51
You can download the ENIGMA-51 data at the following links:
For more information, visit the official ENIGMA-51 website.
Data Generation Pipeline
Coming soon!
Baselines
Coming soon!
License
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
Acknowledgements
This research has been supported by the project Future Artificial Intelligence Research (FAIR) – PNRR MUR Cod. PE0000013 - CUP: E63C22001940006 <br> This research has been partially supported by the project EXTRA-EYE - PRIN 2022 - CUP E53D23008280006 - Funded by the European Union - Next Generation EU
References
- [1] Darkhalil, A., Shan, D., Zhu, B., Ma, J., Kar, A., Higgins, R., Fidler, S., Fouhey, D., Damen, D.: Epic-kitchens visor benchmark: Video segmentations and object relations. In: NeurIPS. pp. 13745–13758 (2022)
- [2] Zhang, L., Zhou, S., Stent, S., Shi, J.: Fine-grained egocentric hand-object segmentation: Dataset, model, and applications. In: ECCV. pp. 127–145 (2022)
- [3] Ragusa, F., Leonardi, R., Mazzamuto, M., Bonanno, C., Scavo, R., Furnari, A., Farinella, G.M.: ENIGMA-51: Towards a fine-grained understanding of human behavior in industrial scenarios. In: WACV. pp. 4549–4559 (2024)
- [4] Wang, R., Zhang, J., Chen, J., Xu, Y., Li, P., Liu, T., Wang, H.: Dexgraspnet: A large-scale robotic dexterous grasp dataset for general objects based on simulation. In: CVPR. pp. 11359–11366 (2023)