Human Hands as Probes for Interactive Object Understanding

CVPR 2022

Mohit Goyal, Sahil Modi, Rishabh Goyal, Saurabh Gupta <br/> University of Illinois at Urbana-Champaign<br/>

<strong>Abstract:</strong> Interactive object understanding, or what we can do to objects and how, is a long-standing goal of computer vision. In this paper, we tackle this problem through observation of human hands in in-the-wild egocentric videos. We demonstrate that observation of what human hands interact with and how can provide both the relevant data and the necessary supervision. Attending to hands readily localizes and stabilizes active objects for learning and reveals places where interactions with objects occur. Analyzing the hands shows what we can do to objects and how. We apply these basic principles on the EPIC-KITCHENS dataset and successfully learn state-sensitive features and object affordances (regions of interaction and afforded grasps) purely by observing hands in egocentric videos.

Motivation

Prerequisites

Anaconda or Miniconda is required to set up the environment for training the models. You will also need to download the EPIC-KITCHENS dataset, available here, and the hand-object bounding boxes (bboxes). Create and activate the environment with:

conda env create --file environment.yml
conda activate humanhands
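
Before launching training, it can help to confirm that the environment is actually active. The following is a minimal sketch, not part of the repository: it assumes that `conda activate` has set the `CONDA_DEFAULT_ENV` variable (standard conda behavior) and that the environment is named `humanhands`, as in the command above. The helper names `active_conda_env` and `is_expected_env` are hypothetical.

```python
import os

# Environment name taken from the `conda activate` command above.
EXPECTED_ENV = "humanhands"


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_expected_env(environ=None, expected=EXPECTED_ENV):
    """Check whether the active conda environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    if is_expected_env():
        print(f"conda environment '{EXPECTED_ENV}' is active")
    else:
        print(f"expected '{EXPECTED_ENV}', found {active_conda_env()!r}")
```

Both helpers accept an optional mapping in place of `os.environ`, which makes the check easy to exercise without touching the real shell environment.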

Learning State-Sensitive Features

Pretraining

Please follow the link above to reproduce the pretraining of the state-sensitive features model.

Evaluation on EPIC-STATES

This contains the EPIC-STATES dataset and evaluation code to benchmark the pretrained model.

Citing

If you find this work and/or dataset useful in your research, please cite:

@inproceedings{goyal2022human,
      title={Human Hands as Probes for Interactive Object Understanding},
      author={Mohit Goyal and Sahil Modi and Rishabh Goyal and Saurabh Gupta},
      year={2022},
      booktitle={Computer Vision and Pattern Recognition (CVPR)}
}
