Update: PyTorch-Based Hand Detector

If you want to use a PyTorch-based hand detector, check the ContactHands repository for our NeurIPS'20 paper on hand detection and contact estimation. Using that code, we can not only detect hands but also obtain their contact information.

Contextual Attention for Hand Detection in the Wild

This repository contains the code and datasets related to the following paper:

Contextual Attention for Hand Detection in the Wild, International Conference on Computer Vision (ICCV), 2019.

Please also see the project website.

Contents

This repository contains the code for training and evaluating Hand-CNN, along with the TV-Hand and COCO-Hand datasets released as a part of this project.

Installation

Install the required dependencies listed in requirements.txt.
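
For example, the dependencies can be installed with pip:

pip install -r requirements.txt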

Data annotation format

Create annotations in a .txt file in the following format:

/path/to/image, x_min, x_max, y_min, y_max, x1, y1, x2, y2, x3, y3, x4, y4, hand

where (x_min, x_max, y_min, y_max) are the coordinates of the axis-aligned bounding box around the hand, (x1, y1), ..., (x4, y4) are the four vertices of the oriented box around the hand, and hand is the class label.
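
A minimal sketch for parsing one annotation line in this format (the function name and field handling are illustrative, not part of the released code):

def parse_annotation(line):
    # Split one comma-separated annotation line into its fields.
    parts = [p.strip() for p in line.strip().split(",")]
    image_path = parts[0]
    # Axis-aligned bounding box: x_min, x_max, y_min, y_max
    x_min, x_max, y_min, y_max = map(float, parts[1:5])
    # Four vertices (x1, y1), ..., (x4, y4) of the oriented box
    quad = list(map(float, parts[5:13]))
    label = parts[13]  # the class label, "hand"
    return image_path, (x_min, x_max, y_min, y_max), quad, label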

Folder structure

The code is organized in the following structure:

datasets/
  annotations/
    train_annotations.txt
    val_annotations.txt
    test_annotations.txt

mrcnn/
  config.py
  model.py
  parallel_model.py
  utils.py
  visualize.py
  contextual_attention.py

samples/
  hand/
    results/
    hand.py
    load_weights.py

model/
  mask_rcnn_coco.h5
  trained_weights.h5

requirements.txt
setup.cfg
setup.py

Models

Download the pretrained models from the models link and place them in the ./model/ directory.

Training

Use the following command to train Hand-CNN:

python -W ignore samples/hand/hand.py --weight coco --command train

The training set to use can be specified in the train function of ./samples/hand/hand.py.
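
For example, to continue training from previously trained weights instead of the COCO initialization (this assumes the --weight flag also accepts a path to a weights file during training, as it does during evaluation):

python -W ignore samples/hand/hand.py --weight path/to/weights --command train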

Detection

Use the following command to run detection on a folder of images and visualize the results:

python -W ignore detect.py --image_dir /path/to/folder_containing_images/

The outputs will be stored in ./outputs/.

Evaluation

Use the following command to evaluate a trained Hand-CNN:

python -W ignore samples/hand/hand.py --weight path/to/weights --command test --testset oxford

Datasets

As a part of this project, we release two datasets: TV-Hand and COCO-Hand. The TV-Hand dataset contains hand annotations for 9.5K image frames extracted from the ActionThread dataset. The COCO-Hand dataset contains annotations for 25K images from Microsoft's COCO dataset.
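
As an illustration, a minimal matplotlib sketch (the function name and inputs are hypothetical) for overlaying annotated quadrilaterals, parsed in the format described above, on an image:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from matplotlib.patches import Polygon

def show_hand_annotations(image_path, quads):
    # quads: one [x1, y1, x2, y2, x3, y3, x4, y4] list per annotated hand
    img = mpimg.imread(image_path)
    fig, ax = plt.subplots()
    ax.imshow(img)
    for q in quads:
        # Group the flat coordinate list into (x, y) vertex pairs.
        pts = [(q[i], q[i + 1]) for i in range(0, 8, 2)]
        ax.add_patch(Polygon(pts, closed=True, fill=False, edgecolor="red", linewidth=2))
    ax.axis("off")
    plt.show()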

Some images and annotations from the TV-Hand data

<p float="center"> <img src="./sample_data/1.png" height="180" width="280" /> <img src="./sample_data/2.png" height="180" width="280" /> <img src="./sample_data/3.png" height="180" width="280" /> <img src="./sample_data/4.png" height="180" width="280" /> <img src="./sample_data/5.png" height="180" width="280" /> <img src="./sample_data/6.png" height="180" width="280" /> </p>

Some images and annotations from the COCO-Hand data

<p float="left"> <img src="./sample_data/7.png" height="180" width="280" /> <img src="./sample_data/8.png" height="180" width="280" /> <img src="./sample_data/9.png" height="180" width="280" /> <img src="./sample_data/10.png" height="180" width="280" /> <img src="./sample_data/11.png" height="180" width="280" /> <img src="./sample_data/12.png" height="180" width="280" /> </p>

Citation

If you use the code or datasets in this repository, please cite the following paper:

@inproceedings{Hand-CNN,
  title={Contextual Attention for Hand Detection in the Wild},
  author={Supreeth Narasimhaswamy and Zhengwei Wei and Yang Wang and Justin Zhang and Minh Hoai},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2019},
  url={https://arxiv.org/pdf/1904.04882.pdf} 
}