
HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields

by Haozhe Qi, Chen Zhao, Mathieu Salzmann, Alexander Mathis, EPFL (Switzerland).

Overview

We introduce HOISDF, a hand-object pose estimation network that uses signed distance fields (SDFs) to incorporate implicit 3D shape information.
We show that HOISDF achieves state-of-the-art results on the hand-object pose estimation benchmarks DexYCB and HO3Dv2.
This repo contains the official PyTorch implementation of HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields, published at CVPR'24.

News:

August 2024: We also shared additional data for HO3Dv2: the rendered images and segmentation masks that we use to train our model, as well as preprocessed SDF samples and rendered data.
July 2024: We shared preprocessed data of the interacting objects, SDF samples, & trained model weights on Zenodo!
June 2024: We presented the poster at CVPR in Seattle.
June 2024: We presented the poster at FENS in Vienna.

Environment Installation

Clone the Current Repo

git clone git@github.com:amathislab/HOISDF.git

Set up the conda environment

conda create --name hoisdf python=3.9
conda activate hoisdf
# install the PyTorch version compatible with your CUDA version
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt

Download MANO model files (MANO_LEFT.pkl and MANO_RIGHT.pkl) from the website and place them in the tool/mano_models folder.
Download the YCB models from here and set object_models_dir in config.py to point to the dataset folder. The original mesh models are large and have a different number of vertices for each object. To enable batched inference, we additionally use simplified object models with 1000 vertices. Download the simplified models from here and set simple_object_models_dir in config.py to point to the dataset folder.
Download the processed annotation files for both datasets from here and set annotation_dir in config.py to point to the processed data folder.
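
Before moving on to the dataset setup, it can help to confirm that the files and folders named above are actually in place. The following is a minimal sketch and not part of the repository: the function name find_missing_paths and the placeholder directories are assumptions made for illustration, while the MANO file names, the tool/mano_models folder, and the config.py variable names are taken from the steps above.

```python
from pathlib import Path

# File names taken from the MANO download step above.
MANO_FILES = ("MANO_LEFT.pkl", "MANO_RIGHT.pkl")


def find_missing_paths(mano_dir, config_dirs):
    """Return descriptions of required files or folders that do not exist.

    mano_dir:    folder expected to hold the MANO model files.
    config_dirs: mapping of config.py variable name -> directory it points to.
    """
    missing = []
    for name in MANO_FILES:
        if not (Path(mano_dir) / name).is_file():
            missing.append(f"MANO file {name} not found in {mano_dir}")
    for key, directory in config_dirs.items():
        if not Path(directory).is_dir():
            missing.append(f"{key} points to a missing directory: {directory}")
    return missing


if __name__ == "__main__":
    # Placeholder paths (hypothetical): use the values you set in config.py.
    problems = find_missing_paths(
        "tool/mano_models",
        {
            "object_models_dir": "/path/to/ycb_models",
            "simple_object_models_dir": "/path/to/simple_ycb_models",
            "annotation_dir": "/path/to/annotations",
        },
    )
    print("\n".join(problems) or "All paths found.")
```

The check only tests that the paths exist; it does not validate the contents of the downloaded files.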

Dataset Setup

Depending on the dataset you intend to train or evaluate on, follow the setup instructions below.

HO3Dv2 Setup

Download the dataset from the website and set ho3d_data_dir in config.py to point to the dataset folder.
Obtain Signed-Distance-Field (SDF) files for every sample. This is only needed for the training set. You can obtain them in either of the following ways. Set fast_data_dir in config.py to point to the processed SDF folder.
- Download the processed SDF files for HO3Dv2 training set from here.
- Follow the AlignSDF repo to generate the original SDF files. Then use the tool/pre_process_sdf.py script to process the SDF data.
If you want to train HOISDF with the rendered images, download the rendered data including the images from here as well as the SDF files from here and put them into the fast_data_dir folder.

DexYCB Setup

Download the dataset from the website and set dexycb_data_dir in config.py to point to the dataset folder.
Obtain Signed-Distance-Field (SDF) files for every sample. This is needed for both the training and test sets. You can obtain them in either of the following ways. Set fast_data_dir in config.py to point to the processed SDF folder.
- Download the processed SDF files for the DexYCB test set from here and for the DexYCB full test set from here. The processed SDF files for the DexYCB training set and full training set are too large to share on Zenodo, so we encourage you to generate them yourself.
- Follow the AlignSDF repo to generate the original SDF files. Then use the tool/pre_process_sdf.py script to process the SDF data.

Evaluation

Depending on the dataset you intend to evaluate on, follow the instructions below. To test the model with our trained weights, download the weights from the links provided here and put them in the ckpts folder.

HO3Dv2

In config.py, modify the setting parameter:
- setting = 'ho3d' for evaluating the model trained only on the HO3Dv2 training set.
- setting = 'ho3d_render' for evaluating the model also trained on the rendered data.

Run the following command:

python main/test.py --ckpt_path ckpts/ho3d/snapshot_ho3d.pth.tar  # for ho3d setting
python main/test.py --ckpt_path ckpts/ho3d_render/snapshot_ho3d_render.pth.tar  # for ho3d_render setting

The results are dumped into a results.txt file in the folder containing the checkpoint.
Also dumped is a pred_mano.json file which can be submitted to the HO-3D (v2) challenge after zipping the file.
Hand pose estimation accuracy in the HO-3D challenge leaderboard: here, user: inspire
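
The zipping step for the challenge submission can be done with Python's standard zipfile module. This is a minimal sketch under stated assumptions: the helper name zip_predictions is hypothetical, and the default archive name (pred_mano.zip next to the JSON file) is a guess for illustration, so check the challenge page for the archive name it expects.

```python
import zipfile
from pathlib import Path


def zip_predictions(json_path, zip_path=None):
    """Compress a prediction file into a zip archive and return the archive path.

    If zip_path is omitted, the archive is written next to the input file
    with the same stem and a .zip suffix (an assumption, not a challenge rule).
    """
    json_path = Path(json_path)
    zip_path = Path(zip_path) if zip_path else json_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file at the archive root, without its parent folders.
        archive.write(json_path, arcname=json_path.name)
    return zip_path
```

For example, if pred_mano.json was written to ckpts/ho3d, calling zip_predictions("ckpts/ho3d/pred_mano.json") would create ckpts/ho3d/pred_mano.zip.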

DexYCB

In config.py, modify the setting parameter:
- setting = 'dexycb' for evaluating the model trained only on the DexYCB split, which includes only right-hand data.
- setting = 'dexycb_full' for evaluating the model trained on the DexYCB Full split, which includes both right- and left-hand data.

Run the following command:

python main/test.py --ckpt_path ckpts/dexycb/snapshot_dexycb.pth.tar  # for dexycb setting
python main/test.py --ckpt_path ckpts/dexycb_full/snapshot_dexycb_full.pth.tar  # for dexycb_full setting

The results are dumped into a results.txt file in the folder containing the checkpoint.
For the dexycb_full setting, additional hand mesh results are shown in the results.txt file (Table 3 in the paper).
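
The four evaluation commands shown for HO3Dv2 and DexYCB share one path pattern, ckpts/<setting>/snapshot_<setting>.pth.tar. The sketch below derives the command from the setting name so the two stay in sync. The helper names checkpoint_path and build_test_command are hypothetical and not part of the repository, and the sketch assumes the downloaded weights are laid out as in the commands above. It does not change the setting value in config.py, which still has to be edited by hand.

```python
# Setting names as listed in the evaluation instructions.
SETTINGS = ("ho3d", "ho3d_render", "dexycb", "dexycb_full")


def checkpoint_path(setting, ckpt_root="ckpts"):
    """Return the checkpoint path that matches a setting name."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting {setting!r}; expected one of {SETTINGS}")
    return f"{ckpt_root}/{setting}/snapshot_{setting}.pth.tar"


def build_test_command(setting, ckpt_root="ckpts"):
    """Return the evaluation command for a setting as an argument list."""
    return ["python", "main/test.py", "--ckpt_path", checkpoint_path(setting, ckpt_root)]
```

The returned list can be passed to subprocess.run from the repository root, or joined with spaces to print the command for copying into a shell.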

Training

Depending on the dataset you intend to train on, follow the instructions below.

Set the output_dir in config.py to point to the directory where the checkpoints will be saved.
In config.py, modify the setting parameter:
- setting = 'ho3d' for training the model on the HO3Dv2 training set.
- setting = 'ho3d_render' for training the model also on the rendered data.
- setting = 'dexycb' for training the model on the DexYCB split, which includes only right-hand data.
- setting = 'dexycb_full' for training the model on the DexYCB Full split, which includes both right- and left-hand data.
Run the following command, setting CUDA_VISIBLE_DEVICES and --gpu to the desired GPU ids. Here is an example command for training on two GPUs:
```
CUDA_VISIBLE_DEVICES=0,1 python main/train.py --run_dir_name test --gpu 0,1
```
To continue training from the last saved checkpoint, add the --continue argument to the above command.
The checkpoints are dumped after every epoch in the 'output' folder of the base directory.
Tensorboard logging is also available in the 'output' folder.
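
When launching several runs from a script, the training command and its environment can be assembled programmatically. This is a minimal sketch, not part of the repository: the function name build_train_command is hypothetical, and it mirrors the example command by passing the same GPU id list to both CUDA_VISIBLE_DEVICES and --gpu, which is an assumption about how the two are meant to relate.

```python
import os


def build_train_command(run_dir_name, gpu_ids, resume=False):
    """Return (argv, env) for launching main/train.py on the given GPUs.

    run_dir_name: value for --run_dir_name.
    gpu_ids:      iterable of GPU ids, e.g. [0, 1].
    resume:       if True, append --continue to resume from the last checkpoint.
    """
    gpus = ",".join(str(i) for i in gpu_ids)
    argv = ["python", "main/train.py", "--run_dir_name", run_dir_name, "--gpu", gpus]
    if resume:
        argv.append("--continue")
    # Copy the current environment and restrict the visible GPUs for the child process.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpus)
    return argv, env
```

From the repository root, the result can be run with subprocess.run(argv, env=env, check=True). Calling build_train_command("test", [0, 1]) reproduces the two-GPU example above.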

Reference:

If you find our code or ideas useful, please cite:

@inproceedings{qi2024hoisdf,
  title={HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields},
  author={Qi, Haozhe and Zhao, Chen and Salzmann, Mathieu and Mathis, Alexander},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10392--10402},
  year={2024}
}

Link to CVPR article: HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields

Acknowledgements

Some of the code has been reused from the Keypoint Transformer, HFL-Net, DenseMutualAttention, and AlignSDF repositories. We thank the authors for sharing their excellent work!
Our work was funded by EPFL and Microsoft Swiss Joint Research Center (H.Q., A.M.) and a Boehringer Ingelheim Fonds PhD stipend (H.Q.). We are grateful to Niels Poulsen for comments on an earlier version of this manuscript. We also sincerely thank Rong Wang, Wei Mao and Hongdong Li for sharing the hand-object rendering pipeline.