CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds (ICCV 2021, Oral)

Introduction

This is the official PyTorch implementation of our paper CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds.

For more information, please visit our project page.

<span class="center"><img src="images/bmvc_ours.gif" width="45%"> <img src="images/real_drawers_ours.gif" width="45%"></span>

<p style="text-align: left; width: 90%; margin-left: 0%"><b>Result visualization on real data.</b> Our models, trained only on synthetic data, generalize directly to real data, assuming the availability of object masks but not part masks. Left: results on a laptop trajectory from the BMVC dataset. Right: results on a real drawers trajectory we captured, in which a Kinova Jaco2 arm pulls out the top drawer.</p>

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{weng2021captra,
    title={CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds},
    author={Weng, Yijia and Wang, He and Zhou, Qiang and Qin, Yuzhe and Duan, Yueqi and Fan, Qingnan and Chen, Baoquan and Su, Hao and Guibas, Leonidas J.},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
    month={October},
    year={2021},
    pages={13209-13218}
}

Updates

Installation

Datasets

NOCS-REAL275

mkdir nocs_data && cd nocs_data

Test

Train

SAPIEN Synthetic Articulated Object Dataset

mkdir sapien_data && cd sapien_data

Test

Train

Testing & Evaluation

Download Pretrained Model Checkpoints

Testing

Evaluation

Visualization

Training

File Structure

Overall Structure

The working directory should be organized as follows.

captra
├── CAPTRA		# this repository
├── data			# datasets
│   ├── nocs_data		# NOCS-REAL275
│   └── sapien_data	# synthetic dataset of articulated objects from SAPIEN
└── runs			# folders for individual experiments
    ├── 1_bottle_coord
    ├── 1_bottle_rot
    └── ...
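The layout above can be created with a few commands; this is a minimal sketch (the repository URL is omitted here, substitute the actual clone source):

```shell
# Run from the directory that should contain captra/
mkdir -p captra/data/nocs_data captra/data/sapien_data captra/runs
# Then clone this repository into captra/CAPTRA, e.g.:
# git clone <repository-url> captra/CAPTRA
```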

Code Structure

<details> <summary><b>See here for an overview of our code. Only the most relevant folders/files are shown.</b> </summary> <p>
CAPTRA
├── configs		# configuration files
│   ├── all_config		# experiment configs
│   ├── pointnet_config 	# pointnet++ configs (radius, etc)
│   ├── obj_config		# dataset configs
│   └── config.py		# parser
├── datasets	# data preprocessing & dataset definitions
│   ├── arti_data		# articulated data
│   │   └── ...
│   ├── nocs_data		# NOCS-REAL275 data
│   │   ├── ...
│   │   └── preproc_nocs	# prepare nocs data
│   └── ...			# utility functions
├── pose_utils		# utility functions for pose/bounding box computation
├── utils.py
├── misc		# evaluation and visualization
│   ├── eval
│   └── visualize
├── scripts		# scripts for training/testing
└── network		# main part
    ├── data		# torch dataloader definitions
    ├── models		# model definition
    │   ├── pointnet_lib
    │   ├── pointnet_utils.py
    │   ├── backbones.py
    │   ├── blocks.py		# the above defines backbone/building blocks
    │   ├── loss.py
    │   ├── networks.py		# defines CoordinateNet and RotationNet
    │   └── model.py		# defines models for training/tracking
    ├── trainer.py	# training agent
    ├── parse_args.py		# parse arguments for train/test
    ├── test.py		# test
    ├── train.py	# train
    └── train_nocs_mix.py	# finetune with a mixture of synthetic/real data
</p> </details>

Experiment Folder Structure

<details> <summary><b>For each experiment, a dedicated folder in `captra/runs` is created. See here for its organization.</b> </summary> <p>
1_bottle_rot
├── log		# training/testing log files
│   └── log.txt
├── ckpt	# model checkpoints
│   ├── model_0001.pt
│   └── ...
└── results
    ├── data*		# per-trajectory raw network outputs 
    │   ├── bottle_shampoo_norm_scene_4.pkl
    │   └── ...
    ├── err.csv**	# per-frame error	
    └── err.pkl**	# per-frame error
*: generated after testing with --save
**: generated after running misc/eval/eval.py
</p> </details>
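Given the `ckpt/model_0001.pt, model_0002.pt, ...` naming shown above, a small helper can locate the most recent checkpoint in an experiment folder. This is an illustrative sketch, not part of the repository; `latest_checkpoint` is a name introduced here, and the only assumption is the `model_XXXX.pt` naming convention.

```python
import glob
import os
import re

def latest_checkpoint(exp_dir):
    """Return the path of the highest-numbered checkpoint in <exp_dir>/ckpt,
    or None if no checkpoint exists (assumes the model_XXXX.pt naming above)."""
    paths = glob.glob(os.path.join(exp_dir, "ckpt", "model_*.pt"))

    def epoch(path):
        match = re.search(r"model_(\d+)\.pt$", path)
        return int(match.group(1)) if match else -1

    return max(paths, key=epoch, default=None)
```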

Dataset Folder Structure

<details> <summary><b>See here for the organization of dataset folders.</b> </summary> <p>
nocs_data
├── nocs_model_corners		# instance bounding box information	
├── nocs_full		 	# original NOCS data, organized in frames (not object-centric)
│   ├── real_test
│   │   ├── scene_1
│   │   └── ...
│   ├── real_train
│   ├── train			# see the detailed structure below
│   └── val			
├── instance_list*		# collects each instance's occurrences in nocs_full/*/
├── render*			# per-instance segmented data for training
├── preproc**			# cached data
└── splits**			# data lists for train/test	
*: generated after data-preprocessing
**: generated during training/testing

# Specifically, nocs_data/nocs_full/train (and val) should be structured as follows:
train	
├── 00000
│   ├── 0000_coord.png 		# rendered object normalized coordinates
│   ├── 0000_depth.png 		# depth image containing synthetic foreground objects only
│   ├── 0000_mask.png  		# object mask
│   ├── 0000_meta.txt  		# meta information
│   ├── 0000_composed.png*	# depth image containing both synthetic foreground
│   │				# objects and the real background
│   ├── 0000_pose.pkl**		 # object poses computed from *_coord.png and *_depth.png
│   └── ...
├── 00001
└── ...
*: generated after copy-merging camera_full_depths with nocs_full
**: generated after data-preprocessing

sapien_data
├── urdf			# instance URDF models
├── render_seq			# testing trajectories
├── render**			# single-frame training/validation data
├── preproc_seq*		# cached testing trajectory data
├── preproc**			# cached training data
└── splits*			# data lists for train/test
*: generated during training/testing
**: generated during training
</p> </details>
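Since each frame in `nocs_data/nocs_full/train/<scene>/` is expected to carry a fixed set of files, a quick integrity check can flag incomplete frames. This is an illustrative sketch, not part of the repository; `missing_frame_files` is a name introduced here, and only the four always-present suffixes from the tree above are checked (`_composed.png` and `_pose.pkl` are generated later, so they are skipped):

```python
from pathlib import Path

# Per-frame files that should exist before any extra processing steps,
# following the naming convention shown in the tree above.
REQUIRED_SUFFIXES = ["coord.png", "depth.png", "mask.png", "meta.txt"]

def missing_frame_files(scene_dir):
    """Map each frame prefix (e.g. '0000') to the required files it is missing.

    Frames are discovered via their *_meta.txt file; complete frames are
    omitted from the returned report.
    """
    scene = Path(scene_dir)
    prefixes = sorted({p.name.split("_")[0] for p in scene.glob("*_meta.txt")})
    report = {}
    for prefix in prefixes:
        missing = [s for s in REQUIRED_SUFFIXES
                   if not (scene / f"{prefix}_{s}").exists()]
        if missing:
            report[prefix] = missing
    return report
```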

Acknowledgements

This implementation is based on the following repositories. We thank the authors for open-sourcing their great work!