Awesome

PVN3D

This is the official source code for PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation, CVPR 2020. (PDF, Video_bilibili, Video_youtube).

News

We've released the source code of our CVPR2021 oral work, FFB6D in this repo, which is faster and more accurate for 6D pose estimation! FFB6D introduce a general framework for representation learning from a single RGBD image, and we cascade prediction headers of PVN3D for 6D pose estimation.
We optimized and applied PVN3D to a robotic manipulation contest OCRTOC (IROS 2020: Open Cloud Robot Table Organization Challenge) and got 2nd place! The model was trained on synthesis data generated from rendering engine and only a few frames (about 100 frames) of real data but generalize to real scenes during inference, revealing its capability of cross-domain generalization.

Installation

The following setting is for pytorch 1.0.1. For pytorch 1.5 & cuda 10, switch to branch pytorch-1.5.
Install CUDA9.0
Set up python environment from requirement.txt:
```
pip3 install -r requirement.txt 
```
Install tkinter through sudo apt install python3-tk
Install python-pcl.
Install PointNet++ (refer from Pointnet2_PyTorch):
```
python3 setup.py build_ext
```

Datasets

LineMOD: Download the preprocessed LineMOD dataset from here (refer from DenseFusion). Unzip it and link the unzipped Linemod_preprocessed/ to pvn3d/datasets/linemod/Linemod_preprocessed:
```
ln -s path_to_unzipped_Linemod_preprocessed pvn3d/dataset/linemod/
```
YCB-Video: Download the YCB-Video Dataset from PoseCNN. Unzip it and link the unzippedYCB_Video_Dataset to pvn3d/datasets/ycb/YCB_Video_Dataset:
```
ln -s path_to_unzipped_YCB_Video_Dataset pvn3d/datasets/ycb/
```

Training and evaluating

Training on the LineMOD Dataset

First, generate synthesis data for each object using scripts from raster triangle.
Train the model for the target object. Take object ape for example:
```
cd pvn3d
python3 -m train.train_linemod_pvn3d --cls ape
```
The trained checkpoints are stored in train_log/linemod/checkpoints/{cls}/, train_log/linemod/checkpoints/ape/ in this example.

Evaluating on the LineMOD Dataset

Start evaluation by:

# commands in eval_linemod.sh
cls='ape'
tst_mdl=train_log/linemod/checkpoints/${cls}/${cls}_pvn3d_best.pth.tar
python3 -m train.train_linemod_pvn3d -checkpoint $tst_mdl -eval_net --test --cls $cls

You can evaluate different checkpoint by revising tst_mdl to the path of your target model.

We provide our pre-trained models for each object at onedrive link, baiduyun link (access code(提取码): 8kmp). Download them and move them to their according folders. For example, move the ape_pvn3d_best.pth.tar to train_log/linemod/checkpoints/ape/. Then revise tst_mdl=train_log/linemod/checkpoints/ape/ape_pvn3d_best.path.tar for testing.

Demo/visualizaion on the LineMOD Dataset

After training your models or downloading the pre-trained models, you can start the demo by:
```
# commands in demo_linemod.sh
cls='ape'
tst_mdl=train_log/linemod/checkpoints/${cls}/${cls}_pvn3d_best.pth.tar
python3 -m demo -dataset linemod -checkpoint $tst_mdl -cls $cls
```
The visualization results will be stored in train_log/linemod/eval_results/{cls}/pose_vis

Training on the YCB-Video Dataset

Preprocess the validation set to speed up training:

cd pvn3d
python3 -m datasets.ycb.preprocess_testset

Start training on the YCB-Video Dataset by:
```
python3 -m train.train_ycb_pvn3d
```
The trained model checkpoints are stored in train_log/ycb/checkpoints/

Evaluating on the YCB-Video Dataset

Start evaluating by:

# commands in eval_ycb.sh
tst_mdl=train_log/ycb/checkpoints/pvn3d_best.pth.tar
python3 -m train.train_ycb_pvn3d -checkpoint $tst_mdl -eval_net --test

You can evaluate different checkpoint by revising the tst_mdl to the path of your target model.

We provide our pre-trained models at onedrive link, baiduyun link (access code(提取码): h2i5). Download the ycb pre-trained model, move it to train_log/ycb/checkpoints/ and modify tst_mdl for testing.

Demo/visualizaion on the YCB-Video Dataset

After training your model or downloading the pre-trained model, you can start the demo by:
```
# commands in demo_ycb.sh
tst_mdl=train_log/ycb/checkpoints/pvn3d_best.pth.tar
python3 -m demo -checkpoint $tst_mdl -dataset ycb
```
The visualization results will be stored in train_log/ycb/eval_results/pose_vis

Results

Evaluation result on the LineMOD dataset:
Evaluation result on the YCB-Video dataset:
Visualization of some predicted poses on YCB-Video dataset:
Joint training for distinguishing objects with similar appearance but different in size:

Adaptation to New Dataset

Compile the FPS scripts (refer from clean-pvnet):

  cd lib/utils/dataset_tools/fps/
  python3 setup.py build_ext --inplace

Generate information of objects, eg. radius, 3D keypoints, etc. in your datasets with the gen_obj_info.py script:

  cd ../
  python3 gen_obj_info.py --help

Modify info of your new dataset in PVN3D/pvn3d/common.py
Write your dataset preprocess script following PVN3D/pvn3d/datasets/ycb/ycb_dataset.py (for multi objects of a scene) or PVN3D/pvn3d/datasets/linemod/linemod_dataset.py (for single object of a scene). Note that you should modify or call the function that get your model info, such as 3D keypoints, center points, and radius properly.
(Important!) Visualize and check if you process the data properly, eg, the projected keypoints and center point, the semantic label of each point, etc. For example, you can visualize the projected center point (blue point) and selected keypoints (red points) as follow by running python3 -m datasets.linemod.linemod_dataset.
For inference, make sure that you load the 3D keypoints, center point, and radius of your objects in the object coordinate system properly in PVN3D/pvn3d/lib/utils/pvn3d_eval_utils.py.

Citations

Please cite PVN3D if you use this repository in your publications:

@InProceedings{He_2020_CVPR,
author = {He, Yisheng and Sun, Wei and Huang, Haibin and Liu, Jianran and Fan, Haoqiang and Sun, Jian},
title = {PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}