ZebraPose
The implementation of the paper 'ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation' (CVPR2022). ArXiv
System Requirement
Tested Environment
- Ubuntu 18.04
- CUDA 11.1
- Python 3.6
Main Dependencies:
- bop_toolkit
- Pytorch 1.10
- torchvision 0.11.0
- opencv-python
- Progressive-X
Clone the repository with git clone --recurse-submodules so that bop_toolkit is also cloned.
Training with a dataset in BOP benchmark
Training data preparation
- Download the dataset from BOP benchmark.
- Download the required ground truth folders of zebrapose from owncloud. The folders are models_GT_color, XX_GT (e.g. train_real_GT and test_GT) and models (models is optional, only needed if you want to generate GT from scratch).
- The expected data structure:
.
└── BOP ROOT PATH/
    ├── lmo
    ├── ycbv/
    │   ├── models
    │   ├── models_eval
    │   ├── models_fine
    │   ├── test
    │   ├── train_pbr
    │   ├── train_real
    │   ├── ... #(other files from BOP page)
    │   ├── models_GT_color #(from last step)
    │   ├── train_pbr_GT #(from last step)
    │   ├── train_real_GT #(from last step)
    │   ├── test_GT #(from last step)
    │   ├── train_pbr_GT_v2 #(from last step, for symmetry aware training)
    │   ├── train_real_GT_v2 #(from last step, for symmetry aware training)
    │   └── test_GT_v2 #(from last step, for symmetry aware training)
    └── tless
- Download the 3 pretrained resnet models and save them under zebrapose/pretrained_backbone/resnet. Download the pretrained efficientnet from "https://download.pytorch.org/models/efficientnet_b4_rwightman-7eb33cd5.pth" and save it under zebrapose/pretrained_backbone/efficientnet.
- (Optional) Instead of downloading the ground truth, you can also generate it from scratch; details are in Generate_GT.md.
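Before training, it can help to confirm that the ground truth folders ended up in the right place. The following is a minimal sketch, not part of the repository: the function name `missing_folders` and the placeholder root path are assumptions, and the folder list is copied from the ycbv example in the expected data structure.

```python
from pathlib import Path

# Folder names taken from the ycbv example in the expected data structure;
# adjust the list to the dataset and training mode you actually use.
GT_FOLDERS = ["models_GT_color", "train_pbr_GT", "train_real_GT", "test_GT"]


def missing_folders(bop_root, dataset, folders=GT_FOLDERS):
    """Return the folders that are not present under <bop_root>/<dataset>."""
    dataset_dir = Path(bop_root) / dataset
    return [name for name in folders if not (dataset_dir / name).is_dir()]


if __name__ == "__main__":
    # "/path/to/BOP_ROOT" is a placeholder, not a path used by the repository.
    missing = missing_folders("/path/to/BOP_ROOT", "ycbv")
    print("missing folders:", missing or "none")
```

For symmetry-aware training, the `_v2` folders from the data structure could be passed as the `folders` argument instead.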
Training
Adjust the paths in the config files, and train the network with train.py, e.g.
python train.py --cfg config/config_BOP/lmo/exp_lmo_BOP.txt --obj_name ape
The script saves the last 3 checkpoints and the best checkpoint, as well as a tensorboard log. To enable symmetry-aware training, add --sym_aware_training True.
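Since train.py takes a single --obj_name, training several objects means repeating the call. A minimal sketch of assembling those calls follows; the helper `build_train_cmd` is hypothetical, and only the flags shown in the example above are used. The object name "can" is an assumption, only "ape" appears in this README.

```python
def build_train_cmd(cfg, obj_name, sym_aware=False):
    """Assemble the train.py call shown above as an argument list."""
    cmd = ["python", "train.py", "--cfg", cfg, "--obj_name", obj_name]
    if sym_aware:
        # Flag and value as given in the text above.
        cmd += ["--sym_aware_training", "True"]
    return cmd


if __name__ == "__main__":
    cfg = "config/config_BOP/lmo/exp_lmo_BOP.txt"
    # "ape" comes from the example above; "can" is an assumed second object name.
    for obj in ["ape", "can"]:
        print(" ".join(build_train_cmd(cfg, obj)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to actually start the training run.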
Test with trained model
For most datasets, a specific object occurs only once in a test image.
python test.py --cfg config/config_BOP/lmo/exp_lmo_BOP.txt --obj_name ape --ckpt_file path/to/the/best/checkpoint --ignore_bit 0 --eval_output_path path/to/save/the/evaluation/report
To use ICP for refinement, use --use_icp True
For datasets like tless, the number of instances of a specific object is unknown in the test stage.
python test_vivo.py --cfg config/config_BOP/tless/exp_tless_BOP.txt --ckpt_file path/to/the/best/checkpoint --ignore_bit 0 --obj_name obj01 --eval_output_path path/to/save/the/evaluation/report
To use ICP for refinement, use --use_icp True
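The choice between the two test scripts depends only on the dataset. A small sketch of that rule follows; the names `select_test_script` and `config_path` are hypothetical, and the config path pattern is inferred from the two examples above, so it is an assumption for any other dataset.

```python
# Datasets where an object may appear several times per image.
# Only tless is named in the text above; extend the set as needed.
MULTI_INSTANCE_DATASETS = {"tless"}


def select_test_script(dataset):
    """Pick test_vivo.py for multi-instance datasets, test.py otherwise."""
    return "test_vivo.py" if dataset in MULTI_INSTANCE_DATASETS else "test.py"


def config_path(dataset):
    """Config path pattern inferred from the two examples above (an assumption)."""
    return "config/config_BOP/{0}/exp_{0}_BOP.txt".format(dataset)


if __name__ == "__main__":
    for dataset, obj in [("lmo", "ape"), ("tless", "obj01")]:
        print("python", select_test_script(dataset),
              "--cfg", config_path(dataset), "--obj_name", obj)
```

The printed commands are incomplete on purpose: --ckpt_file, --ignore_bit and --eval_output_path still have to be appended as in the examples above.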
Download our trained model from this link. Progressive-X does not allow setting a random seed in its Python API, so the ADD results can vary by +/- 0.5%.
Evaluate for BOP challenge
Merge the .csv files generated in the last step using tools_for_BOP/merge_csv.py, e.g.
python merge_csv.py --input_dir /dir/to/pose_result_bop/lmo --output_fn zebrapose_lmo-test.csv
Then evaluate the merged file according to bop_toolkit.
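To illustrate what the merge step amounts to, here is a minimal sketch of concatenating per-object result files. It is not the repository's merge_csv.py: the function name `merge_csv_files` is hypothetical, and the sketch assumes that every input file starts with the same header line.

```python
import csv
from pathlib import Path


def merge_csv_files(input_dir, output_fn):
    """Concatenate all .csv files in input_dir into output_fn, keeping one header.

    Returns the number of data rows written. Assumes every input file starts
    with the same header line.
    """
    rows, header = [], None
    for path in sorted(Path(input_dir).glob("*.csv")):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            rows.extend(reader)
    with open(output_fn, "w", newline="") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Write the output file outside `input_dir`, otherwise a second run would read the previous merge result as an input.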
Difference between ArXiv v1 and v2
The results were reported with the same checkpoints. We fixed a bug that only influences the inference results:
The PnP solver requires the Bbox size to calculate the 2D pixel location in the original image. We modified the Bbox size in the dataloader, but did not propagate this modification to the PnP solver. If you remove get_final_Bbox in the dataloader, you will get the results reported in v1.
The bug has more influence if we resize the Bbox using crop_square_resize. After fixing the bug, we used crop_square_resize for the BOP challenge (instead of crop_resize in the config files in config_paper). We think this resize method should work better since it does not introduce distortion. However, we did not compare the resize methods experimentally.
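The kind of mismatch described above can be illustrated with a small, self-contained sketch. It is not the repository's code: `square_bbox` and `crop_to_image` are hypothetical helpers, the numbers are made up, and the square expansion is only an assumed stand-in for a Bbox that was modified before cropping.

```python
def square_bbox(bbox):
    """Expand (x, y, w, h) to a square around its center (hypothetical helper)."""
    x, y, w, h = bbox
    side = max(w, h)
    cx, cy = x + w / 2, y + h / 2
    return (cx - side / 2, cy - side / 2, side, side)


def crop_to_image(u, v, bbox, crop_size):
    """Map pixel (u, v) of a crop_size x crop_size crop back to image coordinates."""
    x, y, w, h = bbox
    return (x + u * w / crop_size, y + v * h / crop_size)


if __name__ == "__main__":
    detected = (100, 50, 40, 80)  # x, y, w, h of a made-up detection
    used_for_crop = square_bbox(detected)
    # Consistent: the same box is used for cropping and for mapping back.
    print(crop_to_image(32, 64, used_for_crop, 128))  # (100.0, 90.0)
    # Inconsistent: the crop used the square box, the mapping uses the old one.
    print(crop_to_image(32, 64, detected, 128))  # (110.0, 90.0)
```

In this toy case the inconsistent mapping shifts the 2D location by 10 pixels, which is the sort of error a PnP solver would then receive as a 2D-3D correspondence.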
Acknowledgement
The original code has been developed together with Mahdi Saleh. Some code is adapted from Pix2Pose, SingleShotPose, GDR-Net, and Deeplabv3.
Citation
@article{su2022zebrapose,
title={ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation},
author={Su, Yongzhi and Saleh, Mahdi and Fetzer, Torben and Rambach, Jason and Navab, Nassir and Busam, Benjamin and Stricker, Didier and Tombari, Federico},
journal={arXiv preprint arXiv:2203.09418},
year={2022}
}