RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization
Yan Xu, Kwan-Yee Lin, Guofeng Zhang, Xiaogang Wang, Hongsheng Li.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
1. Framework
The basic pipeline of our proposed RNNPose. (a) Before refinement, a reference image is rendered according to the object's initial pose (shown in a fused view). (b) Our RNN-based framework recurrently refines the object pose based on the correspondence field estimated between the reference and target images. The pose is optimized, via differentiable Levenberg-Marquardt (LM) optimization, to be consistent with the reliable correspondence estimates highlighted by the similarity score map (built from learned 3D-2D descriptors). (c) The refined output pose.
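To illustrate how a per-correspondence confidence can steer an LM pose update, here is a minimal toy sketch. It is not the RNNPose implementation: it aligns 2D points with a rigid motion (theta, tx, ty) instead of a 6-DoF pose, is not differentiable end to end, and uses hand-set weights in place of the learned similarity scores. All function names are hypothetical.

```python
import math

def transform(params, point):
    """Apply the 2D rigid motion (theta, tx, ty) to a point."""
    theta, tx, ty = params
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)

def residuals_and_jacobian(params, src, dst):
    """Residuals T(p) - q and their Jacobian w.r.t. (theta, tx, ty)."""
    c, s = math.cos(params[0]), math.sin(params[0])
    res, jac = [], []
    for (px, py), (qx, qy) in zip(src, dst):
        mx, my = transform(params, (px, py))
        res += [mx - qx, my - qy]
        jac += [[-s * px - c * py, 1.0, 0.0], [c * px - s * py, 0.0, 1.0]]
    return res, jac

def weighted_cost(res, weights):
    # Each point contributes two residual rows that share one weight.
    return sum(weights[i // 2] * r * r for i, r in enumerate(res))

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= f * m[col][k]
    x = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, 3))
        x[row] = (m[row][3] - tail) / m[row][row]
    return x

def weighted_lm(src, dst, weights, iters=50, lam=1e-3):
    """Levenberg-Marquardt on the weighted squared residuals."""
    params = [0.0, 0.0, 0.0]
    res, jac = residuals_and_jacobian(params, src, dst)
    cost = weighted_cost(res, weights)
    for _ in range(iters):
        # Weighted normal equations: H = J^T W J, g = J^T W r.
        h = [[0.0] * 3 for _ in range(3)]
        g = [0.0] * 3
        for i, (r, row) in enumerate(zip(res, jac)):
            w = weights[i // 2]
            for a in range(3):
                g[a] += w * row[a] * r
                for b in range(3):
                    h[a][b] += w * row[a] * row[b]
        if max(abs(v) for v in g) < 1e-12:
            break  # gradient vanished: converged
        # Damp the diagonal, then solve for the update step.
        damped = [[h[a][b] * (1.0 + lam) if a == b else h[a][b]
                   for b in range(3)] for a in range(3)]
        step = solve3(damped, [-v for v in g])
        trial = [p + d for p, d in zip(params, step)]
        t_res, t_jac = residuals_and_jacobian(trial, src, dst)
        t_cost = weighted_cost(t_res, weights)
        if t_cost < cost:   # accept the step and trust the model more
            params, res, jac, cost = trial, t_res, t_jac, t_cost
            lam = max(lam / 10.0, 1e-12)
        else:               # reject the step and increase damping
            lam *= 10.0
    return params

# Toy data: five matches, the last one corrupted (standing in for an occluded region).
true_pose = (0.3, 0.5, -0.2)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
dst = [transform(true_pose, p) for p in src]
dst[-1] = (dst[-1][0] + 3.0, dst[-1][1] - 2.0)
confident = weighted_lm(src, dst, weights=[1, 1, 1, 1, 0])  # bad match downweighted
uniform = weighted_lm(src, dst, weights=[1, 1, 1, 1, 1])    # bad match trusted
```

With the corrupted match given zero weight, the estimate is driven only by the consistent matches; with uniform weights the same outlier pulls the solution away from the true motion.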
<!-- ![image info](./demo/framework.png) -->
<p align="center"> <img src="./demo/idea.png" alt="alt text" width="450"/> </p>
2. Pose Estimation with Occlusions and Erroneous Pose Initializations
Estimated Poses and Intermediate System Outputs from Different Recurrent Iterations.
<p align="center"> <img src="demo/ape_short_small.gif" alt="animated" height=400/><img src="demo/driller_short_small.gif" alt="animated" height=400/> </p>
Pose Estimates with Erroneous Pose Initializations
Visualization of our pose estimates (first row) on the Occlusion LINEMOD dataset and the similarity score maps (second row) used to downweight unreliable correspondences during pose optimization. In the pose visualization, white boxes show the erroneous initial poses, red boxes show the poses estimated by our algorithm, and blue boxes show the ground truth. The initial poses for refinement come from PVNet, with significant disturbances added for robustness testing.
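One way such a disturbed initialization could be produced is sketched below: a random rotation about a random axis is composed with the given rotation, and Gaussian noise is added to the translation. This is an illustrative assumption, not the procedure used for the figure; the function names and the noise magnitudes (20 degrees, 0.05 translation units) are hypothetical.

```python
import math
import random

def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(axis, angle):
    """Rotation matrix for a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    k = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    k2 = matmul3(k, k)
    s, c = math.sin(angle), 1.0 - math.cos(angle)
    return [[float(i == j) + s * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]

def perturb_pose(rot, trans, max_angle_deg, trans_sigma, rng):
    """Return a disturbed copy of the pose (rot, trans)."""
    # Random unit axis from a normalized Gaussian sample.
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(v * v for v in axis))
    axis = [v / norm for v in axis]
    # Rotation angle drawn uniformly from [0, max_angle_deg].
    angle = math.radians(rng.uniform(0.0, max_angle_deg))
    noisy_rot = matmul3(rodrigues(axis, angle), rot)
    noisy_trans = [t + rng.gauss(0.0, trans_sigma) for t in trans]
    return noisy_rot, noisy_trans

def rotation_angle_deg(rot_a, rot_b):
    """Geodesic angle between two rotations, in degrees."""
    rel = matmul3(rot_a, [list(col) for col in zip(*rot_b)])  # rot_a @ rot_b^T
    cos_angle = (rel[0][0] + rel[1][1] + rel[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

# Hypothetical example: disturb an identity pose placed one unit along z.
rng = random.Random(0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
init_rot, init_trans = perturb_pose(identity, [0.0, 0.0, 1.0], 20.0, 0.05, rng)
```

The helper rotation_angle_deg can be used to confirm how far a disturbed pose is from the original before feeding it to a refiner.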
<center class="half"> <img src="./demo/est_vis.png" height=200 > </center>
3. Installation
Install the Docker
A dockerfile is provided to help with the environment setup. You need to install docker and nvidia-docker2 first and then set up the docker image and start up a container with the following commands:
cd RNNPose/docker
sudo docker build -t rnnpose .
sudo docker run -it --runtime=nvidia --ipc=host --volume="HOST_VOLUME_YOU_WANT_TO_MAP:DOCKER_VOLUME" -e DISPLAY=$DISPLAY -e QT_X11_NO_MITSHM=1 rnnpose bash
If you are not familiar with docker, you could also install the dependencies manually following the provided dockerfile.
Compile the Dependencies
cd RNNPose/scripts
bash compile_3rdparty.sh
4. Data Preparation
We follow DeepIM and PVNet to preprocess the training data for LINEMOD. You could follow the steps here for data preparation.
5. Test with the Pretrained Models
We train our model on a mixture of real and synthetic data from the LINEMOD dataset.
<!-- and evaluate the trained models on the test set of LINEMOD and LINEMOD OCCLUSION datasets. -->
The models trained on the LINEMOD dataset have been uploaded to OneDrive. Download them and put them where the pretrain path in the example script points (weights/trained_models/ under the project root) for testing.
An example bash script is provided below for reference.
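# Set PROJECT_ROOT_PATH to the absolute path of the RNNPose project folder before running.
export PYTHONPATH="$PROJECT_ROOT_PATH:$PYTHONPATH"
export PYTHONPATH="$PROJECT_ROOT_PATH/thirdparty:$PYTHONPATH"
seq=cat            # LINEMOD object to evaluate
gpu=1              # number of GPUs (passed to --gpus_per_node and --world_size)
start_gpu_id=0
# NOTE: model_dir is not defined in this script; set it to your output directory first.
mkdir -p "$model_dir"
# The two paths below are the authors' local absolute paths; point them at your own checkout.
train_file=/home/yxu/Projects/Works/RNNPose_release/tools/eval.py
config_path=/mnt/workspace/Works/RNNPose_release/config/linemod/"$seq"_fw0.5.yml
pretrain=$PROJECT_ROOT_PATH/weights/trained_models/"$seq".tckpt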
python -u $train_file multi_proc_train \
--config_path $config_path \
--model_dir $model_dir/results \
--use_dist True \
--dist_port 10000 \
--gpus_per_node $gpu \
--optim_eval True \
--use_apex True \
--world_size $gpu \
--start_gpu_id $start_gpu_id \
--pretrained_path $pretrain
Note that before executing the commands you need to set PROJECT_ROOT_PATH to the absolute path of the project folder RNNPose and update the data paths in the configuration files to the locations of the downloaded data. You can also use the provided scripts below for evaluation.
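A small pre-flight check can catch path mistakes before a run is launched. The sketch below is a hypothetical helper, not part of the repository; it assumes that tools/eval.py and config/linemod/ sit directly under the project root, alongside the thirdparty and weights/trained_models locations used in the example script.

```python
import os

def missing_eval_inputs(project_root, seq):
    """Return the expected files/directories that do not exist under project_root."""
    expected = [
        os.path.join(project_root, "thirdparty"),
        os.path.join(project_root, "tools", "eval.py"),
        os.path.join(project_root, "config", "linemod", seq + "_fw0.5.yml"),
        os.path.join(project_root, "weights", "trained_models", seq + ".tckpt"),
    ]
    return [path for path in expected if not os.path.exists(path)]

# Example: report problems for the "cat" sequence before launching the evaluation.
# root = os.environ.get("PROJECT_ROOT_PATH", "")
# for path in missing_eval_inputs(root, "cat"):
#     print("missing:", path)
```

An empty list means every assumed input was found; the data paths inside the configuration files still need to be checked by hand.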
Evaluation on LINEMOD
bash scripts/eval.sh
Evaluation on LINEMOD OCCLUSION
bash scripts/eval_lmocc.sh
Training from Scratch
An example training script is provided.
bash scripts/train.sh
6. Citation
If you find our code useful, please cite our paper.
@inproceedings{xu2022rnnpose,
title={RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization},
author={Xu, Yan and Lin, Kwan-Yee and Zhang, Guofeng and Wang, Xiaogang and Li, Hongsheng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2022}
}
@article{xu2024rnnpose,
title={RNNPose: 6-DoF Object Pose Estimation via Recurrent Correspondence Field Estimation and Pose Optimization},
author={Xu, Yan and Lin, Kwan-Yee and Zhang, Guofeng and Wang, Xiaogang and Li, Hongsheng},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2024},
publisher={IEEE}
}
7. Acknowledgement
The skeleton of this code is borrowed from RSLO. We would also like to thank the authors of the public codebases PVNet, RAFT, SuperGlue, and DeepV2D.