FFB6D

This is the official source code for the CVPR 2021 oral paper, FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation. (Arxiv, Video_Bilibili, Video_YouTube)

Table of Contents

Introduction & Citation

<div align=center><img width="100%" src="figs/FFB6D_overview.png"/></div>

FFB6D is a general framework for representation learning from a single RGBD image. We apply it to 6D pose estimation by cascading the downstream prediction heads for instance semantic segmentation and 3D keypoint voting from PVN3D (Arxiv, Code, Video). At the representation learning stage, we build bidirectional fusion modules into the full flow of the two networks, one for the RGB image and one for the point cloud, applying fusion at every encoding and decoding layer. Each network can thus draw on local and global complementary information from the other to learn better representations. At the output representation stage, we design a simple but effective 3D keypoint selection algorithm that considers both the texture and the geometry of objects, which simplifies keypoint localization for precise pose estimation.
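To make the keypoint selection step more concrete, here is a minimal sketch of farthest point sampling (FPS), the greedy step that, as described in the paper, picks the final keypoints from a set of texture-salient 3D candidates. It is not the code from this repository: the function name `farthest_point_sampling`, its parameters, and the toy candidate points are assumptions made for illustration, and the candidates are taken as given rather than detected from object textures.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def farthest_point_sampling(
    candidates: Sequence[Point], k: int, start: int = 0
) -> List[int]:
    """Greedily pick k candidate indices that are mutually far apart.

    Hypothetical helper for illustration; `candidates` stands in for 3D
    points on the object surface that were already judged texture-salient.
    """
    if not 0 < k <= len(candidates):
        raise ValueError("k must be between 1 and the number of candidates")
    if not 0 <= start < len(candidates):
        raise ValueError("start must be a valid candidate index")

    selected = [start]
    # min_dist[i] is the distance from candidate i to its nearest selected point.
    min_dist = [math.dist(p, candidates[start]) for p in candidates]

    while len(selected) < k:
        # The next keypoint is the candidate farthest from everything chosen so far.
        nxt = max(range(len(candidates)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Refresh nearest-selected distances against the newly chosen point.
        for i, p in enumerate(candidates):
            d = math.dist(p, candidates[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


if __name__ == "__main__":
    # Toy candidates: four spread-out points plus two that sit close to others.
    pts = [
        (0.0, 0.0, 0.0),
        (0.1, 0.0, 0.0),
        (1.0, 0.0, 0.0),
        (0.0, 1.0, 0.0),
        (0.0, 0.0, 1.0),
        (0.5, 0.5, 0.5),
    ]
    print(farthest_point_sampling(pts, k=4))
```

Because each pick maximizes the distance to the points already chosen, the selected keypoints spread over the object, which is the geometric half of the selection criterion; the texture half comes from how the candidate set is built and is outside this sketch.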

Please cite FFB6D & PVN3D if you use this repository in your publications:

@InProceedings{He_2021_CVPR,
author = {He, Yisheng and Huang, Haibin and Fan, Haoqiang and Chen, Qifeng and Sun, Jian},
title = {FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}

@InProceedings{He_2020_CVPR,
author = {He, Yisheng and Sun, Wei and Huang, Haibin and Liu, Jianran and Fan, Haoqiang and Sun, Jian},
title = {PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

Demo Video

See our demo video on YouTube or bilibili.

Installation

Code Structure

Datasets

Training and Evaluating

Training on the LineMOD Dataset

Evaluating on the LineMOD Dataset

Demo/visualization on the LineMOD Dataset

Training on the YCB-Video Dataset

Evaluating on the YCB-Video Dataset

Demo/visualization on the YCB-Video Dataset

Results

<div align=center><img width="50%" src="figs/occlusion.png"/></div>

Adaptation to New Dataset

License

Licensed under the MIT License.