<div align="center">

Deep Learning-Based Object Pose Estimation: A Comprehensive Survey

Jian Liu, Wei Sun, Hui Yang, Zhiwen Zeng, Chongpei Liu,

Jin Zheng, Xingyu Liu, Hossein Rahmani, Nicu Sebe, Ajmal Mian

Introduction | Datasets | Instance-Level | Category-Level | Unseen | Applications

</div>

Note: If you notice any missing or recently published papers, feel free to open a pull request; we will add them as soon as possible :)

Introduction

This is the official repository of "Deep Learning-Based Object Pose Estimation: A Comprehensive Survey". We first introduce the datasets used for object pose estimation. We then review instance-level, category-level, and unseen methods in turn. Finally, we summarize the common applications of this task. The taxonomy of this survey is shown below.

<p align="center"> <img src="./resources/taxonomy.png" width="100%"> </p>

A comparison of instance-level, category-level, and unseen methods is shown below. Instance-level methods can only estimate the pose of the specific object instances they were trained on. Category-level methods can handle unseen instances within the trained categories, rather than being limited to the specific instances in the training data. In contrast, unseen object pose estimation methods have the strongest generalization ability and can handle object categories not encountered during training.

<p align="center"> <img src="./resources/Fig0.jpg" width="100%"> </p>
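As background, the pose estimated by all of these methods is a rigid transform: a 3x3 rotation matrix R and a translation vector t that map points from the object's coordinate frame into the camera frame. A minimal, self-contained sketch (illustrative values only; real methods predict R and t from RGB or RGB-D input):

```python
import math

def apply_pose(R, t, points):
    """Map 3D model points from object coordinates into camera coordinates."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

# Example pose: a 90-degree rotation about the z-axis plus a 1 m translation along x.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]

model = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(apply_pose(R, t, model))
```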

Datasets

Chronological overview of the datasets for object pose estimation evaluation. The pink arrows indicate the BOP Challenge datasets, which can be used to evaluate both instance-level and unseen object methods, and the red references indicate articulated object datasets.

<p align="center"> <img src="./resources/datasets.png" width="100%"> </p>

2.1 Datasets for Instance-Level Methods

<details> <summary>All Datasets</summary> </details>

2.2 Datasets for Category-Level Methods

<details> <summary>2.2.1 Rigid Objects Datasets</summary> </details> <details> <summary>2.2.2 Articulated Objects Datasets</summary> </details>

2.3 Datasets for Unseen Methods

<details> <summary>All Datasets</summary> </details>

Instance-Level

Instance-level object pose estimation is the task of estimating the pose of objects that were seen during training. We classify existing instance-level methods into four categories: correspondence-based, template-based, voting-based, and regression-based methods. These four categories are illustrated below.

<p align="center"> <img src="./resources/Fig3.jpg" width="100%"> </p>
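Instance-level methods are commonly evaluated with the ADD metric (average distance of model points): the object's 3D model points are transformed by both the ground-truth and the predicted pose, and the point-wise distances are averaged. A minimal plain-Python sketch (toy model points; real evaluations use the full CAD model and typically threshold ADD at 10% of the model diameter):

```python
import math

def transform(R, t, p):
    """Apply a rigid transform (R, t) to a single 3D point."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def add_metric(R_gt, t_gt, R_pred, t_pred, model_points):
    """ADD: mean distance between model points under the ground-truth
    pose and under the predicted pose (lower is better)."""
    total = 0.0
    for p in model_points:
        total += math.dist(transform(R_gt, t_gt, p),
                           transform(R_pred, t_pred, p))
    return total / len(model_points)

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = [(0.05, 0.0, 0.0), (0.0, 0.05, 0.0), (0.0, 0.0, 0.05)]

# A prediction that is off by 2 cm in translation scores ADD = 0.02 m.
print(add_metric(I3, [0.0, 0.0, 0.0], I3, [0.02, 0.0, 0.0], model))
```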

3.1 Correspondence-Based Methods

<details> <summary>3.1.1 Sparse Correspondence Methods</summary>

2017

2018

2019

2020

2021

2023

</details> <details> <summary>3.1.2 Dense Correspondence Methods</summary>

2019

2020

2021

2022

2023

2024

</details>

3.2 Template-Based Methods

<details> <summary>3.2.1 RGB-Based Template Methods</summary>

2018

2019

2020

2021

</details> <details> <summary>3.2.2 Point Cloud-Based Template Methods</summary>

2022

2023

2024

</details>

3.3 Voting-Based Methods

<details> <summary>3.3.1 Indirect Voting Methods</summary>

2019

2020

2021

2022

2023

2024

</details> <details> <summary>3.3.2 Direct Voting Methods</summary>

2019

2020

2021

2022

2023

</details>

3.4 Regression-Based Methods

<details> <summary>3.4.1 Geometry-Guided Regression Methods</summary>

2020

2021

2022

2023

2024

</details> <details> <summary>3.4.2 Direct Regression Methods</summary>

2017

2018

2019

2020

2021

2022

2023

2024

</details>

Category-Level

Research on category-level methods has garnered significant attention due to their potential to generalize to unseen objects within established categories. We classify them into two categories: shape prior-based and shape prior-free methods. These two categories are illustrated below.

<p align="center"> <img src="./resources/Fig4.jpg" width="100%"> </p>
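Category-level methods are often evaluated with metrics such as 5°5cm, which thresholds the rotation and translation errors separately. A sketch of the rotation part, the geodesic angle between the predicted and ground-truth rotation matrices:

```python
import math

def rotation_error_deg(R1, R2):
    """Geodesic angle between two rotation matrices:
    theta = arccos((trace(R1^T R2) - 1) / 2), in degrees."""
    trace = sum(R1[k][i] * R2[k][i] for i in range(3) for k in range(3))
    # Clamp for numerical safety before arccos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c, s = math.cos(math.radians(10)), math.sin(math.radians(10))
Rz10 = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

print(rotation_error_deg(I3, Rz10))  # ~10 degrees
```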

4.1 Shape Prior-Based Methods

<details> <summary>4.1.1 NOCS Shape Alignment Methods</summary>

2020

2021

2022

2023

2024

</details> <details> <summary>4.1.2 Direct Pose Regression Methods</summary>

2022

2023

2024

</details>

4.2 Shape Prior-Free Methods

<details> <summary>4.2.1 Depth-Guided Geometry-Aware Methods</summary>

2020

2021

2022

2023

2024

</details> <details> <summary>4.2.2 RGBD-Guided Semantic and Geometry Fusion Methods</summary>

2019

2020

2021

2022

2023

2024

</details> <details> <summary>4.2.3 Others</summary>

2020

2021

2022

2023

2024

</details>

Unseen

Unseen object pose estimation methods can generalize to unseen objects without retraining. We classify existing unseen methods into two categories: CAD model-based and manual reference view-based methods. These two categories are illustrated below.

<p align="center"> <img src="./resources/Fig5.jpg" width="100%"> </p>
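At a high level, template matching-based unseen methods compare a feature descriptor of the query observation against descriptors of templates rendered (or captured) at known viewpoints, and take the best match as the coarse pose. A toy sketch with hand-made descriptors (real methods use learned features, e.g. from a vision transformer backbone):

```python
import math

def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def best_template(query, templates):
    """Return the viewpoint label of the template whose descriptor
    is most similar to the query descriptor."""
    return max(templates, key=lambda item: cosine(query, item[1]))[0]

# Toy descriptors for templates rendered from three viewpoints of a CAD model.
templates = [
    ("front", [1.0, 0.0, 0.1]),
    ("side",  [0.0, 1.0, 0.1]),
    ("top",   [0.1, 0.1, 1.0]),
]
query = [0.9, 0.1, 0.2]
print(best_template(query, templates))  # prints "front"
```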

5.1 CAD Model-Based Methods

<details> <summary>5.1.1 Feature Matching-Based Methods</summary>

2019

2020

2022

2023

2024

</details> <details> <summary>5.1.2 Template Matching-Based Methods</summary>

2015-2019

2020

2021

2022

2023

2024

</details>

5.2 Manual Reference View-Based Methods

<details> <summary>5.2.1 Feature Matching-Based Methods</summary>

2021

2022

2023

2024

</details> <details> <summary>5.2.2 Template Matching-Based Methods</summary>

2020

2022

2023

2024

</details>

Applications

Chronological overview of some representative applications of object pose estimation methods. The black, red, and orange references represent applications of instance-level, category-level, and unseen methods, respectively.

<p align="center"> <img src="./resources/applications.png" width="100%"> </p>

6.1 Robotic Manipulation

<details> <summary>6.1.1 Instance-Level Manipulation</summary> </details> <details> <summary>6.1.2 Category-Level Manipulation</summary> </details> <details> <summary>6.1.3 Unseen Object Manipulation</summary> </details>

6.2 Augmented Reality/Virtual Reality

<details> <summary>All Methods</summary> </details>

6.3 Aerospace

<details> <summary>All Methods</summary> </details>

6.4 Hand-Object Interaction

<details> <summary>All Methods</summary> </details>

6.5 Autonomous Driving

<details> <summary>All Methods</summary> </details>

Citation

If you find our survey useful, please cite our paper:

```bibtex
@article{liu2024survey,
  title={Deep Learning-Based Object Pose Estimation: A Comprehensive Survey},
  author={Liu, Jian and Sun, Wei and Yang, Hui and Zeng, Zhiwen and Liu, Chongpei and Zheng, Jin and Liu, Xingyu and Rahmani, Hossein and Sebe, Nicu and Mian, Ajmal},
  journal={arXiv preprint arXiv:2405.07801},
  year={2024}
}
```

Contact

Our knowledge is inevitably limited, so if you find any issues or have any suggestions, please feel free to open an issue or contact us via email.