Awesome Object Pose Estimation [Paper List]

A repo summarizing resources for object pose estimation and viewpoint estimation.

In the following tables, a 3D CAD model is referred to as a model and an object pictured in a 2D image is referred to as an object.

Contributing

Contributions are welcome. Please see the Table of Contents, which lists what is included in this repo. If you wish to contribute within these boundaries, feel free to send a PR. If you have suggestions for new sections, please raise an issue to discuss them before sending a PR.

Table of Contents

Objects in the wild :camera:

In this table, Pix3D provides accurate 2D-3D alignment, while the other datasets provide only coarse alignment.

PASCAL3D+ is the de facto benchmark used for viewpoint estimation.

| Dataset | Annotation | Statistics | Reference |
| --- | --- | --- | --- |
| Objectron | 3D Bounding Box | 15K annotated videos and 4M annotated images | CVPR 2021 |
| ApolloCar3D | 6D Pose + Mask | 34 car models with 60K+ objects in 5,277 images | CVPR 2019 |
| Pix3D | 6D Pose + Mask | 9 categories containing 395 models in 10,069 images | CVPR 2018 |
| ObjectNet3D | Euler Angles + Bounding Box | 100 categories with 201,888 objects in 90,127 images | ECCV 2016 |
| PASCAL3D+ | Euler Angles + Bounding Box | 12 categories with 36,292 objects in 30,889 images | WACV 2014 |
| KITTI | 3D Bounding Box | 80,256 objects in 14,999 images | CVPR 2012 |

Objects in controlled environments :movie_camera:

This table lists the datasets commonly known as BOP: Benchmark for 6D Object Pose Estimation, which provide accurate 3D object models and accurate 2D-3D alignment.
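To make "2D-3D alignment" concrete: given a 6D pose (rotation R and translation t from the model frame to the camera frame) and camera intrinsics K, every 3D model point can be projected to a pixel. The sketch below is only an illustration of the standard pinhole model. The function name `project_points` is hypothetical, and it assumes zero skew and points in front of the camera.

```python
def project_points(points, R, t, K):
    """Project 3D model points to pixels with a 6D pose (R, t) and intrinsics K.

    points: list of (x, y, z) in the model frame
    R: 3x3 rotation (nested lists), t: 3-vector, K: 3x3 intrinsics (nested lists)
    Assumption: zero skew (K[0][1] is ignored) and positive depth for all points.
    """
    pixels = []
    for X in points:
        # Model frame -> camera frame: Xc = R @ X + t
        Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        # Pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy
        u = K[0][0] * Xc[0] / Xc[2] + K[0][2]
        v = K[1][1] * Xc[1] / Xc[2] + K[1][2]
        pixels.append((u, v))
    return pixels
```

An accurately aligned dataset is one where the model points projected this way land on the pictured object; with coarse alignment they only roughly overlap it.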

You can download all the BOP datasets here and use the toolkit provided by the organizers.

After downloading the data, you can use our code data/BOP/ply2obj.py to convert original .ply files to .obj files, and run data/BOP/create_annotation.py to create a single annotation file for all the scenes in a dataset.
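For readers who want to see what a .ply to .obj conversion involves, here is a minimal sketch. It is not the contents of data/BOP/ply2obj.py. The function name `ply_to_obj` is hypothetical, and the sketch assumes an ASCII PLY file whose vertex element comes before the face element and whose first three vertex properties are x, y, z. Normals, colors, and texture coordinates are dropped.

```python
def ply_to_obj(ply_text):
    """Convert the text of an ASCII PLY mesh to Wavefront OBJ text (geometry only)."""
    lines = ply_text.splitlines()
    n_vertices = n_faces = 0
    for i, line in enumerate(lines):
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "format" and tokens[1] != "ascii":
            raise ValueError("only ASCII PLY is handled in this sketch")
        if tokens[:2] == ["element", "vertex"]:
            n_vertices = int(tokens[2])
        elif tokens[:2] == ["element", "face"]:
            n_faces = int(tokens[2])
        elif tokens[0] == "end_header":
            body_start = i + 1
            break
    else:
        raise ValueError("missing end_header")

    body = lines[body_start:]
    out = []
    # Assumption: x, y, z are the first three properties of each vertex line.
    for line in body[:n_vertices]:
        x, y, z = line.split()[:3]
        out.append(f"v {x} {y} {z}")
    for line in body[n_vertices:n_vertices + n_faces]:
        tokens = line.split()
        count = int(tokens[0])
        # PLY vertex indices are 0-based, OBJ indices are 1-based.
        indices = [str(int(tok) + 1) for tok in tokens[1:1 + count]]
        out.append("f " + " ".join(indices))
    return "\n".join(out) + "\n"
```

Binary PLY files are rejected by this sketch; converting those needs a real mesh library.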

The dataset format is described here. We use an instance id in our annotation to distinguish different instances pictured in the same image.
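As a rough illustration of merging per-scene annotations into one file, the sketch below walks the scene folders of one split and flattens every `scene_gt.json` into a single list. It is not the repo's data/BOP/create_annotation.py. The function name `merge_scene_annotations` and the output record layout are hypothetical, and the instance id is assumed here to be the position of the instance within its image's annotation list. The input keys (`obj_id`, `cam_R_m2c`, `cam_t_m2c`) follow the BOP dataset format.

```python
import json
from pathlib import Path


def merge_scene_annotations(split_dir):
    """Collect the ground-truth poses of every scene under split_dir into one list."""
    records = []
    # Each scene folder (e.g. 000001/) holds a scene_gt.json in the BOP format.
    for gt_path in sorted(Path(split_dir).glob("*/scene_gt.json")):
        scene_id = gt_path.parent.name
        with open(gt_path) as f:
            scene_gt = json.load(f)
        # Keys are image ids stored as strings; sort them numerically.
        for im_id, instances in sorted(scene_gt.items(), key=lambda kv: int(kv[0])):
            for inst_id, inst in enumerate(instances):
                records.append({
                    "scene_id": scene_id,
                    "im_id": int(im_id),
                    # Hypothetical choice: index within the image's annotation list.
                    "inst_id": inst_id,
                    "obj_id": inst["obj_id"],
                    "cam_R_m2c": inst["cam_R_m2c"],
                    "cam_t_m2c": inst["cam_t_m2c"],
                })
    return records
```

The returned list can be written out with `json.dump` to obtain a single annotation file. Two instances of the same object in one image share an `obj_id` but get different `inst_id` values.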

| Dataset | Annotation | Statistics | Reference |
| --- | --- | --- | --- |
| GraspNet-1B | 6D Pose + Depth + Mask | 88 models in 190 videos with 97,280 frames | CVPR 2020 |
| NOCS | 6D Pose + Depth + NOCS | 6 categories in 300K composited images and 8K real images | CVPR 2019 |
| YCBInEOAT | 6D Pose + RGBD Video | 5 models in 9 videos with 7,449 frames with moving objects | IROS 2020 |
| YCB-Video | 6D Pose + Depth + Mask | 21 models in 92 videos with 133,827 frames | RSS 2018 |
| T-LESS | 6D Pose + Depth | 30 models in 20 videos with ~49K frames | WACV 2017 |
| Doumanoglou | 6D Pose + Depth | 2 models in 3 videos with 183 frames | CVPR 2016 |
| Tejani | 6D Pose + Depth | 6 models in 6 videos with 2,067 frames | ECCV 2014 |
| Occluded-LINEMOD | 6D Pose + Depth | 8 models in 1,214 frames with 8,992 objects | ECCV 2014 |
| LINEMOD | 6D Pose + Depth for one object | 15 models in 15 videos with 18,273 frames | ACCV 2012 |

3D model datasets :bike:

To test a network's ability to generalize (that is, to evaluate it on images containing 3D models not seen during training), the following datasets can be used to generate synthetic training data.

Note that ABC contains generic, arbitrary industrial CAD models, while ShapeNetCore and ModelNet contain common object categories such as cars and chairs.

| Dataset | Categories | Models in total | Reference |
| --- | --- | --- | --- |
| ABC | - | 1 million | CVPR 2019 |
| ShapeNetCore | 55 | ~51,300 | arXiv 2015 |
| ModelNet-40 | 40 | 12,311 | CVPR 2015 |

Rendering methods :mountain_bicyclist:

Differentiable Renderer

Blender Render

Physical Simulator

PyBullet: a very popular physics simulator in the robotics community.

Others