# ClearPose
This is the official repository for the ECCV 2022 paper 'ClearPose: Large-scale Transparent Object Dataset and Benchmark'. (ArXiv, Video)
## Dataset
DropBox link
The ClearPose dataset was captured with a RealSense L515 camera in indoor environments and covers 63 transparent objects. It contains RGB images, raw depth, ground truth depth (generated by overlaying the rendered objects' depth onto the raw depth), ground truth surface normals (computed from the ground truth depth), and the 6D poses of all object instances. Annotation was performed with the tool ProgressLabeler, which uses ORB-SLAM3 to solve the camera trajectory, provides a Blender interface for aligning object poses, and includes a backend renderer to generate the data. The object geometry models were created manually in Blender and verified during the annotation process. We also include models of the opaque objects (from the YCB and HOPE datasets) that appear in some scenes. Object names are defined as shown in the image below (also provided in the Blender scene file in the object model zip folder).
ClearPose is divided into 9 sets. Set1 contains only transparent chemistry labware, Set2-7 contain only household objects, and Set8-9 additionally include other adversarial factors. Within Set2-7, Set2 and Set3 contain roughly twice as many objects as Set4-7, resulting in heavily cluttered scenes. The objects in Set2 are split between Set4 and Set5, and the objects in Set3 are split between Set6 and Set7.
Each set contains 5-6 scenes. For Set1-7, the scenes differ only in the background behind the transparent objects; we use one scene as the validation/testing set and the others as the training set. Set8 and Set9 cover 4 types of adversarial conditions, each with 3 scenes (Set8 contains scene1-6, Set9 contains scene7-12):
- With opaque distractor objects: Set8_scene1-3
- With colored liquid inside containers: Set8_scene6, Set9_scene9,10
- Non-planar placement (Set1-7 are all captured on a flat tabletop): Set8_scene5, Set9_scene11,12
- With a translucent box cover: Set8_scene4, Set9_scene7,8
The folder structure is as follows:
```
<dataset_path>
|-- set1
|   |-- scene1
|   |   |-- metadata.mat            # scene annotations (see below)
|   |   |-- 000000-color.png        # RGB image
|   |   |-- 000000-depth.png        # raw depth image
|   |   |-- 000000-depth_true.png   # ground truth depth image
|   |   |-- 000000-label.png        # object instance segmentation labels
|   |   |-- 000000-normal_true.png  # ground truth surface normal image
|   |   ...
|-- model
|   |-- <object1>
|   |   |-- <object1>.obj
|   |-- <object2>
|   |   |-- <object2>.obj
|   ...
```
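As a minimal sketch, a single frame can be loaded as below. The `<dataset_path>` placeholder and the frame index are illustrative, and the assumption that the depth PNGs are 16-bit images (scaled by the `factor_depth` field described in the next section) should be verified against your local copy.

```python
# Minimal sketch: load one frame from a scene folder.
# <dataset_path> is a placeholder; the 16-bit depth assumption is ours.
import cv2

prefix = "<dataset_path>/set1/scene1/000000"

rgb = cv2.imread(prefix + "-color.png")                                  # HxWx3 BGR
raw_depth = cv2.imread(prefix + "-depth.png", cv2.IMREAD_UNCHANGED)      # HxW uint16
gt_depth = cv2.imread(prefix + "-depth_true.png", cv2.IMREAD_UNCHANGED)  # HxW uint16
label = cv2.imread(prefix + "-label.png", cv2.IMREAD_UNCHANGED)          # per-pixel object IDs
```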
The metadata.mat file contains the annotations for each scene. For every frame in the scene, it includes the following fields:
- `cls_indexes`: object IDs, n*1 matrix (n = number of visible objects)
- `camera_intrinsics`: [[fx, 0, cx], [0, fy, cy], [0, 0, 1]], 3*3 matrix
- `rotation_translation_matrix`: camera pose matrix [R|t], 3*4 matrix
- `center`: n*2 matrix of object center pixel coordinates
- `factor_depth`: 1000 (depth scale factor; divide raw depth values by it to obtain meters)
- `bbox`: n*4 matrix of object 2D bounding boxes
- `poses`: 3*4*n object pose matrices, each as [rotation matrix | translation vector]
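As a sketch of how these fields can be consumed, the snippet below loads metadata.mat with SciPy and backprojects a raw depth image into a metric point cloud using `camera_intrinsics` and `factor_depth`. The exact key layout inside the loaded MATLAB struct (here assumed to be one entry per zero-padded frame index) should be confirmed by inspecting the file.

```python
# Sketch: read per-frame annotations and backproject depth to 3D points.
# The frame-key layout ("000000", ...) inside metadata.mat is an assumption;
# structs loaded via scipy.io.loadmat may need extra [0, 0] indexing.
import numpy as np
import scipy.io

meta = scipy.io.loadmat("<dataset_path>/set1/scene1/metadata.mat")
frame = meta["000000"]

K = np.asarray(frame["camera_intrinsics"])    # 3x3 intrinsic matrix
poses = np.asarray(frame["poses"])            # 3x4xn object poses [R|t]
factor_depth = float(frame["factor_depth"])   # 1000 depth units per meter

def depth_to_points(depth, K, factor_depth):
    """Backproject a depth image into camera-frame XYZ points (meters)."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v, u = np.indices(depth.shape)                # pixel row/column grids
    z = depth.astype(np.float32) / factor_depth   # raw units -> meters
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)           # HxWx3 point map
```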
## Benchmark experiments
We benchmarked two vision tasks, single-image depth completion and object pose estimation, using end-to-end deep networks trained on the ClearPose dataset. For depth completion, we benchmarked ImplicitDepth and TransCG. For object pose estimation, we benchmarked Xu et al. (this method is not open-source, so we reimplemented it from the original paper) and FFB6D. FFB6D is an RGB-D pose estimation method, and we compare its performance using raw depth, depth completed by TransCG, and ground truth depth.
Because the different deep networks require different Python environments, we keep each one in a separate branch. In each branch, most of the training and inference source code matches the original repository, with a customized dataloader and evaluation code added for our dataset. To reproduce our results or build on our code, please refer to the README in the corresponding branch.
## Training/Test Dataset Split in the Paper
Training set:
- set1: scene1-4
- set4-7: scene1-5

Test set (see the sketch after this list):
- heavy occlusion: set2, set3: all scenes
- new background: set1: scene5; set4-7: scene6
- with opaque objects: set8: scene1-3
- translucent cover: set8: scene4; set9: scene7,8
- non-planar: set8: scene5; set9: scene11,12
- filled with liquid: set8: scene6; set9: scene9,10
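For convenience, the same split can be written down as a plain Python dictionary, e.g. to drive a custom dataloader. The dictionary name and shape here are illustrative, not part of the released code.

```python
# Illustrative encoding of the paper's train/test split; names are hypothetical.
CLEARPOSE_SPLIT = {
    "train": {
        "set1": ["scene1", "scene2", "scene3", "scene4"],
        **{f"set{i}": [f"scene{j}" for j in range(1, 6)] for i in range(4, 8)},
    },
    "test": {
        "heavy_occlusion": {"set2": "all", "set3": "all"},
        "new_background": {"set1": ["scene5"],
                           **{f"set{i}": ["scene6"] for i in range(4, 8)}},
        "opaque_objects": {"set8": ["scene1", "scene2", "scene3"]},
        "translucent_cover": {"set8": ["scene4"], "set9": ["scene7", "scene8"]},
        "non_planar": {"set8": ["scene5"], "set9": ["scene11", "scene12"]},
        "filled_with_liquid": {"set8": ["scene6"], "set9": ["scene9", "scene10"]},
    },
}
```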
## Citation
If you find this project relevant to your work, please consider citing the paper:
```
@inproceedings{chen2022clearpose,
  title={ClearPose: Large-scale Transparent Object Dataset and Benchmark},
  author={Chen, Xiaotong and Zhang, Huijie and Yu, Zeren and Opipari, Anthony and Jenkins, Odest Chadwicke},
  booktitle={European Conference on Computer Vision},
  year={2022}
}
```
## Frequently Asked Questions

### Materials for objects
<details><summary>object material table</summary>

| object name | material |
| --- | --- |
| beaker_1 | glass |
| dropper_1 | plastic |
| dropper_2 | plastic |
| flask_1 | glass |
| funnel_1 | plastic |
| graduated_cylinder_1 | glass |
| graduated_cylinder_2 | plastic |
| pan_1 | plastic |
| pan_2 | plastic |
| pan_3 | glass |
| reagent_bottle_1 | glass |
| reagent_bottle_2 | plastic |
| stick_1 | glass |
| syringe_1 | plastic |
| bottle_1 | glass |
| bottle_2 | glass |
| bottle_3 | glass |
| bottle_4 | glass |
| bottle_5 | glass |
| bowl_1 | glass |
| bowl_2 | glass |
| bowl_3 | glass |
| bowl_4 | glass |
| bowl_5 | glass |
| bowl_6 | glass |
| container_1 | glass |
| container_2 | glass |
| container_3 | glass |
| container_4 | glass |
| container_5 | glass |
| fork_1 | plastic |
| knife_1 | plastic |
| knife_2 | plastic |
| mug_1 | glass |
| mug_2 | glass |
| pitcher_1 | plastic |
| plate_1 | glass |
| plate_2 | glass |
| spoon_1 | plastic |
| spoon_2 | plastic |
| water_cup_1 | glass |
| water_cup_3 | plastic |
| water_cup_4 | glass |
| water_cup_5 | glass |
| water_cup_6 | glass |
| water_cup_7 | glass |
| water_cup_8 | glass |
| water_cup_9 | glass |
| water_cup_10 | glass |
| water_cup_11 | glass |
| water_cup_12 | glass |
| water_cup_13 | plastic |
| water_cup_14 | plastic |
| wine_cup_1 | glass |
| wine_cup_2 | glass |
| wine_cup_3 | glass |
| wine_cup_4 | glass |
| wine_cup_5 | glass |
| wine_cup_6 | glass |
| wine_cup_7 | plastic |
| wine_cup_8 | plastic |
| wine_cup_9 | glass |

</details>
## License
Licensed under the MIT License.