Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
Project Page | Paper | Data
PyTorch implementation of our autoregressive model formulation for 3D bounding-box estimation and detection.
Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
YuXuan Liu<sup>1,2</sup>,
Nikhil Mishra<sup>1,2</sup>,
Maximilian Sieb<sup>1</sup>,
Yide Shentu<sup>1,2</sup>,
Pieter Abbeel<sup>1,2</sup>,
Xi Chen<sup>1</sup> <br>
<sup>1</sup>Covariant.ai, <sup>2</sup>UC Berkeley
in ECCV 2022
Autoregressive 3D Bounding Box Estimation
3D bounding-box estimation assumes that 2D object segmentation has already been performed by a segmentation model of your choice, e.g. Mask R-CNN. Our autoregressive bounding-box estimation model can be found under autoreg-bbox.
Python dependencies are listed in requirements.txt and can be installed via pip install -r requirements.txt.
We provide two Jupyter notebooks:
- visualize_data.ipynb, which lets you visualize data samples from our new dataset COB-3D. We provide code to visualize 2D masks and 3D bounding boxes.
- inference_example.ipynb, which lets you run inference with our newly proposed model architecture for the 3D bounding box estimation task. We provide trained model weights, which you can download here.

Any use of the dataset, code, and weights is subject to our CC Attribution-NonCommercial-ShareAlike License. <br/><br/>
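The exact model class and input format are defined in inference_example.ipynb. Purely as orientation, the sketch below outlines a typical PyTorch inference flow; load_model and the keyword arguments are hypothetical placeholders, not names from this repository.

```python
# Minimal sketch of a typical PyTorch inference flow (hypothetical names).
# `load_model` and the keyword arguments below are placeholders -- see
# inference_example.ipynb for the actual model class and input format.
import torch

def run_inference(load_model, checkpoint_path, rgb, depth, intrinsics, masks):
    """rgb: (H, W, 3), depth: (H, W), intrinsics: (3, 3), masks: (N, H, W)."""
    model = load_model(checkpoint_path)   # build the model and load trained weights
    model.eval()                          # disable dropout / batch-norm updates
    with torch.no_grad():                 # no gradients needed at inference time
        boxes = model(
            rgb=torch.as_tensor(rgb),
            depth=torch.as_tensor(depth),
            intrinsics=torch.as_tensor(intrinsics),
            masks=torch.as_tensor(masks),
        )
    return boxes                          # per-object 3D boxes, e.g. shape (N, 9)
```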
Autoregressive 3D Bounding Box Detection
3D bounding box detection predicts 3D bounding boxes directly from a point cloud.
We forked the repositories of two SOTA methods for the detection task, FCAF3D and PVRCNN, and implemented our autoregressive head on top. The augmented code can be found in the respective folders autoreg-fcaf3d and autoreg-pvrcnn. <br/><br/>
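For intuition only, here is a heavily simplified sketch of what an autoregressive box head can look like: each group of box parameters is sampled conditioned on the object feature and on the parameters sampled before it, which lets the head express correlated uncertainty across dimensions, center, and rotation. This is a conceptual sketch, not the code in autoreg-fcaf3d or autoreg-pvrcnn.

```python
# Conceptual sketch (not this repository's implementation) of an autoregressive
# bounding-box head: parameter groups are predicted step by step, each step
# conditioned on the object feature and the parameters already sampled.
import torch
import torch.nn as nn

class AutoregressiveBoxHead(nn.Module):
    def __init__(self, feat_dim=256, hidden=256):
        super().__init__()
        # One small MLP per step: dims (3), then center (3), then rotation (3).
        self.steps = nn.ModuleList()
        n_prev = 0
        for n_out in (3, 3, 3):
            self.steps.append(nn.Sequential(
                nn.Linear(feat_dim + n_prev, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * n_out),   # predict mean and log-std
            ))
            n_prev += n_out

    def sample(self, feat):
        """feat: (B, feat_dim) per-object feature. Returns (B, 9) box samples."""
        prev = []
        for step in self.steps:
            inp = torch.cat([feat] + prev, dim=-1)
            mean, log_std = step(inp).chunk(2, dim=-1)
            prev.append(mean + log_std.exp() * torch.randn_like(mean))
        return torch.cat(prev, dim=-1)
```

At training time, such a factorization is typically trained with teacher forcing, i.e. each step is conditioned on the ground-truth values of the previously predicted parameters.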
COB-3D Dataset
You can download our newly published dataset of common objects in bins for robotic picking applications here. Any use of the dataset, code, and weights is subject to our CC Attribution-NonCommercial-ShareAlike License. All of the data was created by Theory Studios.
Each data point contains the following:
- RGB image of shape (H, W, 3)
- Depth map of shape (H, W)
- Intrinsic matrix of the camera of shape (3, 3)
- Normals map of shape (H, W, 3)
- Instance masks of shape (N, H, W), where N is the number of objects
- Amodal instance masks of shape (N, H, W), which also include the occluded regions of each object
- 3D bounding box of each object of shape (N, 9), parameterized by dimensions, center, and rotation
For more info and example code on how to load & interact with the data, refer to the visualize_data.ipynb Jupyter notebook.
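As an illustration of how these fields fit together, the sketch below recovers the 8 corners of a single 3D box from its 9 parameters and projects them into the image with the intrinsic matrix. The split into [dimensions, center, rotation] and the axis-angle rotation convention are assumptions made for illustration, and scipy is used for the rotation; check visualize_data.ipynb for the exact conventions used in COB-3D.

```python
# Sketch: recover the 3D corners of one (9,)-parameter box and project them
# into the image with the pinhole intrinsic matrix K.
# ASSUMPTION: the 9 values are [dims(3), center(3), rotation(3)] and the rotation
# is an axis-angle vector -- see visualize_data.ipynb for the actual convention.
import numpy as np
from scipy.spatial.transform import Rotation  # scipy assumed available

def box_corners(box9):
    dims, center, rotvec = box9[:3], box9[3:6], box9[6:9]
    # 8 corners of a unit cube scaled by the box dimensions, centered at the origin.
    signs = np.array([[sx, sy, sz] for sx in (-0.5, 0.5)
                                   for sy in (-0.5, 0.5)
                                   for sz in (-0.5, 0.5)])
    corners = signs * dims                       # (8, 3) in the box frame
    R = Rotation.from_rotvec(rotvec).as_matrix()
    return corners @ R.T + center                # (8, 3) in the camera frame

def project(points_3d, K):
    """Project (M, 3) camera-frame points to (M, 2) pixel coordinates."""
    uvw = points_3d @ K.T                        # pinhole projection
    return uvw[:, :2] / uvw[:, 2:3]
```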
<img src='imgs/rgb.png' height=200/> <img src='imgs/segm.png' height=200/> <img src='imgs/bbox3d.gif' height=200/>

License
This work, including the paper, code, weights, and dataset, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.