# GS-Pose: Cascaded Framework for Generalizable Segmentation-based 6D Object Pose Estimation
- [Project Page]
- [Paper](https://arxiv.org/abs/2403.10683)
## Citation

```bibtex
@article{cai_2024_GSPose,
  author  = {Cai, Dingding and Heikkil\"a, Janne and Rahtu, Esa},
  title   = {GS-Pose: Cascaded Framework for Generalizable Segmentation-based 6D Object Pose Estimation},
  journal = {arXiv preprint arXiv:2403.10683},
  year    = {2024},
}
```
## Setup
Please start by installing Miniconda3. This repository contains submodules, so clone it recursively; the default environment can then be installed as follows.

```bash
git clone git@github.com:dingdingcai/GSPose.git --recursive
cd GSPose
conda env create -f environment.yml
conda activate gspose
bash install_env.sh
```
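A quick way to confirm the environment is usable is to check that PyTorch sees the GPU. The snippet below is a minimal sanity check, assuming the `gspose` environment provides PyTorch built with CUDA support (the 3D Gaussian Splatting components require a GPU).

```python
# Minimal environment sanity check. Assumes the gspose environment
# provides PyTorch with CUDA support; adjust if your setup differs.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```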
## Pre-trained Model

Download the pretrained weights and store them as `checkpoints/model_wights.pth`.
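To verify the download, you can try loading the checkpoint with PyTorch. This is only a hedged sketch: it assumes the file is a standard `torch.save`-serialized object, and the actual key structure may differ.

```python
# Load the checkpoint on CPU just to confirm the file is intact.
# The path matches the location given above; the assumption that the
# file is a plain torch-serialized dict may not hold exactly.
import torch

ckpt = torch.load("checkpoints/model_wights.pth", map_location="cpu")
if isinstance(ckpt, dict):
    print("Top-level keys:", list(ckpt.keys())[:10])
else:
    print("Loaded object of type:", type(ckpt).__name__)
```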
## Demo Example

An example of using GS-Pose for both pose estimation and tracking is provided in the demo notebook.
## Datasets
Our evaluation is conducted on the LINEMOD and OnePose-LowTexture datasets.
- For comparison with Gen6D, download `LINEMOD_Gen6D`.
- For comparison with OnePose++, download `lm` and the YOLOv5 detection results `lm_yolo_detection`.
- Download the OnePose-LowTexture dataset and store it under the directory `onepose_dataset`.
All datasets are organized under the `dataspace` directory, as shown below.
```
dataspace/
├── LINEMOD_Gen6D
├── bop_dataset/
│   ├── lm
│   └── lm_yolo_detection
├── onepose_dataset/
│   ├── scanned_model
│   └── lowtexture_test_data
└── README.md
```
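Before running the evaluations below, it can help to verify this layout is in place. The following sketch simply checks the paths from the tree above; nothing else is assumed.

```python
# Check that the dataset layout matches the tree above.
from pathlib import Path

root = Path("dataspace")
expected = [
    "LINEMOD_Gen6D",
    "bop_dataset/lm",
    "bop_dataset/lm_yolo_detection",
    "onepose_dataset/scanned_model",
    "onepose_dataset/lowtexture_test_data",
]
for rel in expected:
    status = "ok" if (root / rel).exists() else "MISSING"
    print(f"[{status}] {root / rel}")
```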
## Evaluation
Evaluation on the subset of LINEMOD (comparison with Gen6D, Cas6D, etc.):

```bash
python inference.py --dataset_name LINEMOD_SUBSET --database_dir LMSubSet_database --outpose_dir LMSubSet_pose
```

Evaluation on all objects of LINEMOD using the built-in detector:

```bash
python inference.py --dataset_name LINEMOD --database_dir LM_database --outpose_dir LM_pose
```

Evaluation on all objects of LINEMOD using the YOLOv5 detections (comparison with OnePose/OnePose++):

```bash
python inference.py --dataset_name LINEMOD --database_dir LM_database --outpose_dir LM_yolo_pose
```

Evaluation on the scanned objects of OnePose-LowTexture:

```bash
python inference.py --dataset_name LOWTEXTUREVideo --database_dir LTVideo_database --outpose_dir LTVideo_pose
```
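To reproduce all four results in one go, a small driver can loop over the exact argument triplets listed above. This is a convenience sketch, not part of the repository.

```python
# Run all four evaluation configurations sequentially.
# The argument triplets are copied from the commands above.
import subprocess

CONFIGS = [
    ("LINEMOD_SUBSET", "LMSubSet_database", "LMSubSet_pose"),
    ("LINEMOD", "LM_database", "LM_pose"),
    ("LINEMOD", "LM_database", "LM_yolo_pose"),
    ("LOWTEXTUREVideo", "LTVideo_database", "LTVideo_pose"),
]
for dataset, database, outpose in CONFIGS:
    subprocess.run(
        ["python", "inference.py",
         "--dataset_name", dataset,
         "--database_dir", database,
         "--outpose_dir", outpose],
        check=True,
    )
```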
## Training
We utilize a subset (`gso_1M`) of the MegaPose dataset for training. Please download `MegaPose/gso_1M` and `MegaPose/google_scanned_objects.zip` to the directory `dataspace`, and organize the data as
```
dataspace/
├── MegaPose/
│   ├── webdatasets/gso_1M
│   └── google_scanned_objects
...
```
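Before extraction, a quick path check (based on the layout above) can confirm both downloads are in place. Note that `google_scanned_objects.zip` is assumed to have been unpacked into the directory shown in the tree.

```python
# Confirm the MegaPose training data is where the extraction script
# expects it. Paths follow the tree above; the unzipped directory name
# for google_scanned_objects is an assumption.
from pathlib import Path

root = Path("dataspace/MegaPose")
for rel in ("webdatasets/gso_1M", "google_scanned_objects"):
    status = "ok" if (root / rel).exists() else "MISSING"
    print(f"[{status}] {root / rel}")
```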
Execute the following script under the `MegaPose` environment to prepare the training data.

```bash
python dataset/extract_megapose_to_BOP.py
```
Then, train the network via

```bash
python training/training.py
```
## Acknowledgement
- The code is partially based on DINOv2, 3D Gaussian Splatting, MegaPose, and SC6D.