Home

Awesome

IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation (ICCV2023)

This is the official implementation of IST-Net. IST-Net is a clean, simple, and prior-free category-level pose estimator.

<p align="center"> IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation (ICCV2023) <br /> <a href="https://arxiv.org/abs/2303.13479">[Paper]</a>, <a href="https://sites.google.com/view/cvmi-ist-net/">[Project Page]</a> <br /> <a href="https://sites.google.com/view/jianhuiliu/home">Jianhui Liu</a>, <a href="https://yukangchen.com/">Yukang Chen</a>, <a href="https://shuluoshu.github.io/">Xiaoqing Ye</a>, <a href="https://xjqi.github.io/">Xiaojuan Qi</a> </p> <p align="center"> <img src="docs/main_fig.png" width="100%"> </p>

Getting startted

Prepare the environment

conda create -n istnet python=3.6
conda activate istnet
# The code is tested on pytorch1.10 & CUDA11.3, please choose the properate vesion of torch
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
# Dependent packages
pip install gorilla-core==0.2.5.3
pip install gpustat==1.0.0
pip install opencv-python-headless
pip install matplotlib
pip install scipy

Compiling

# Clone this repo
git clone https://github.com/CVMI-Lab/IST-Net.git
# Compile pointnet2
cd model/pointnet2
python setup.py install

Prepare the datasets

Following DPDN, please Download the folloing data NOCS (camera_train, camera_test, camera_composed_depths, real_train, real_test, ground truths, and mesh models) and segmentation results (Link). For camera_train.pkl, camera_val.pkl, real_test.pkl, real_rain.pkl, please refer to this Link. Then unzip them in data folder and arange them as follows:

data
├── CAMERA
│   ├── train
│   └── val
├── camera_full_depths
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── gts
│   ├── val
│   └── real_test
├── obj_models
│   ├── train
│   ├── val
│   ├── real_train
│   ├── real_test
│   ├── camera_train.pkl
│   ├── camera_val.pkl
│   ├── real_train.pkl
│   └── real_test.pkl
├── segmentation_results
    ├── train_trainedwoMask
    ├── test_trainedwoMask
    └── test_trainedwithMask

Date processing

python data_processing.py

Training from scartch

# gpus refers to the ids of gpu. For single gpu, please set it as 0
python train.py --gpus 0,1 --config config/ist_net_default.yaml

Training in seperate manner

If you want to achieve a higher result, we recommand you to train IST-Net in two phase. Phase 1, train the world-space enhancer(WE). Phase 2, freeze the world-space enhancer and train other component from scartch.

# Phase 1
python train.py --gpus 0,1 --config config/posenet_gt_default.yaml
# Phase 2, modify the [world_enhancer_path] in yaml file with the model weights saved in phase 1
python train.py --gpus 0,1 --config config/ist_net_freeze_world_enhancer.yaml

Evaluation

python test.py --config config/ist_net_default.yaml

Experimental results

If you want to get the same results reported in our paper. You can download the weights below, and modify the test_path in the yaml file. (Eg. the weights are stored in /.../log/test_istnet/epoch_30.pth. Then fill the test_path in yaml file with /.../log/test_istnet/)

IoU50IoU755 degree 2 cm5 degree 5 cm10 degree 2 cm10 degree 5 cm10 degree 10 cmPre-trained
IST-Net82.576.647.553.472.180.582.6Weights
IST-Net freeze WE83.879.247.554.670.380.282.3Weights

Citation

If you find this project useful in your research, please consider citing:

@article{liu2023prior,
  title={Prior-free Category-level Pose Estimation with Implicit Space Transformation},
  author={Liu, Jianhui and Chen, Yukang and Ye, Xiaoqing and Qi, Xiaojuan},
  journal={arXiv preprint arXiv:2303.13479},
  year={2023}
}

Acknowledgement