This is a PyTorch implementation of 3DRefTR, proposed in our paper "A Unified Framework for 3D Point Cloud Visual Grounding".

0. Installation

1. Quick visualization demo

We provide visualization via wandb for superpoints, kps points, bad-case analysis, and predicted/ground-truth masks and boxes. The following flags control what is visualized:

self.visualization_superpoint = False
self.visualization_pred = False
self.visualization_gt = False
self.bad_case_visualization = False
self.kps_points_visualization = False
self.bad_case_threshold = 0.15
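
As a rough illustration of how these flags might be used together, the sketch below mirrors them in a small config object. The class name `VisualizationConfig` and the helper `is_bad_case` are hypothetical and not part of this repository. In particular, treating a sample as a bad case when its IoU falls below `bad_case_threshold` is an assumption about what the threshold means.

```python
from dataclasses import dataclass


@dataclass
class VisualizationConfig:
    """Hypothetical container; field names and defaults mirror the flags above."""
    visualization_superpoint: bool = False
    visualization_pred: bool = False
    visualization_gt: bool = False
    bad_case_visualization: bool = False
    kps_points_visualization: bool = False
    bad_case_threshold: float = 0.15


def is_bad_case(iou: float, cfg: VisualizationConfig) -> bool:
    # ASSUMPTION: a sample is a "bad case" when its IoU is below the threshold.
    # The repository may define this differently.
    return cfg.bad_case_visualization and iou < cfg.bad_case_threshold


# Enable bad-case visualization and keep every other flag at its default.
cfg = VisualizationConfig(bad_case_visualization=True)
print(is_bad_case(0.10, cfg), is_bad_case(0.20, cfg))
```

With all flags left at `False`, `is_bad_case` never selects a sample in this sketch, which matches visualization being off by default.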

2. Data preparation

The final required files are as follows:

├── [DATA_ROOT]
│   ├── [1] train_v3scans.pkl # Packaged ScanNet training set
│   ├── [2] val_v3scans.pkl   # Packaged ScanNet validation set
│   ├── [3] ScanRefer/        # ScanRefer utterance data
│   │   ├── ScanRefer_filtered_train.json
│   │   ├── ScanRefer_filtered_val.json
│   │   └── ...
│   ├── [4] ReferIt3D/        # NR3D/SR3D utterance data
│   │   ├── nr3d.csv
│   │   ├── sr3d.csv
│   │   └── ...
│   ├── [5] group_free_pred_bboxes/  # detected boxes (optional)
│   ├── [6] gf_detector_l6o256.pth   # pointnet++ checkpoint (optional)
│   ├── [7] roberta-base/     # roberta pretrained language model
│   └── [8] checkpoints/      # 3dreftr pretrained models

ScanNetv2
├── data
│   ├── scannetv2
│   │   ├── scans
│   │   ├── scans_test
│   │   ├── train
│   │   ├── val
│   │   ├── test
│   │   ├── val_gt
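
Before training or evaluation, it can help to confirm that the `[DATA_ROOT]` layout above is complete. The following is a minimal sketch that only uses the file and directory names from the listing. The function name `missing_entries` and the `REQUIRED`/`OPTIONAL` lists are illustrative and not part of this repository.

```python
from pathlib import Path

# Entries copied from the [DATA_ROOT] listing above.
REQUIRED = [
    "train_v3scans.pkl",
    "val_v3scans.pkl",
    "ScanRefer/ScanRefer_filtered_train.json",
    "ScanRefer/ScanRefer_filtered_val.json",
    "ReferIt3D/nr3d.csv",
    "ReferIt3D/sr3d.csv",
    "roberta-base",
    "checkpoints",
]
# Entries marked "(optional)" in the listing.
OPTIONAL = [
    "group_free_pred_bboxes",
    "gf_detector_l6o256.pth",
]


def missing_entries(data_root, entries=REQUIRED):
    """Return the entries that do not exist under data_root."""
    root = Path(data_root)
    return [e for e in entries if not (root / e).exists()]


def report(data_root):
    """Print a short summary of missing required and optional entries."""
    for label, entries in (("required", REQUIRED), ("optional", OPTIONAL)):
        for e in missing_entries(data_root, entries):
            print(f"missing {label} entry: {e}")
```

Calling `report("/path/to/DATA_ROOT")` prints one line per missing entry and prints nothing when the layout is complete.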

3. Models

| Dataset/Model | REC mAP@0.25 | RES mIoU | Model |
|---|---|---|---|
| ScanRefer/3DRefTR-SP | 55.45 | 40.76 | GoogleDrive |
| ScanRefer/3DRefTR-SP (Single-Stage) | 54.43 | 40.23 | GoogleDrive |
| ScanRefer/3DRefTR-HR | 55.04 | 41.24 | GoogleDrive |
| ScanRefer/3DRefTR-HR (Single-Stage) | 54.40 | 40.75 | GoogleDrive |
| SR3D/3DRefTR-SP | 68.45 | 44.61 | GoogleDrive |
| NR3D/3DRefTR-SP | 52.55 | 36.17 | GoogleDrive |

4. Training

5. Evaluation

6. Acknowledgements

This repository builds on the code of EDA. We recommend using their code repository in your research and reading the related paper. We are also grateful to SPFormer, BUTD-DETR, GroupFree, ScanRefer, and SceneGraphParser.

7. Citation

If you find our work useful in your research, please consider citing:

@misc{lin2023unified,
      title={A Unified Framework for 3D Point Cloud Visual Grounding}, 
      author={Haojia Lin and Yongdong Luo and Xiawu Zheng and Lijiang Li and Fei Chao and Taisong Jin and Donghao Luo and Chengjie Wang and Yan Wang and Liujuan Cao},
      year={2023},
      eprint={2308.11887},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}