SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention
Environment
Requirements
- CUDA: >=11.3
- Python: >=3.8
- PyTorch: >=1.12.0
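The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of the repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the CUDA check reads the version PyTorch was built against, which may differ from the locally installed toolkit.

```python
import re
import sys


def version_tuple(text):
    """Turn a version string such as '1.12.0+cu113' into (1, 12, 0)."""
    # Drop any local build suffix after '+', then keep the first three numbers.
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])][:3]
    # Pad short versions such as '11.3' so tuples compare element by element.
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def meets_minimum(found, required):
    """Return True when the found version is at least the required one."""
    return version_tuple(found) >= version_tuple(required)


def check_environment():
    """Return {component: bool} for the Python, PyTorch, and CUDA requirements."""
    report = {"python": sys.version_info[:2] >= (3, 8)}
    try:
        import torch
    except ImportError:
        # Without PyTorch neither of the remaining checks can pass.
        report["pytorch"] = False
        report["cuda"] = False
        return report
    report["pytorch"] = meets_minimum(torch.__version__, "1.12.0")
    # torch.version.cuda is the CUDA version of the build; None on CPU-only builds.
    cuda = torch.version.cuda
    report["cuda"] = cuda is not None and meets_minimum(cuda, "11.3")
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'does not meet the requirement'}")
```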
Installation
pip install h5py
pip install transformers
# pickle is part of the Python standard library and needs no installation
pip install tensorboardX
cd external_tools/pointnet2
python setup.py install
Data Preparation
ScanNet v2
Download the ScanNet V2 dataset.
Prepare the ScanNet data and package it into "scannet_00_views.pkl" or "scannet_0x_views.pkl":
cd data
python prepare_scannet_data.py --process-only-zero-view [1/0]
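After the preparation script finishes, it can help to confirm that the packaged file loads. The sketch below is an illustration under assumptions: `describe_pickle` is a hypothetical helper, the internal layout of the `.pkl` file is not documented here, and `pickle.load` returns only the first object stored in the file. If the file stores instances of the repository's own classes, run it from the repository root so those classes can be imported. Only unpickle files you produced yourself or otherwise trust.

```python
import pickle
from pathlib import Path


def describe_pickle(path):
    """Load the first object in a pickle file and summarize it."""
    path = Path(path)
    with path.open("rb") as handle:
        obj = pickle.load(handle)
    summary = {"type": type(obj).__name__, "size_bytes": path.stat().st_size}
    # Containers such as lists and dicts also report how many items they hold.
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    return summary
```

For example, `describe_pickle("scannet_00_views.pkl")` called from the `data` directory would report the top-level type and file size, assuming the script wrote its output there.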
Pretrained Model
Download the BERT files from Hugging Face or from our drive.
Download the first encoder checkpoint "ckpt_cls40.pth" from our drive.
Evaluation
Download the SeCG models "ckpt_nr3d.pth" and "ckpt_sr3d.pth" from our drive and put them into "./checkpoints".
# nr3d
python evaluation.py \
    --scannet-file ./scannet/scannet_00_views.pkl \
    --refer_test_file ./data/referit3d/nr3d_test.csv \
    --weight ./checkpoints/ckpt_nr3d.pth \
    --bert-pretrain-path /pretrained/bert
# sr3d (quote the ';'-separated list so the shell does not split the command)
python evaluation.py \
    --scannet-file "./scannet/scannet_00_views.pkl;./scannet/scannet_0x_views.pkl" \
    --refer_test_file ./data/referit3d/sr3d_test.csv \
    --weight ./checkpoints/ckpt_sr3d.pth \
    --bert-pretrain-path /pretrained/bert
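The evaluation commands can also be launched from Python, which avoids shell quoting because each argument is passed as its own list element. This is a sketch rather than a script shipped with the repository: `EVAL_SETTINGS`, `build_eval_command`, and `run_evaluation` are hypothetical names, while the flags and paths are copied from the commands above.

```python
import subprocess
import sys

# Paths copied from the evaluation commands; adjust them to your own layout.
EVAL_SETTINGS = {
    "nr3d": {
        "scannet_file": "./scannet/scannet_00_views.pkl",
        "refer_test_file": "./data/referit3d/nr3d_test.csv",
        "weight": "./checkpoints/ckpt_nr3d.pth",
    },
    "sr3d": {
        "scannet_file": (
            "./scannet/scannet_00_views.pkl;./scannet/scannet_0x_views.pkl"
        ),
        "refer_test_file": "./data/referit3d/sr3d_test.csv",
        "weight": "./checkpoints/ckpt_sr3d.pth",
    },
}


def build_eval_command(dataset, bert_path="/pretrained/bert"):
    """Return the argument list for evaluation.py on 'nr3d' or 'sr3d'."""
    settings = EVAL_SETTINGS[dataset]
    return [
        sys.executable, "evaluation.py",
        "--scannet-file", settings["scannet_file"],
        "--refer_test_file", settings["refer_test_file"],
        "--weight", settings["weight"],
        "--bert-pretrain-path", bert_path,
    ]


def run_evaluation(dataset):
    """Run the evaluation and raise CalledProcessError if it fails."""
    # No shell is involved, so the ';' in the sr3d file list is passed verbatim.
    subprocess.run(build_eval_command(dataset), check=True)
```

Calling `run_evaluation("sr3d")` from the repository root would start the Sr3D evaluation with the same arguments as the shell command.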
Training
# nr3d
python train.py \
    --scannet-file ./scannet/scannet_00_views.pkl \
    --refer_train_file ./data/referit3d/nr3d_train.csv \
    --refer_val_file ./data/referit3d/nr3d_test.csv \
    --pn-path ./pretrained/ckpt_cls40.pth \
    --n-workers 8 \
    --batch-size 36 \
    --bert-pretrain-path /pretrained/bert
# sr3d (quote the ';'-separated list so the shell does not split the command)
python train.py \
    --scannet-file "./scannet/scannet_00_views.pkl;./scannet/scannet_0x_views.pkl" \
    --refer_train_file ./data/referit3d/sr3d_train.csv \
    --refer_val_file ./data/referit3d/sr3d_test.csv \
    --pn-path ./pretrained/ckpt_cls40.pth \
    --n-workers 8 \
    --batch-size 36 \
    --bert-pretrain-path /pretrained/bert
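Training takes several input files, so a quick pre-flight check can catch a missing path before a run starts. The sketch below is illustrative only: `split_paths`, `missing_inputs`, and `SR3D_TRAIN_INPUTS` are hypothetical names, and treating `;` as a separator between ScanNet files is an assumption inferred from the sr3d command above, not a documented behavior of `train.py`.

```python
from pathlib import Path

# Path-valued arguments copied from the sr3d training command.
SR3D_TRAIN_INPUTS = {
    "--scannet-file": (
        "./scannet/scannet_00_views.pkl;./scannet/scannet_0x_views.pkl"
    ),
    "--refer_train_file": "./data/referit3d/sr3d_train.csv",
    "--refer_val_file": "./data/referit3d/sr3d_test.csv",
    "--pn-path": "./pretrained/ckpt_cls40.pth",
    "--bert-pretrain-path": "/pretrained/bert",
}


def split_paths(value):
    """Split a ';'-separated argument into its individual paths."""
    return [part for part in value.split(";") if part]


def missing_inputs(arguments):
    """Return (flag, path) pairs for every path that does not exist."""
    missing = []
    for flag, value in arguments.items():
        for part in split_paths(value):
            if not Path(part).exists():
                missing.append((flag, part))
    return missing
```

An empty result from `missing_inputs(SR3D_TRAIN_INPUTS)`, run from the repository root, means every listed file or directory was found.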
Acknowledgment
Our code builds on the following codebases. We thank the authors for their excellent work.
referit3d, ScanRefer, MVT-3DVG, VQA_ReGAT