Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners

Chun Feng*, Joy Hsu*, Weiyu Liu, Jiajun Wu

In Conference on Computer Vision and Pattern Recognition (CVPR) 2024

[paper] [project page]

Dataset

Our dataset download process follows the ReferIt3D benchmark.

Specifically, you will need to

Object Predictions

LARC uses votenet as its object detector to obtain object predictions. We pretrained votenet on the SR3D dataset. Our object predictions can be found at README-votenet_objects.md.

Setup

Run the following commands to install the necessary dependencies.

  conda create -n larc python=3.7.16
  conda activate larc
  pip install -r requirements.txt

Install Jacinle.

  git clone https://github.com/vacancy/Jacinle --recursive
  export PATH=<path_to_jacinle>/bin:$PATH
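
Because the later commands rely on `jac-run`, it may help to confirm that the shell resolves it after the PATH export. The following is a minimal sketch, assuming a POSIX-compatible shell; `check_tool` is a hypothetical helper name and is not part of Jacinle.

```shell
# check_tool is a hypothetical helper for illustration, not part of Jacinle.
# It returns 0 if the named command resolves on PATH and 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 (check the PATH export)" >&2
    return 1
  fi
}
```

Calling `check_tool jac-run` in the same shell session as the export should report whether the Jacinle `bin` directory was picked up.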

Install the referit3d python package from ReferIt3D.

  git clone https://github.com/referit3d/referit3d
  cd referit3d
  pip install -e .

Compile CUDA layers for PointNet++.

  cd models/scene_graph/point_net_pp/pointnet2
  python setup.py install

Evaluation

To evaluate LARC:


  scannet=<path_to/keep_all_points_with_global_scan_alignment.pkl>
  referit=<path_to/sr3d_train.csv>
  object=<path_to_object_predictions>
  load_path=<path_to/model_to_evaluate.pth>
  
  jac-run ns3d/trainval.py --desc ns3d/desc_ns3d.py --scannet-file $scannet --referit3D-file $referit --object_data_path $object --load $load_path --evaluate

Weights for our trained LARC model can be found at trained_larc.pth; set load_path to the location of that file.
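
Since the evaluation command depends on several user-supplied paths, a quick check before launching can catch a mistyped path early. This is a sketch under stated assumptions: `require_path` is a hypothetical helper, and it only tests existence and readability because the source does not say whether the object predictions path is a file or a directory.

```shell
# require_path is a hypothetical helper for illustration.
# $1: label used in the error message, $2: path to check.
# It returns 1 if the path does not exist or is not readable.
require_path() {
  if [ ! -e "$2" ] || [ ! -r "$2" ]; then
    echo "error: $1 does not exist or is unreadable: $2" >&2
    return 1
  fi
}
```

One possible use is to chain the checks ahead of the evaluation command, for example `require_path scannet "$scannet" && require_path referit "$referit" && require_path object "$object" && require_path load_path "$load_path"`, and run `jac-run` only if all of them succeed.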

Training

To train LARC:


  scannet=<path_to/keep_all_points_with_global_scan_alignment.pkl>
  referit=<path_to/sr3d_train.csv>
  object=<path_to_object_predictions>
  
  jac-run ns3d/trainval.py --desc ns3d/desc_ns3d.py --scannet-file $scannet --referit3D-file $referit --object_data_path $object --lr 0.001 --epochs 60 --save-interval 1 --validation-interval 1
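
To try other learning rates or epoch counts without retyping the full command, the invocation can be wrapped in a small function. The sketch below only prints the command for inspection; `train_cmd`, `LR`, and `EPOCHS` are hypothetical names introduced here, and the defaults are the values from the command above.

```shell
# train_cmd, LR, and EPOCHS are hypothetical names for this sketch.
# The function prints the training command; defaults match the README values.
# Expects scannet, referit, and object to be set as in the training section.
train_cmd() {
  echo jac-run ns3d/trainval.py --desc ns3d/desc_ns3d.py \
    --scannet-file "$scannet" --referit3D-file "$referit" \
    --object_data_path "$object" \
    --lr "${LR:-0.001}" --epochs "${EPOCHS:-60}" \
    --save-interval 1 --validation-interval 1
}
```

Running `train_cmd` previews the command with the defaults, and setting `LR` or `EPOCHS` beforehand changes only those two flags. Removing the `echo` turns the preview into the actual launch.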

Acknowledgements

Our codebase is built on top of NS3D and votenet. Please feel free to email me at fengchun3364@mail.ustc.edu.cn if any problems arise.