
WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language

Project Page

This repository hosts the WildRefer dataset and the official implementation of WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language.

Dataset

Our dataset can be downloaded here.

We strongly recommend using our pre-processed HuCenLife and STCrowd data, which can be downloaded here.

How to use this code

Data Preparation

Please arrange the dataset in the following folder structure:

./
├── data/
│   ├── liferefer_test.json
│   ├── liferefer_train.json
│   ├── strefer_test.json
│   └── strefer_train.json
└── src/
    ├── LifeRefer.zip
    └── STRefer.zip

Unzip our processed data

cd src
unzip LifeRefer.zip
unzip STRefer.zip
cd ..
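Before unzipping or training, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch, not part of the repository: `missing_files` and `EXPECTED_FILES` are hypothetical names, and the list contains only the paths shown in the folder structure above (the contents of the extracted archives are not assumed).

```python
from pathlib import Path

# Files named in the folder structure above; nothing beyond that list is assumed.
EXPECTED_FILES = [
    "data/liferefer_test.json",
    "data/liferefer_train.json",
    "data/strefer_test.json",
    "data/strefer_train.json",
    "src/LifeRefer.zip",
    "src/STRefer.zip",
]


def missing_files(root="."):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are present.")
```

Run it from the repository root; an empty result means every listed file was found.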

Environment Installation

Our environment is based on Python 3.8 and CUDA 11.3. You can set it up with conda:

conda create -n wildrefer_env python=3.8 -y
conda activate wildrefer_env
conda install conda-forge::cudatoolkit-dev=11.3 -y
pip install torch==1.11.0 torchvision==0.12.0 --index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
python -m spacy download en_core_web_sm
cd pointnet2
python setup.py install
cd ..
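After installation, a quick sanity check can show whether the active environment matches the versions above. This is a minimal sketch under stated assumptions: `check_environment` is a hypothetical helper that is not shipped with the repository, and it only reads standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`).

```python
import importlib.util
import sys


def check_environment():
    """Collect basic facts about the active environment (hypothetical helper)."""
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda_build": None,
        "cuda_available": None,
    }
    # Import torch only if it is installed, so the check also runs in a bare environment.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        report["cuda_build"] = torch.version.cuda  # CUDA version torch was built against
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

With the commands above, one would expect Python 3.8, a torch version beginning with 1.11.0, and a CUDA build of 11.3. If `cuda_available` is `False`, the GPU driver or the installed wheel is the first thing to check.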

Test

Our pretrained weights can be downloaded here. Place them in the weights/ folder:

./
└── weights/
    ├── liferefer_weights.pth
    └── strefer_weights.pth

STRefer

python test.py --dataset strefer --pretrain weights/strefer_weights.pth --max_lang_num 50 --frame_num 2 --batch_size 36 

LifeRefer

python test.py --dataset liferefer --pretrain weights/liferefer_weights.pth --frame_num 2 --batch_size 32
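To evaluate both datasets in one go, the two commands above can be assembled from a small settings table. The sketch below is illustrative only: `TEST_SETTINGS` and `build_test_command` are hypothetical names, and the flags and values are copied from the commands above without adding any others. It prints each command rather than running it, and skips a dataset whose checkpoint file is missing.

```python
import shlex
from pathlib import Path

# Per-dataset settings copied from the test commands above.
TEST_SETTINGS = {
    "strefer": {
        "pretrain": "weights/strefer_weights.pth",
        "max_lang_num": 50,
        "frame_num": 2,
        "batch_size": 36,
    },
    "liferefer": {
        "pretrain": "weights/liferefer_weights.pth",
        "frame_num": 2,
        "batch_size": 32,
    },
}


def build_test_command(dataset):
    """Return the test.py argument list for `dataset` (hypothetical helper)."""
    args = ["python", "test.py", "--dataset", dataset]
    for flag, value in TEST_SETTINGS[dataset].items():
        args += ["--" + flag, str(value)]
    return args


if __name__ == "__main__":
    for name, settings in TEST_SETTINGS.items():
        if not Path(settings["pretrain"]).is_file():
            print(f"[skip] {name}: checkpoint {settings['pretrain']} not found")
            continue
        print(shlex.join(build_test_command(name)))
```

The returned list can be passed directly to `subprocess.run` if you prefer to launch the evaluation from Python.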

Train

STRefer

python train.py --dataset strefer --max_lang_num 50

LifeRefer

python train.py --dataset liferefer --max_lang_num 100

License

All datasets are published under the Creative Commons Attribution-NonCommercial-ShareAlike license. This means that you must attribute the work in the manner specified by the authors, you may not use this work for commercial purposes, and if you alter, transform, or build upon this work, you may distribute the resulting work only under the same license.