A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision

This repository is for our paper "A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision", Lanxiao Li and Michael Heizmann, ECCV 2022 (Arxiv).

Requirements

Hardware

The training scripts with _ddp in their names expect distributed training on multiple GPUs. Some examples for single-GPU training are also provided.

Software

The repo is tested under Ubuntu 18.04 and 20.04. The CUDA toolkit (tested with 10.2 and 11.1) and GCC are needed to compile some extensions. The following Python packages are also required:

pytorch     # tested with 1.8; other versions should work as well
torchvision
open3d
matplotlib
scipy
pybind11
opencv-python
pillow
MinkowskiEngine=0.5.4

To install MinkowskiEngine, please follow the official repo.
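Before compiling the extensions, it can help to confirm that the packages listed above are visible to the Python interpreter. The following is a minimal sketch, not part of this repo. The import names (e.g. `cv2` for opencv-python, `PIL` for pillow, `torch` for pytorch) are assumptions based on how these packages are usually imported, and `find_missing` is a hypothetical helper name.

```python
import importlib.util

# Import names (not install names) for the packages listed above.
# Assumption: these are the usual import names, e.g. opencv-python -> cv2.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "open3d",
    "matplotlib",
    "scipy",
    "pybind11",
    "cv2",
    "PIL",
    "MinkowskiEngine",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only verifies that each package can be located. It does not verify versions or that the CUDA build matches your toolkit.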

Preparation

Extensions

To compile the C++ extensions, go to cpp_ext/fps and cpp/knn and run

bash build.sh

in each folder.

To compile the CUDA extension (PointNet++), go to model/pointnet2 and run

python setup.py install

Data

To prepare the pre-training data:

Usage

Pretraining

To pretrain a PointNet++ and a depth map based CNN (DPCo), use

python train_dp_moco_ddp.py \
--lr 0.03 \
--save log/DPCo \
--batch-size 64 \
--cos \
--local \
--moco \
--worker 8 \
--epochs 120 \
--dist-url 'tcp://localhost:10001' \
--multiprocessing-distributed \
--world-size 1 \
--rank 0

Our training was done on a single node with 2 NVIDIA Tesla V100 GPUs. You might have to update some parameters (e.g. worker, batch-size, world-size) according to your own hardware. Code for a single GPU without DDP is also provided in train_ddp_moco.py, but we only use this version for debugging.

Similarly, to pretrain a sparse 3D CNN and a depth map based CNN (DVCo with color), use

export OMP_NUM_THREADS=12 # make MinkowskiEngine happy
python train_dv_ddp.py \
--lr 0.03 \
--save log/DVCo \
--batch-size 64 \
--cos \
--moco \
--local \
--worker 8 \
--epochs 120 \
--dist-url 'tcp://localhost:10001' \
--multiprocessing-distributed \
--world-size 1 \
--rank 0
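When adapting batch-size, worker and world-size to other hardware, it helps to know how node-level values translate into per-GPU values. The sketch below is not taken from this repo. It assumes the scripts follow the convention of the official PyTorch ImageNet example, where --world-size counts nodes, and --batch-size and the worker count are totals per node that get divided among the GPUs of that node. Please check the training scripts to confirm this before relying on it. `per_gpu_settings` is a hypothetical helper name.

```python
import math


def per_gpu_settings(batch_size, workers, gpus_per_node, nodes=1):
    """Split node-level settings across the GPUs of one node.

    Assumption: mirrors the PyTorch ImageNet example, where the batch size
    and data-loading workers given on the command line are per-node totals.
    """
    if gpus_per_node < 1 or nodes < 1:
        raise ValueError("gpus_per_node and nodes must be at least 1")
    return {
        # Total number of processes, one per GPU across all nodes.
        "world_size": gpus_per_node * nodes,
        # Samples each GPU processes per step (rounded down).
        "batch_size_per_gpu": batch_size // gpus_per_node,
        # Data-loading workers per process (rounded up).
        "workers_per_gpu": math.ceil(workers / gpus_per_node),
    }


if __name__ == "__main__":
    # The setup described above: one node, 2 GPUs, --batch-size 64, --worker 8.
    print(per_gpu_settings(batch_size=64, workers=8, gpus_per_node=2))
```

Under this assumption, the commands above give each of the 2 GPUs a batch of 32 samples and 4 workers.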

Finetuning

For finetuning on the 3D object detection task, please follow the README.md in downstream.

Note

We are still cleaning up our internal code base and testing this public repo. Updates will follow.

Known Issues

Citation

If you find this repo helpful, please consider citing our work:

@inproceedings{li2022invar3d,
    author = {Li, Lanxiao and Heizmann, Michael},
    title = {A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision},
    booktitle = {ECCV},
    year = {2022}
}

Acknowledgement

This repo contains code modified from the following repos. We thank the authors for their amazing code bases. Please consider starring and citing their work as well.