CamLiFlow & CamLiRAFT

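This is the official PyTorch implementation of our two papers:

- Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion (arXiv preprint arXiv:2303.12017, 2023)
- CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation (CVPR 2022)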

Chinese-language explainer: https://zhuanlan.zhihu.com/p/616384758

Changes to the Conference Paper

In this extended version, we instantiate a new variant of the bidirectional fusion pipeline, CamLiRAFT, which is based on recurrent all-pairs field transforms (RAFT). CamLiRAFT significantly outperforms the original PWC-based CamLiFlow and sets a new state of the art on various datasets.

Pretrained Weights

| Model | Training set | Weights | Comments |
|---|---|---|---|
| CamLiRAFT | Things (80e) | camliraft_things80e.pt | Best generalization performance |
| CamLiRAFT | Things (150e) | camliraft_things150e.pt | Best performance on Things |
| CamLiRAFT | Things (150e) -> KITTI (800e) | camliraft_things150e_kitti800e.pt | Best performance on KITTI |
| CamLiRAFT-L | Things-Occ (100e) | camliraft_l_best_things_occ.pt | Best performance on Things-Occ |
| CamLiRAFT-L | Things-Occ (100e) | camliraft_l_best_kitti_occ.pt | Best generalization performance on KITTI-Occ |
| CamLiRAFT-L | Things-Noc (100e) | camliraft_l_best_things_noc.pt | Best performance on Things-Noc |
| CamLiRAFT-L | Things-Noc (100e) | camliraft_l_best_kitti_noc.pt | Best generalization performance on KITTI-Noc |

Things-Occ means "occluded FlyingThings3D" and Things-Noc means "non-occluded FlyingThings3D". Same for KITTI-Occ and KITTI-Noc.

Precomputed Results

Here, we provide precomputed results for the submission to the online benchmark of KITTI Scene Flow. * denotes refining the background scene flow with rigid priors.

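| Model | D1-all | D2-all | Fl-all | SF-all | Link |
|---|---|---|---|---|---|
| CamLiFlow | 1.81% | 3.19% | 4.05% | 5.62% | camliflow-wo-refine.zip |
| CamLiFlow * | 1.81% | 2.95% | 3.10% | 4.43% | camliflow.zip |
| CamLiRAFT | 1.81% | 3.02% | 3.43% | 4.97% | camliraft-wo-refine.zip |
| CamLiRAFT * | 1.81% | 2.94% | 2.96% | 4.26% | camliraft.zip |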

Environment

Create a PyTorch environment using conda:

conda create -n camliraft python=3.7
conda activate camliraft
conda install pytorch==1.10.2 torchvision==0.11.3 cudatoolkit=11.3 -c pytorch

Install mmcv and mmdet:

pip install openmim
mim install mmcv-full==1.4.0
mim install mmdet==2.14.0

Install other dependencies:

pip install opencv-python open3d tensorboard hydra-core==1.1.0
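
After installing, it can help to confirm that the pinned versions are the ones actually active in the environment. The following is a minimal sketch, not part of the repository: the distribution names (for example `torch` for the conda `pytorch` package) are assumptions, and the version numbers are copied from the commands above.

```python
# Sketch: compare installed package versions against the pins in this README.
try:
    from importlib import metadata  # available from Python 3.8
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    import importlib_metadata as metadata

# Distribution names are assumptions; versions come from the install commands.
PINNED = {
    "torch": "1.10.2",
    "torchvision": "0.11.3",
    "mmcv-full": "1.4.0",
    "mmdet": "2.14.0",
    "hydra-core": "1.1.0",
}


def find_mismatches(pinned=PINNED):
    """Return {name: installed_version_or_None} for packages that do not match."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu113" when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches().items():
        print(f"{name}: expected {PINNED[name]}, found {installed}")
```

An empty output means every listed package matches its pin.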

Compile CUDA extensions for faster training and evaluation:

cd models/csrc
python setup.py build_ext --inplace

Download the ResNet-50 pretrained on ImageNet-1k:

wget https://download.pytorch.org/models/resnet50-11ad3fa6.pth
mkdir pretrain
mv resnet50-11ad3fa6.pth pretrain/

NG-RANSAC is also required if you want to evaluate on KITTI. Please follow https://github.com/vislearn/ngransac to install the library.

Demo

Run the following script to launch a demo that estimates optical flow and scene flow from a pair of images and point clouds:

python demo.py --model camliraft --weights /path/to/camliraft/checkpoint.pt

Note that CamLiRAFT is not very robust to distant objects, as the network has only been trained on data with depths of less than 35 m. If you get poor results on your own data, try scaling the depth of the point cloud to a range of 5 to 35 m.
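
One possible way to apply this rescaling is sketched below. It is an illustration rather than code from the repository: the function name, the choice of the 95th percentile, and the assumption that depth lies along the z axis of camera coordinates are all hypothetical. A uniform scale leaves the pinhole projection unchanged (x/z and y/z are preserved), so the points stay aligned with the image. The predicted scene flow then comes out in scaled units and would need to be divided by the same factor.

```python
import numpy as np


def scale_to_depth_range(points, max_depth=35.0, percentile=95.0):
    """Uniformly scale an (N, 3) point cloud so a far depth percentile maps to max_depth.

    Assumes camera coordinates with depth along the z axis (column 2).
    Returns the scaled points and the scale factor that was applied.
    """
    depth = points[:, 2]
    reference = np.percentile(depth[depth > 0], percentile)
    # Only shrink scenes that reach beyond max_depth; closer scenes are left as-is.
    scale = min(1.0, max_depth / reference)
    return points * scale, scale


# Toy example: random points with depths between 10 m and 100 m.
rng = np.random.default_rng(0)
pc1 = rng.uniform([-20, -5, 10], [20, 5, 100], size=(1000, 3))
pc1_scaled, s = scale_to_depth_range(pc1)

# Use the same factor s for the second point cloud of the pair,
# and divide the predicted scene flow by s to return to metric units.
```

The optical flow output is unaffected, because the image-plane projection of every point is the same before and after scaling.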

Evaluate CamLiFlow and CamLiRAFT

FlyingThings3D

First, download and preprocess the dataset (see preprocess_flyingthings3d_subset.py for detailed instructions):

python preprocess_flyingthings3d_subset.py --input_dir /mnt/data/flyingthings3d_subset

Then, download the pretrained weights camliraft_things150e.pt and save it to checkpoints/camliraft_things150e.pt.

Now you can reproduce the results in Table 2 (see the extended paper):

python eval_things.py testset=flyingthings3d_subset model=camliraft ckpt.path=checkpoints/camliraft_things150e.pt

KITTI

First, download the following parts:

Unzip them and organize the directory as follows:

datasets/kitti_scene_flow
├── testing
│   ├── calib_cam_to_cam
│   ├── calib_imu_to_velo
│   ├── calib_velo_to_cam
│   ├── disp_ganet
│   ├── flow_occ
│   ├── image_2
│   ├── image_3
│   └── semantic_ddr
└── training
    ├── calib_cam_to_cam
    ├── calib_imu_to_velo
    ├── calib_velo_to_cam
    ├── disp_ganet
    ├── disp_occ_0
    ├── disp_occ_1
    ├── flow_occ
    ├── image_2
    ├── image_3
    ├── obj_map
    └── semantic_ddr

Then, download the pretrained weights camliraft_things150e_kitti800e.pt and save it to checkpoints/camliraft_things150e_kitti800e.pt.
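
Before running the evaluation, the directory layout above can be checked with a few lines of Python. This is a minimal sketch under the assumption that the dataset root is `datasets/kitti_scene_flow`; the helper name `missing_dirs` is hypothetical, and the sub-directory names are copied from the tree above.

```python
from pathlib import Path

# Sub-directories copied from the directory tree in this README.
EXPECTED = {
    "testing": [
        "calib_cam_to_cam", "calib_imu_to_velo", "calib_velo_to_cam",
        "disp_ganet", "flow_occ", "image_2", "image_3", "semantic_ddr",
    ],
    "training": [
        "calib_cam_to_cam", "calib_imu_to_velo", "calib_velo_to_cam",
        "disp_ganet", "disp_occ_0", "disp_occ_1", "flow_occ",
        "image_2", "image_3", "obj_map", "semantic_ddr",
    ],
}


def missing_dirs(root, expected=EXPECTED):
    """Return the 'split/name' entries that are not directories under root."""
    root = Path(root)
    return [
        f"{split}/{name}"
        for split, names in expected.items()
        for name in names
        if not (root / split / name).is_dir()
    ]


if __name__ == "__main__":
    for entry in missing_dirs("datasets/kitti_scene_flow"):
        print("missing:", entry)
```

The same helper can be reused for other datasets by passing a different `expected` mapping.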

To reproduce the results without leveraging rigid-body assumptions (SF-all: 4.97%):

python kitti_submission.py testset=kitti model=camliraft ckpt.path=checkpoints/camliraft_things150e_kitti800e.pt

To reproduce the results with rigid background refinement (SF-all: 4.26%), you need to further refine the background scene flow:

python refine_background.py

Results are saved to submission/testing. The initial non-rigid estimations are indicated by the _initial suffix.

Sintel

First, download the flow dataset from http://sintel.is.tue.mpg.de and the depth dataset from https://sintel-depth.csail.mit.edu/landing.

Unzip them and organize the directory as follows:

datasets/sintel
├── depth
│   ├── README_depth.txt
│   ├── sdk
│   └── training
└── flow
    ├── bundler
    ├── flow_code
    ├── README.txt
    ├── test
    └── training

Then, download the pretrained weights camliraft_things80e.pt and save it to checkpoints/camliraft_things80e.pt.

Now you can reproduce the results in Table 4 (see the extended paper):

python eval_sintel.py testset=sintel model=camliraft ckpt.path=checkpoints/camliraft_things80e.pt

Evaluate CamLiRAFT-L

FlyingThings3D

There are two different ways of preprocessing the data. The first setting, proposed by HPLFlowNet, keeps only non-occluded points during preprocessing. The second setting, proposed by FlowNet3D, retains the occluded points.

# Non-occluded
python eval_things_noc_sf.py testset=flyingthings3d_subset_hpl model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_things_noc.pt
# Occluded
python eval_things_occ_sf.py testset=flyingthings3d_subset_flownet3d model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_things_occ.pt
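
The difference between the two settings can be pictured with a toy example. The sketch below only illustrates the distinction described above; it is not the preprocessing code of HPLFlowNet, FlowNet3D, or this repository, and the function name and the boolean occlusion mask are hypothetical.

```python
import numpy as np


def select_points(points, flow, occluded, keep_occluded):
    """Return the points and flow vectors used under one of the two settings.

    points: (N, 3), flow: (N, 3), occluded: (N,) boolean mask (True = occluded).
    """
    if keep_occluded:
        # Occluded setting: every point stays in the input.
        return points, flow
    # Non-occluded setting: occluded points are dropped.
    visible = ~occluded
    return points[visible], flow[visible]


points = np.arange(12, dtype=np.float32).reshape(4, 3)
flow = np.ones((4, 3), dtype=np.float32)
occluded = np.array([False, True, False, True])

noc_points, _ = select_points(points, flow, occluded, keep_occluded=False)
occ_points, _ = select_points(points, flow, occluded, keep_occluded=True)
print(noc_points.shape, occ_points.shape)  # (2, 3) (4, 3)
```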

KITTI

As with FlyingThings3D, there are two different ways of preprocessing the data. We report results for both settings.

# Non-occluded
python eval_kitti_noc_sf.py testset=kitti model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_kitti_noc.pt
# Occluded
python eval_kitti_occ_sf.py testset=kitti model=camliraft_l ckpt.path=checkpoints/camliraft_l_best_kitti_occ.pt

Training

FlyingThings3D

You need to preprocess the FlyingThings3D dataset before training (see preprocess_flyingthings3d_subset.py for detailed instructions).

Train CamLiRAFT on FlyingThings3D (150 epochs):

python train.py trainset=flyingthings3d_subset valset=flyingthings3d_subset model=camliraft

The entire training process takes about 3 days on 4x RTX 3090 GPUs.

KITTI

Finetune the model on KITTI using the weights trained on FlyingThings3D:

python train.py trainset=kitti valset=kitti model=camliraft ckpt.path=checkpoints/camliraft_things150e.pt

The entire training process takes about 0.5 days on 4x RTX 3090 GPUs. We use the last checkpoint (800th) to generate the submission.

Citation

If you find our work useful in your research, please cite:

@article{liu2023learning,
  title   = {Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion},
  author  = {Haisong Liu and Tao Lu and Yihui Xu and Jia Liu and Limin Wang},
  journal = {arXiv preprint arXiv:2303.12017},
  year    = {2023}
}

@inproceedings{liu2022camliflow,
  title     = {CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation},
  author    = {Liu, Haisong and Lu, Tao and Xu, Yihui and Liu, Jia and Li, Wenjie and Chen, Lijun},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {5791--5801},
  year      = {2022}
}