Separable Convolutions for Optimizing 3D Stereo Networks

This repo contains the code for "Separable Convolutions for Optimizing 3D Stereo Networks" paper (IEEE ICIP 2021) by Rafia Rahim, Faranak Shamsafar and Andreas Zell. [arXiv] [project] [code] [poster]

<div align="center"> <img src="images/fwsc.png" alt="drawing" width="37%"/> <img src="images/fdwsc.png" alt="drawing" width="50%"/> <p> <b> FwSC (left) and FDwSC (right) </b></p> </div>

Introduction

In this work we empirically show that 3D convolutions in stereo networks act as a major bottleneck. We propose a set of "plug-&-run" separable convolutions to reduce the computational load of 3D convolutions in stereo networks.

Example

How to use: simply plug in the provided convolution operators as replacements for 3D convolutions. The example below shows their use, followed by the detailed computational costs.

from conv_libs.separable_convolutions import FwSC, FDwSC

in_channel = 32
out_channel = 4
kernel_size = 3

fwsc = FwSC(in_channels=in_channel, out_channels=out_channel, kernel_size=kernel_size)

fdwsc = FDwSC(in_channels=in_channel, out_channels=out_channel, kernel_size=kernel_size)

For the following sample input:

in_channel = 32
out_channel = 4
input_res = (in_channel, 48, 240, 528)
kernel_size = (3, 3, 3)

the computational complexity results are:

For 3D convolution: flops=21.02 GMac and params=3.46 k

For feature-wise separable convolution (FwSC): flops=6.03 GMac and params=992, i.e. 3.49x fewer flops than 3D conv

For feature- and disparity-wise separable convolution (FDwSC): flops=3.11 GMac and params=512, i.e. 6.76x fewer flops than 3D conv

For more details please refer to this file.
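The figures above can be cross-checked with plain arithmetic. The sketch below is not the repository's measurement code; it assumes bias-free layers, stride 1 with "same" padding, and the layer decompositions listed in the comments (a depthwise plus pointwise split for FwSC, and an additional split of the depthwise kernel into 1x3x3 and 3x1x1 parts for FDwSC). The names `conv_params`, `layers`, and `complexity` are illustrative only.

```python
# Assumed layer decompositions (not taken from the repository):
#   3D conv : one dense 3x3x3 convolution
#   FwSC    : depthwise 3x3x3 conv + pointwise 1x1x1 conv
#   FDwSC   : depthwise 1x3x3 conv + depthwise 3x1x1 conv + pointwise 1x1x1 conv
# All layers are assumed bias-free, with stride 1 and "same" padding,
# so every layer output keeps the input's D x H x W size.

def conv_params(in_ch, out_ch, kernel, groups=1):
    """Weight count of a bias-free convolution."""
    kd, kh, kw = kernel
    return (in_ch // groups) * out_ch * kd * kh * kw

in_ch, out_ch = 32, 4
positions = 48 * 240 * 528  # output voxels per channel

# Each entry: (in_channels, out_channels, kernel, groups)
layers = {
    "3D conv": [(in_ch, out_ch, (3, 3, 3), 1)],
    "FwSC": [(in_ch, in_ch, (3, 3, 3), in_ch),
             (in_ch, out_ch, (1, 1, 1), 1)],
    "FDwSC": [(in_ch, in_ch, (1, 3, 3), in_ch),
              (in_ch, in_ch, (3, 1, 1), in_ch),
              (in_ch, out_ch, (1, 1, 1), 1)],
}

def complexity(name):
    """Return (params, MACs) for one of the configurations above."""
    params = sum(conv_params(*spec) for spec in layers[name])
    # With identical output sizes, each weight is used once per output voxel.
    macs = params * positions
    return params, macs

base_macs = complexity("3D conv")[1]
for name in layers:
    params, macs = complexity(name)
    print(f"{name}: params={params}, GMac={macs / 1e9:.2f}, "
          f"reduction={base_macs / macs:.2f}x")
```

Under these assumptions the parameter counts (3456, 992, 512) and GMac values (21.02, 6.03, 3.11) match the reported numbers. The unrounded reduction ratios come out at about 3.48x and 6.75x; the reported 3.49x and 6.76x appear to result from dividing the rounded GMac values.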

Experimentation Setup

1. Conda environment setup:

You can use the YAML files provided in the networks sub-folders to set up the corresponding conda environments.

For GANet, use this yml file and run the following commands:

conda env create -f env_separable_convs_GANet.yml
conda activate ganet

To work with PSMNet, use this file and set up the environment as follows:

conda env create -f env_separable_convs_PSMNet.yml
conda activate psmnet

2. Dataset Preparation:

Our dataset preparation code is adapted from the baseline methods, GANet and PSMNet. For the expected folder hierarchy of the datasets, please refer to DATA.md.

3. Train:

Training and fine-tuning scripts for the separable-convolution-based networks can be found in the following files:

For example, to train the GANet11 model with FwSC, we use the following script:

model=GANet11 #GANet11 or GANet_deep
conv_type=FwSC

cmd="train.py --data_path=/data2/rahim/data/
        --crop_height=240
        --crop_width=528
        --model=$model
        --convolution_type=$conv_type
        --nEpochs=15
        --training_list=./lists/sceneflow_train.list
        --val_list=./lists/sceneflow_val.list
        --max_disp=192
        --kitti2015=0"
python $cmd

4. Evaluate:

logdirs=FwSC_GANet11_kitti2015
resume=FwSC_GANet11_sceneflow_finetuned_kitti15.pth
cmd="evaluate.py --crop_height=384
            --crop_width=1248
            --max_disp=192
            --data_path=/data/rahim/data/Kitti_2015/training/
            --test_list=lists/kitti2015_val.list
            --save_path=./evaluation-results/${logdirs}
            --kitti2015=1
            --kitti=0
            --resume=./checkpoint/FwSC/$resume
            --model=GANet11
            --max_test_images=10
            --convolution_type=FwSC"
echo $cmd
python -W ignore $cmd >> ./logs/evaluate/${logdirs}.txt

Full scripts for evaluation can be found here:

5. Predict:

The prediction script generates the results to upload to the KITTI benchmark for evaluation.

Pre-trained Models

We provide pre-trained models in different configurations. Please download the pre-trained models and place them in the folder ./checkpoint/FwSC/ or ./checkpoint/FDwSC/, according to their convolution type.

GANet

<table>
<thead>
<tr> <th>Convolution Type</th> <th>Sceneflow Models</th> <th>Fine-tuned Models (kitti2015)</th> </tr>
</thead>
<tbody>
<tr> <td rowspan=2>FwSC</td> <td><a href="https://drive.google.com/file/d/1WctFUDCzs0IWHkNFMi3QGwgBpBTaFVZZ/view?usp=sharing">FwSC_GANet11_sceneflow</a></td> <td><a href="https://drive.google.com/file/d/1WDpWLP2G5Z4YZPcXVphJArOpThWtub_L/view?usp=sharing">FwSC_GANet11_kitti2015</a></td> </tr>
<tr> <td><a href="https://drive.google.com/file/d/16srWDodZCJJT1mvvvN-YEhVlV0JJd8St/view?usp=sharing">FwSC_GANetdeep_sceneflow</a></td> <td><a href="https://drive.google.com/file/d/1S_-cMtqJRmUQdT4cgdrO-Mvo7f-zXrmb/view?usp=sharing">FwSC_GANetdeep_kitti2015</a></td> </tr>
<tr> <td rowspan=2>FDwSC</td> <td><a href="https://drive.google.com/file/d/1od-m9cSsM7dlyd7l7G_Zc5o3aiC7ExEe/view?usp=sharing">FDwSC_GANet11_sceneflow</a></td> <td><a href="https://drive.google.com/file/d/1CehEcmeYZ17okh1_IRY1Lz4Xjp606soe/view?usp=sharing">FDwSC_GANet11_kitti2015</a></td> </tr>
<tr> <td><a href="https://drive.google.com/file/d/1GqNlfEVIHxR0aK0Ki0dvbBh9QABsYna-/view?usp=sharing">FDwSC_GANetdeep_sceneflow</a></td> <td><a href="https://drive.google.com/file/d/17y3Fe-ICsUHGco0fKDrO1UlYjeZnbvR3/view?usp=sharing">FDwSC_GANetdeep_kitti2015</a></td> </tr>
</tbody>
</table>

PSMNet

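<table>
<thead>
<tr> <th>Convolution Type</th> <th>Sceneflow Models</th> <th>Fine-tuned Models (kitti2015)</th> </tr>
</thead>
<tbody>
<tr> <td>FwSC</td> <td>FwSC_PSMNet_sceneflow</td> <td>FwSC_PSMNet_kitti2015</td> </tr>
<tr> <td>FDwSC</td> <td>FDwSC_PSMNet_sceneflow</td> <td>FDwSC_PSMNet_kitti2015</td> </tr>
</tbody>
</table>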
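As a small convenience, the placement rule for downloaded checkpoints can be sketched as a helper that picks the target folder from the file name prefix. This is only an illustration: `checkpoint_path` is a hypothetical function, not part of the repository, and it assumes the downloaded files keep the model names from the tables with a `.pth` extension.

```python
from pathlib import Path

def checkpoint_path(filename, root="./checkpoint"):
    """Return root/<FwSC|FDwSC>/filename, chosen from the file name prefix.

    Hypothetical helper; assumes file names start with the convolution type.
    """
    for conv_type in ("FDwSC", "FwSC"):
        if filename.startswith(conv_type + "_"):
            return Path(root) / conv_type / filename
    raise ValueError(f"Cannot infer convolution type from {filename!r}")

# Example with an assumed file name based on the tables above.
print(checkpoint_path("FwSC_GANet11_kitti2015.pth"))
```

The returned path can then be passed to the `--resume` argument of the evaluation script.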

Results

<p align="center"> <img src="images/Comparison_v3.png" alt="drawing" width="100%"/> </p>
<p align="center"> <img src="images/table.png" alt="drawing" width="80%"/> </p>
<p align="center"> <img src="images/qualitative-results-v2.png" alt="drawing" width="100%"/> </p>
<p align="center"> <b> KITTI2015 results (left) and Sceneflow results (right) </b></p>

Credits

This code is based on GANet and PSMNet. Special thanks to the authors of DenseMatchingBenchmark for providing the evaluation and visualization code. We also thank the authors of ptflop counter for the computational complexity code.

Reference

If you find the code useful, please cite our paper:

@inproceedings{rahim2021separable,
    title={Separable Convolutions for Optimizing 3D Stereo Networks},
    author={Rahim, Rafia and Shamsafar, Faranak and Zell, Andreas},
    booktitle={2021 IEEE International Conference on Image Processing (ICIP)},
    pages={3208--3212},
    year={2021},
    organization={IEEE}
}