<div align="center">

Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

</div>

<p align="center"> <img src='./src/scene-flow/row-1/gif-1.gif' width="400"> <img src='./src/scene-flow/row-1/gif-2.gif' width="400"> </p>

This is the official repository of CMFlow, a cross-modal supervised approach for estimating 4D radar scene flow. For technical details, please refer to our CVPR 2023 paper:

Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision <br/> Fangqiang Ding, Andras Palffy, Dariu M. Gavrila, Chris Xiaoxuan Lu <br/> [arXiv] [demo] [page] [supp] [video]

<p align="left"> <img src='./src/openfig.png' width="500"> </p>

News

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Ding_2023_CVPR,
    author    = {Ding, Fangqiang and Palffy, Andras and Gavrila, Dariu M. and Lu, Chris Xiaoxuan},
    title     = {Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {9340-9349}
}

Getting Started

To find out how to run our scene flow experiments, please see the instructions in GETTING_STARTED. If you encounter any issues when running our code, please raise them in this repository.

Abstract

This work proposes a novel approach to 4D radar-based scene flow estimation via cross-modal learning. Our approach is motivated by the co-located sensing redundancy in modern autonomous vehicles. Such redundancy implicitly provides various forms of supervision cues to the radar scene flow estimation. Specifically, we introduce a multi-task model architecture for the identified cross-modal learning problem and propose loss functions to opportunistically engage scene flow estimation using multiple cross-modal constraints for effective model training. Extensive experiments show the state-of-the-art performance of our method and demonstrate the effectiveness of cross-modal supervised learning to infer more accurate 4D radar scene flow. We also show its usefulness for two subtasks: motion segmentation and ego-motion estimation.
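
To make the "opportunistic" supervision concrete, here is a minimal sketch (ours, not the paper's code) of how multiple cross-modal loss terms could be combined, where each term contributes only when its cue is available for the current sample. The modality names, callables, and weighting scheme are illustrative assumptions.

```python
import torch

def cross_modal_loss(pred_flow, available_cues, weights=None):
    """Sketch of opportunistic multi-task supervision (illustrative only).

    `available_cues` maps a modality name (e.g. 'odometry', 'lidar') to a
    callable that scores the predicted flow against that modality's cue;
    terms are simply skipped when a cue is missing for a given sample.
    """
    weights = weights or {}
    total = pred_flow.new_zeros(())
    for name, loss_fn in available_cues.items():
        total = total + weights.get(name, 1.0) * loss_fn(pred_flow)
    return total

# Example: only an odometry-style cue is available for this sample.
pred = torch.randn(256, 3, requires_grad=True)
loss = cross_modal_loss(pred, {"odometry": lambda f: f.norm(dim=-1).mean()})
loss.backward()
```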

Method

<p align="center"> <img src='./src/pipeline.jpg' width="840"> </p>
Figure 1. Cross-modal supervised learning pipeline for 4D radar scene flow estimation. Given two consecutive radar point clouds as the input, the model architecture, which is composed of two stages (blue/orange block colours for stage 1/2), outputs the final scene flow together with the motion segmentation and a rigid ego-motion transformation. Cross-modal supervision signals retrieved from co-located modalities are utilized to constrain outputs with various loss functions. This essentially leads to a multi-task learning problem.
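
For intuition on the input/output structure in Figure 1, below is a toy, self-contained PyTorch stand-in (not our actual architecture): two consecutive radar point clouds in; per-point scene flow, per-point motion segmentation, and pooled ego-motion parameters out. All layer choices here are placeholders.

```python
import torch
import torch.nn as nn

class PipelineInterfaceSketch(nn.Module):
    """Toy module mirroring Figure 1's interface; not the real CMFlow model."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(6, feat_dim), nn.ReLU())
        self.flow_head = nn.Linear(feat_dim, 3)  # per-point 3D scene flow
        self.seg_head = nn.Linear(feat_dim, 1)   # per-point moving/static logit
        self.ego_head = nn.Linear(feat_dim, 6)   # pooled SE(3) twist parameters

    def forward(self, pc1, pc2):
        # pc1, pc2: [N, 3] consecutive radar point clouds (paired per point
        # here only for simplicity; real models match points by correlation).
        feats = self.encoder(torch.cat([pc1, pc2], dim=-1))   # [N, feat_dim]
        flow = self.flow_head(feats)                          # [N, 3]
        moving_prob = torch.sigmoid(self.seg_head(feats))     # [N, 1]
        ego_motion = self.ego_head(feats).mean(dim=0)         # [6]
        return flow, moving_prob, ego_motion

pc1, pc2 = torch.randn(256, 3), torch.randn(256, 3)
flow, seg, ego = PipelineInterfaceSketch()(pc1, pc2)
```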

Qualitative results

Here are some GIFs showing our qualitative results on scene flow estimation and the two subtasks, motion segmentation and ego-motion estimation. For more qualitative results, please refer to our demo video or supplementary material.

Scene flow

<p align="center"> <img src='./src/scene-flow/gif-1.gif' width="840"> <img src='./src/scene-flow/gif-2.gif' width="840"> </p>

Subtask - Motion Segmentation

<p align="center"> <img src='./src/motion-seg/gif-1.gif' width="840"> <img src='./src/motion-seg/gif-2.gif' width="840"> </p>

Subtask - Ego-motion Estimation

<p align="center"> <img src='./src/ego-motion/gif-1.gif' width="600"> <img src='./src/ego-motion/gif-2.gif' width="600"> </p>

Demo Video

<p align="center"> <a href="https://youtu.be/PjKgznDizhI"><img src="./src/cover.png" width="80%"></a> </p>