SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
2024/11/18 16:17: Updated the model and demo data download links through HuggingFace. Personally, I found that wget from the HuggingFace link is much faster than from Zenodo.
2024/09/26 16:24: All code has been uploaded and tested. You can try training directly by downloading the demo data (through HuggingFace/Zenodo), or download the pretrained weights for evaluation.
Pre-trained weights for the models are available via the Zenodo/HuggingFace links. Check usage in 2. Evaluation or 3. Visualization.
Task: Self-Supervised Scene Flow Estimation in Autonomous Driving. No human labels needed. Real-time inference (15-20 Hz on an RTX 3090).
We directly follow the code structure of our previous work, so you may want to start with the easier supervised-learning setup first: try DeFlow. After that, the self-supervised training here will feel simple. Here is a quick view of the scripts in this repo:
- dataprocess/extract_*.py: pre-process data before training to speed up the overall training time. [Datasets included now: Argoverse 2 and Waymo. More on the way: nuScenes, custom data.]
- process.py: process the data, saving dufomap results and cluster labels inside the file. Only needed once before training.
- train.py: train the model and save model checkpoints. Please remember to check the config.
- eval.py: evaluate the model on the validation/test set, and also output the zip file to upload to the online leaderboard.
- save.py: save results into an .h5 file; use tools/visualization.py to show the results in an interactive window.
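For a first look at what the pre-processed files contain, here is a minimal sketch that walks one of the unified .h5 scene files with h5py and prints every dataset it finds. The directory path and the *.h5 naming are assumptions taken from the commands later in this README; the actual group/key layout is whatever the script printed, so adapt names accordingly.

```python
# Sketch: inspect a processed scene file produced by dataprocess/extract_*.py.
# The directory path and *.h5 file naming are assumptions for illustration;
# the printout shows the real group/dataset layout of your files.
from pathlib import Path
import h5py

data_dir = Path("/home/kin/data/av2/preprocess_v2/sensor/train")
scene_file = next(data_dir.glob("*.h5"))  # pick the first scene file found

with h5py.File(scene_file, "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    f.visititems(show)  # walk the whole hierarchy and print every dataset
```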
- SeFlow (Ours): ECCV 2024
- DeFlow (Ours): ICRA 2024
- FastFlow3d: RA-L 2021
- ZeroFlow: ICLR 2024; their pre-trained weights can easily be converted into our format through the provided script (see the conversion sketch after this list).
- NSFP: NeurIPS 2021; 3x faster than the original version thanks to our CUDA speedup, with the same (slightly better) performance. Coding done, public after review.
- FastNSF: ICCV 2023. Coding done, public after review.
- ... more on the way
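The ZeroFlow weight conversion mentioned above is handled by a script in this repo; conceptually it boils down to remapping state-dict keys, roughly as in the hedged sketch below. The key prefixes ("model." to "net.") and file names are purely illustrative assumptions, not the repo's actual mapping.

```python
# Conceptual sketch of converting an external checkpoint into another format by
# remapping state-dict keys. The prefix names and file names are illustrative only;
# see the repo's own conversion script for the real mapping.
import torch

src = torch.load("zeroflow_weights.ckpt", map_location="cpu")
state = src.get("state_dict", src)  # Lightning checkpoints nest weights under "state_dict"

converted = {k.replace("model.", "net.", 1): v for k, v in state.items()}
torch.save({"state_dict": converted}, "zeroflow_converted.ckpt")
```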
💡 Want to learn how to add your own network in this structure? Check the Contribute section to learn more about the code. Feel free to open a pull request!
0. Setup
Environment: same as DeFlow, and even lighter here since we extracted the mmcv modules we need into the CUDA assets.
git clone --recursive https://github.com/KTH-RPL/SeFlow.git
cd SeFlow && mamba env create -f environment.yaml
CUDA packages (the nvcc compiler needs to be installed); compile time is around 1-5 minutes:
mamba activate seflow
# CUDA is already installed in the Python environment. I also tested other versions like 11.3, 11.4, 11.7, 11.8; all work.
cd assets/cuda/mmcv && python ./setup.py install && cd ../../..
cd assets/cuda/chamfer3D && python ./setup.py install && cd ../../..
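After compiling, a quick sanity check that PyTorch sees the GPU and the freshly built extensions import can save a failed training run later. The module names below are guesses based on the folder names under assets/cuda/, so adjust them if an import fails.

```python
# Quick post-install sanity check. The module names are assumptions based on the
# asset folder names; adapt them to whatever setup.py actually installed.
import torch

print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU only")

for mod in ("chamfer3D", "mmcv"):  # assumed install names
    try:
        __import__(mod)
        print(f"{mod}: OK")
    except ImportError as err:
        print(f"{mod}: not importable ({err})")
```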
Or you can always choose Docker, which gives you an isolated environment and frees you from installation; you can pull it as shown below. If you have a different architecture, please build it yourself (cd SeFlow && docker build -t zhangkin/seflow .) by going through the build-docker-image section.
# option 1: pull from docker hub
docker pull zhangkin/seflow
# run container
docker run -it --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name seflow zhangkin/seflow /bin/zsh
1. Run & Train
Note: Preparing the raw data and processing the training data only needs to be done once for the task. There is no need to repeat the data processing steps unless you delete all the data. We use wandb to log the training process, and you may want to change all entity="kth-rpl" entries to your own entity.
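If you want to double-check your own entity works before launching a long training run, the standard wandb API lets you start a throwaway run as in the sketch below; the project and run names here are placeholders, not necessarily what the repo configs use.

```python
# Sketch: verify that logging goes to your own wandb entity instead of "kth-rpl".
# The project/run names are placeholders for illustration.
import wandb

run = wandb.init(entity="your-entity", project="seflow", name="entity-check")
run.log({"debug/step": 0})
run.finish()
```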
Data Preparation
Check dataprocess/README.md for tips on downloading the raw Argoverse 2 dataset. Or, if you want a mini processed dataset to try the code quickly, we directly provide one scene inside train and val. It is already converted to .h5 format and processed with the label data.
You can download it from Zenodo/HuggingFace and extract it to the data folder. Then you can skip the following steps and directly run the training script.
wget https://huggingface.co/kin-zhang/OpenSceneFlow/resolve/main/demo_data.zip
unzip demo_data.zip -d /home/kin/data/av2
Prepare raw data
Check dataprocess/README.md for more information (steps for downloading the raw data, storage size, number of frames, etc.). Extract all data to the unified .h5 format.
[Runtime: normally around 45 minutes in total to run the following commands with the setup mentioned in our paper.]
python dataprocess/extract_av2.py --av2_type sensor --data_mode train --argo_dir /home/kin/data/av2 --output_dir /home/kin/data/av2/preprocess_v2
python dataprocess/extract_av2.py --av2_type sensor --data_mode val --mask_dir /home/kin/data/av2/3d_scene_flow
python dataprocess/extract_av2.py --av2_type sensor --data_mode test --mask_dir /home/kin/data/av2/3d_scene_flow
Process train data
Process the training data for self-supervised learning. Only the training data needs this step. [Runtime: normally around 15 hours on my desktop, or 3 hours on a cluster with five nodes running in parallel.]
python process.py --data_dir /home/kin/data/av2/preprocess_v2/sensor/train --scene_range 0,701
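If you parallelize this step across cluster nodes as mentioned above, one simple approach is to split the 0,701 scene range into contiguous chunks and launch one process.py command per node. The helper below only prints the commands (assuming --scene_range takes a start,end pair exactly as in the command above); how you launch them (sbatch, ssh, etc.) depends on your cluster.

```python
# Sketch: split the scene range 0..701 into N contiguous chunks, one per node, and
# print the corresponding process.py command for each. Launching is left to you.
N_NODES = 5
TOTAL = 701
chunk = (TOTAL + N_NODES - 1) // N_NODES  # ceiling division

for i in range(N_NODES):
    start, end = i * chunk, min((i + 1) * chunk, TOTAL)
    print(
        "python process.py --data_dir /home/kin/data/av2/preprocess_v2/sensor/train "
        f"--scene_range {start},{end}"
    )
```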
Train the model
Training SeFlow requires specifying the loss function; below we set the config of our best model on the leaderboard. [Runtime: around 11 hours on 4x A100 GPUs.]
python train.py model=deflow lr=2e-4 epochs=9 batch_size=16 loss_fn=seflowLoss "add_seloss={chamfer_dis: 1.0, static_flow_loss: 1.0, dynamic_chamfer_dis: 1.0, cluster_based_pc0pc1: 1.0}" "model.target.num_iters=2" "model.val_monitor=val/Dynamic/Mean"
Or you can directly download the pre-trained weights from Zenodo/HuggingFace and skip the training step.
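Conceptually, the add_seloss weights in the training command above combine the four self-supervised terms into a single weighted objective, as in the sketch below. The term functions are placeholders here, not the repo's seflowLoss implementation; see the code for the real definitions.

```python
# Conceptual sketch of how the add_seloss weights combine the self-supervised terms
# into one training loss. The individual terms are placeholders; the real definitions
# live in the repo's seflowLoss.
import torch

weights = {
    "chamfer_dis": 1.0,
    "static_flow_loss": 1.0,
    "dynamic_chamfer_dis": 1.0,
    "cluster_based_pc0pc1": 1.0,
}

def total_loss(terms: dict[str, torch.Tensor]) -> torch.Tensor:
    # `terms` maps each loss name to its scalar tensor value for the current batch.
    return sum(weights[name] * value for name, value in terms.items())
```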
Other Benchmark Models
You can also train the supervised baseline models from our paper with the following commands. [Runtime: around 10 hours on 4x A100 GPUs.]
python train.py model=fastflow3d lr=4e-5 epochs=20 batch_size=16 loss_fn=ff3dLoss
python train.py model=deflow lr=2e-4 epochs=20 batch_size=16 loss_fn=deflowLoss
[!NOTE]
You may notice that the settings here differ from the paper: for all methods we enlarge the learning rate to 2e-4 and decrease the number of epochs to 20 for faster convergence and better performance. However, we kept lr=2e-6 and 50 epochs in the (SeFlow & DeFlow) paper experiments for a fair comparison with ZeroFlow, where we directly use their provided weights. We suggest that later researchers and users adopt the settings here (larger learning rate and fewer epochs) for faster convergence and better performance.
2. Evaluation
You can view the Wandb dashboard for the training and evaluation results, or upload the result to the online leaderboard.
Since we save all hyper-parameters and model checkpoints during training, the only thing you need to do is specify the checkpoint path. Remember to also set the data path correctly.
# download the pre-trained weights, or train them yourself
wget https://huggingface.co/kin-zhang/OpenSceneFlow/resolve/main/seflow_best.ckpt
# this will directly print all metrics
python eval.py checkpoint=/home/kin/seflow_best.ckpt av2_mode=val
# this will output av2_submit.zip or av2_submit_v2.zip for you to submit to the leaderboard
python eval.py checkpoint=/home/kin/seflow_best.ckpt av2_mode=test leaderboard_version=1
python eval.py checkpoint=/home/kin/seflow_best.ckpt av2_mode=test leaderboard_version=2
The terminal will then output the command for submitting the result to the online leaderboard. You can follow this section for evalai.
Check all the detailed result files (presented in Table 1 of our paper) in this discussion.
3. Visualization
We also provide a script to visualize the model's results. You can specify the checkpoint path and the data path to visualize them; the step is quite similar to evaluation.
python save.py checkpoint=/home/kin/seflow_best.ckpt dataset_path=/home/kin/data/av2/preprocess_v2/sensor/vis
# The output of the above command will look like:
Model: DeFlow, Checkpoint from: /home/kin/model_zoo/v2/seflow_best.ckpt
We already write the flow_est into the dataset, please run the following command to visualize the flow. Copy and paste it to your terminal:
python tools/visualization.py --res_name 'seflow_best' --data_dir /home/kin/data/av2/preprocess_v2/sensor/vis
Enjoy! ^v^
# Then run the command in the terminal:
python tools/visualization.py --res_name 'seflow_best' --data_dir /home/kin/data/av2/preprocess_v2/sensor/vis
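If you want to post-process the saved flow yourself instead of (or in addition to) the interactive viewer, the estimates written by save.py live in the same per-scene .h5 files under the chosen res_name. The sketch below is a hedged example of reading them back; the per-frame group layout and key names are assumptions, so inspect the file first (e.g. with the h5py walk shown earlier) and adapt the names.

```python
# Sketch: read back the flow saved by save.py under res_name='seflow_best'.
# The group layout and key names are assumptions; <scene_id> is a placeholder.
import h5py

scene_path = "/home/kin/data/av2/preprocess_v2/sensor/vis/<scene_id>.h5"

with h5py.File(scene_path, "r") as f:
    for key in list(f.keys())[:3]:                      # first few top-level entries
        grp = f[key]
        if isinstance(grp, h5py.Group) and "seflow_best" in grp:
            flow = grp["seflow_best"][...]              # assumed per-point flow array
            print(key, flow.shape)
```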
https://github.com/user-attachments/assets/f031d1a2-2d2f-4947-a01f-834ed1c146e6
Cite & Acknowledgements
@inproceedings{zhang2024seflow,
author={Zhang, Qingwen and Yang, Yi and Li, Peizheng and Andersson, Olov and Jensfelt, Patric},
title={{SeFlow}: A Self-Supervised Scene Flow Method in Autonomous Driving},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024},
pages={353-369},
organization={Springer},
doi={10.1007/978-3-031-73232-4_20},
}
@inproceedings{zhang2024deflow,
author={Zhang, Qingwen and Yang, Yi and Fang, Heng and Geng, Ruoyu and Jensfelt, Patric},
booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
title={{DeFlow}: Decoder of Scene Flow Network in Autonomous Driving},
year={2024},
pages={2105-2111},
doi={10.1109/ICRA57147.2024.10610278}
}
Thanks to RPL member Li Ling for helping revise our SeFlow manuscript. Thanks to Kyle Vedder, who kindly open-sourced his code (ZeroFlow), including pre-trained weights, and discussed their results with us, which helped this work a lot.
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation and Prosense (2020-02963) funded by Vinnova. The computations were enabled by the supercomputing resource Berzelius provided by National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg Foundation, Sweden.
❤️: DeFlow, BucketedSceneFlowEval