<br /> <p align="center"> <h2 align="center">Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders</h2> <p align="center"> <strong>Jie Cheng</strong><sup>1</sup>&nbsp;&nbsp;&nbsp; <strong>Xiaodong Mei</strong><sup>1</sup>&nbsp;&nbsp;&nbsp; <strong>Ming Liu</strong><sup>1,2</sup>&nbsp;&nbsp;&nbsp; <br /> <strong>HKUST</strong><sup>1</sup>&nbsp;&nbsp;&nbsp; <strong>HKUST(GZ)</strong><sup>2</sup>&nbsp;&nbsp;&nbsp; </p> <p align="center"> <a href="https://iccv2023.thecvf.com/"> <img src="https://img.shields.io/badge/ICCV-2023-blue?style=flat"> </a> <a href='https://arxiv.org/pdf/2308.09882.pdf' style='padding-left: 0.5rem;'> <img src='https://img.shields.io/badge/arXiv-PDF-red?style=flat&logo=arXiv&logoColor=white' alt='arXiv PDF'> </a> <a href='https://hkustconnect-my.sharepoint.com/:b:/g/personal/jchengai_connect_ust_hk/ERCySXLweLFDgv7Ejouf-lgB1_cq4K1spcnS-bkSL2OxPA?e=DwINqh' style='padding-left: 0.5rem;'> <img src='https://img.shields.io/badge/Poster-royalblue?style=flat&logo=Shotcut&logoColor=white' alt='Poster'> </a> </p> <p align="center"> <img src="misc/framework.png" align="center" width="100%"> </p>

## Highlight

## Getting Started

### Setup Environment

1. Clone this repository:

```bash
git clone https://github.com/jchengai/forecast-mae.git
cd forecast-mae
```

2. Set up the conda environment:

```bash
conda create -n forecast_mae python=3.8
conda activate forecast_mae
sh ./scripts/setup.sh
```

3. Download the Argoverse 2 Motion Forecasting Dataset; the expected data structure is:

```
data_root
    ├── train
    │   ├── 0000b0f9-99f9-4a1f-a231-5be9e4c523f7
    │   ├── 0000b6ab-e100-4f6b-aee8-b520b57c0530
    │   ├── ...
    ├── val
    │   ├── 00010486-9a07-48ae-b493-cf4545855937
    │   ├── 00062a32-8d6d-4449-9948-6fedac67bfcd
    │   ├── ...
    ├── test
    │   ├── 0000b329-f890-4c2b-93f2-7e2413d4ca5b
    │   ├── 0008c251-e9b0-4708-b762-b15cb6effc27
    │   ├── ...
```
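As a quick sanity check before preprocessing, you can count the scenario folders under each split. This is not part of the repo; `check_layout` is a hypothetical helper:

```python
# Hypothetical sanity check: count scenario folders under each expected split.
from pathlib import Path

def check_layout(data_root: str) -> dict:
    """Return the number of scenario folders found under train/val/test."""
    root = Path(data_root)
    counts = {}
    for split in ("train", "val", "test"):
        split_dir = root / split
        # Each scenario is a folder named by its UUID.
        counts[split] = (
            sum(1 for p in split_dir.iterdir() if p.is_dir())
            if split_dir.is_dir() else 0
        )
    return counts
```

If a split reports zero scenarios, the `data_root` you pass to `preprocess.py` is likely pointing one level too high or too low.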

### Preprocess

(Recommended) By default, we use Ray with 16 CPU cores for preprocessing, which takes about 30 minutes.

Single-agent:

```bash
python3 preprocess.py --data_root=/path/to/data_root -p
```

Multi-agent:

```bash
python3 preprocess.py --data_root=/path/to/data_root -m -p
```

Alternatively, you can disable parallel preprocessing by removing `-p`.
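The driver pattern behind the `-p` flag is, roughly, "map a per-scenario function over all scenario folders, either serially or across workers." A minimal sketch of that pattern (hypothetical; the actual repo uses Ray for parallelism, replaced here by a standard-library thread pool):

```python
# Sketch of a parallel preprocessing driver (stand-in for the Ray-based one).
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def process_scenario(scenario_dir: str) -> str:
    # Placeholder for the real work: load the scenario's trajectory/map files
    # and save extracted features to disk. Here we just return the folder name.
    return Path(scenario_dir).name

def preprocess(data_root: str, split: str = "train",
               workers: int = 16, parallel: bool = True) -> list:
    dirs = sorted(str(p) for p in (Path(data_root) / split).iterdir() if p.is_dir())
    if not parallel:  # the equivalent of omitting -p
        return [process_scenario(d) for d in dirs]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_scenario, dirs))
```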

## Training

1. Pre-training + fine-tuning (single-agent)

Phase 1 - pre-training:

```bash
python3 train.py data_root=/path/to/data_root model=model_mae gpus=4 batch_size=32
```

Phase 2 - fine-tuning:

(Note that the quotes in `'pretrained_weights="/path/to/pretrain_ckpt"'` are necessary.)

```bash
python3 train.py data_root=/path/to/data_root model=model_forecast gpus=4 batch_size=32 monitor=val_minFDE 'pretrained_weights="/path/to/pretrain_ckpt"'
```
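Conceptually, loading `pretrained_weights` amounts to a non-strict state-dict load: backbone keys shared by the MAE checkpoint and the forecasting model are copied over, while keys unique to either side (the new forecasting heads, the discarded MAE decoder) are skipped instead of raising an error. A framework-free illustration of that merge, with plain dicts and a hypothetical `load_non_strict` helper:

```python
# Illustration of a non-strict checkpoint load using plain dicts.
def load_non_strict(model_state: dict, ckpt_state: dict):
    # Copy only the keys the model actually has.
    loaded = {k: v for k, v in ckpt_state.items() if k in model_state}
    missing = sorted(set(model_state) - set(ckpt_state))     # e.g. new forecasting heads
    unexpected = sorted(set(ckpt_state) - set(model_state))  # e.g. the MAE decoder
    model_state.update(loaded)
    return missing, unexpected
```

The `missing` keys keep their random initialization and are learned during fine-tuning.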

2. Training from scratch (single-agent)

```bash
python3 train.py data_root=/path/to/data_root model=model_forecast gpus=4 batch_size=32 monitor=val_minFDE
```

3. Multi-agent motion forecasting

We also provide a simple multi-agent motion forecasting baseline built on Forecast-MAE's backbone.

```bash
python train.py data_root=/path/to/data_root model=model_forecast_multiagent gpus=4 batch_size=32 monitor=val_AvgMinFDE
```

## Evaluation

### Single-agent

Evaluate on the validation set:

```bash
python3 eval.py model=model_forecast data_root=/path/to/data_root batch_size=64 'checkpoint="/path/to/checkpoint"'
```

Generate a submission file for the AV2 single-agent motion forecasting benchmark:

```bash
python3 eval.py model=model_forecast data_root=/path/to/data_root batch_size=64 'checkpoint="/path/to/checkpoint"' test=true
```

### Multi-agent

Evaluate on the validation set:

```bash
python3 eval.py model=model_forecast_multiagent data_root=/path/to/data_root batch_size=64 'checkpoint="/path/to/checkpoint"'
```

Generate a submission file for the AV2 multi-agent motion forecasting benchmark:

```bash
python3 eval.py model=model_forecast_multiagent data_root=/path/to/data_root batch_size=64 'checkpoint="/path/to/checkpoint"' test=true
```

## Results and checkpoints

MAE-pretrained_weights: download.

A visualization notebook of the MAE reconstruction results can be found here.

For this repository, the expected performance on the Argoverse 2 validation set is:

### Single-agent

| Models | minADE1 | minFDE1 | minADE6 | minFDE6 | MR6 |
| --- | --- | --- | --- | --- | --- |
| Forecast-MAE (scratch) | 1.802 | 4.529 | 0.7214 | 1.430 | 0.187 |
| Forecast-MAE (fine-tune) | 1.744 | 4.376 | 0.7117 | 1.408 | 0.178 |

### Multi-agent

| Models | AvgMinADE6 | AvgMinFDE6 | ActorMR6 |
| --- | --- | --- | --- |
| Multiagent-Baseline | 0.717 | 1.64 | 0.194 |
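For reference, the metrics in these tables can be sketched as follows. This is a hypothetical helper, not the repo's implementation: `preds` holds K candidate trajectories of shape (K, T, 2), `gt` is the ground truth of shape (T, 2), and, following the Argoverse convention, minADE/minFDE are taken from the mode with the best endpoint, with a miss being an endpoint error above 2.0 m:

```python
# Sketch of minADE / minFDE / miss-rate computation for K-mode forecasts.
import numpy as np

def min_ade_fde_mr(preds: np.ndarray, gt: np.ndarray, miss_thresh: float = 2.0):
    dists = np.linalg.norm(preds - gt[None], axis=-1)  # (K, T) pointwise errors
    fde = dists[:, -1]                                 # endpoint error per mode
    best = int(fde.argmin())                           # mode with the best endpoint
    min_fde = float(fde[best])
    min_ade = float(dists[best].mean())                # ADE of that same mode
    return min_ade, min_fde, float(min_fde > miss_thresh)
```

The multi-agent variants (AvgMinADE, AvgMinFDE, ActorMR) average these quantities over the scored agents in a scene.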

You can download the checkpoints via the corresponding links.

## Qualitative Results

demo

## Acknowledgements

This repo benefits from MAE, Point-BERT, Point-MAE, NATTEN and HiVT. Thanks for their great work.

## Citation

If you find this repository useful, please consider citing our work:

```bibtex
@inproceedings{cheng2023forecast,
  title={{Forecast-MAE}: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders},
  author={Cheng, Jie and Mei, Xiaodong and Liu, Ming},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}
```