Doe-1: Closed-Loop Autonomous Driving with Large World Model
Paper | Project Page | Code
Check out our Large Driving Model Series!
Wenzhao Zheng* $\dagger$, Zetian Xia*, Yuanhui Huang, Sicheng Zuo, Jie Zhou, Jiwen Lu
* Equal contribution $\dagger$ Project leader
Doe-1 is the first closed-loop autonomous driving model for unified perception, prediction, and planning.
News
- [2024/12/13] Evaluation code released.
- [2024/12/13] Paper released on arXiv.
- [2024/12/13] Demo released.
Demo
Doe-1 is a unified model that performs visual question answering, future prediction, and motion planning.
Overview
We formulate autonomous driving as a unified next-token generation problem and use observation, description, and action tokens to represent each scene. Without additional fine-tuning, Doe-1 accomplishes various tasks by using different input prompts, including visual question-answering, controlled image generation, and end-to-end motion planning.
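To make the prompting idea concrete, here is a minimal sketch of how one interleaved token sequence could serve several tasks. The `Segment` type, the `TASKS` table, and the `build_prompt` helper are hypothetical names used only for illustration, and the mapping from each task to its prompt and target token types is an assumption, not the repository's actual interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

OBS, DESC, ACT = "observation", "description", "action"


@dataclass(frozen=True)
class Segment:
    """One typed span of tokens describing part of a scene."""
    kind: str
    tokens: Tuple[int, ...]


# Assumed mapping: (segment kinds given as the prompt, segment kind to generate).
TASKS: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "visual_question_answering": ((OBS,), DESC),
    "controlled_image_generation": ((OBS, ACT), OBS),
    "motion_planning": ((OBS, DESC), ACT),
}


def build_prompt(task: str, scene: List[Segment]) -> Tuple[List[int], str]:
    """Concatenate the segments a task conditions on and name the kind to generate."""
    prompt_kinds, target_kind = TASKS[task]
    prompt: List[int] = []
    for kind in prompt_kinds:
        # Take the first segment of the requested kind from the scene.
        segment = next(s for s in scene if s.kind == kind)
        prompt.extend(segment.tokens)
    return prompt, target_kind


# Toy scene with made-up token ids.
scene = [Segment(OBS, (11, 12, 13)), Segment(DESC, (21, 22)), Segment(ACT, (31,))]
planning_prompt, planning_target = build_prompt("motion_planning", scene)
```

The point of the sketch is that the model weights stay fixed while only the composition of the prompt changes from task to task.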
Closed-Loop Autonomous Driving
We explore a new closed-loop autonomous driving paradigm that combines an end-to-end model with a world model to form a closed loop.
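The loop can be sketched as follows: a planner chooses an action from the current observation, a world model predicts the next observation from that action, and the prediction is fed back to the planner. The function `closed_loop_rollout` and the toy `plan` and `predict_next` callables are hypothetical stand-ins under these assumptions, not the project's implementation.

```python
from typing import Callable, List, Tuple

Observation = Tuple[int, ...]
Action = Tuple[int, ...]


def closed_loop_rollout(
    plan: Callable[[Observation], Action],
    predict_next: Callable[[Observation, Action], Observation],
    first_observation: Observation,
    steps: int,
) -> List[Tuple[Observation, Action]]:
    """Alternate planning and world-model prediction for a fixed number of steps."""
    trajectory: List[Tuple[Observation, Action]] = []
    observation = first_observation
    for _ in range(steps):
        action = plan(observation)                       # end-to-end model: observation -> action
        trajectory.append((observation, action))
        observation = predict_next(observation, action)  # world model: (observation, action) -> next observation
    return trajectory


# Toy stand-ins so the loop can be run without any model.
def toy_plan(observation: Observation) -> Action:
    return (observation[0] + 1,)


def toy_predict_next(observation: Observation, action: Action) -> Observation:
    return (observation[0] + action[0],)


rollout = closed_loop_rollout(toy_plan, toy_predict_next, (1,), steps=3)
```

Because each predicted observation becomes the next input, no recorded future frames are needed once the rollout starts.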
Visualizations
Closed-Loop Autonomous Driving
Action-Conditioned Video Generation
Getting Started
Data Preparation
- Download the full nuScenes V1.0 dataset HERE.
- Download the data_nusc annotations from OmniDrive and unzip them.
- Download the VQVAE weights from HERE and place them in the following directory layout, as described HERE:
Doe/
- model/
    - lumina_mgpt/
        - ckpts/
            - chameleon/
                - tokenizer/
                    - text_tokenizer.json
                    - vqgan.yaml
                    - vqgan.ckpt
    - xllmx/
- ...
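Before running inference it can help to confirm that the tokenizer files are where the code expects them. The check below is a minimal sketch: `missing_tokenizer_files` is a hypothetical helper, and the path assumes the directories in the listing above are nested in the order shown.

```python
from pathlib import Path
from typing import List

# File names taken from the directory listing above.
TOKENIZER_FILES = ("text_tokenizer.json", "vqgan.yaml", "vqgan.ckpt")


def missing_tokenizer_files(repo_root: str) -> List[str]:
    """Return the expected tokenizer files that are not present under repo_root."""
    # Assumed nesting: model/lumina_mgpt/ckpts/chameleon/tokenizer
    tokenizer_dir = (
        Path(repo_root) / "model" / "lumina_mgpt" / "ckpts" / "chameleon" / "tokenizer"
    )
    return [name for name in TOKENIZER_FILES if not (tokenizer_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_tokenizer_files("Doe")
    print("all tokenizer files found" if not missing else f"missing: {missing}")
```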
Inference
- Generate the conversation data for inference and set the max length:
# max length: 1 for qa, 5 for planning
python dataset/gen_data.py \
--info_path path/to/infos_var.pkl \
--qa_path path/to/OmniDriveDataset \
--nusc_path path/to/nuscenes \
--save_path path/to/save/outputs \
--max_length 1
- Inference with a model ckpt:
# set split and id for multiple GPUs
CUDA_VISIBLE_DEVICES=0 python inference/eval.py \
--anno_path path/to/val_infos.pkl \
--nusc_path path/to/nuscenes \
--save_path path/to/save/output \
--model_path path/to/model/ckpt \
--data_path path/to/generated/data.json \
--task qa
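When both steps are scripted, the two commands can be assembled in Python. This is a minimal sketch under stated assumptions: `gen_data_command` and `eval_command` are hypothetical helpers, only flags that appear in the commands above are used, and `planning` as a `--task` value is assumed because only `qa` is shown above. The split and id options for multiple GPUs are left out because their flag names are not given here.

```python
import os
from typing import Dict, List, Tuple

# Max conversation length per task, as stated in the comment above the gen_data.py command.
MAX_LENGTH = {"qa": 1, "planning": 5}


def gen_data_command(
    task: str, info_path: str, qa_path: str, nusc_path: str, save_path: str
) -> List[str]:
    """Build the argument list for dataset/gen_data.py."""
    return [
        "python", "dataset/gen_data.py",
        "--info_path", info_path,
        "--qa_path", qa_path,
        "--nusc_path", nusc_path,
        "--save_path", save_path,
        "--max_length", str(MAX_LENGTH[task]),
    ]


def eval_command(
    task: str, gpu: int, anno_path: str, nusc_path: str,
    save_path: str, model_path: str, data_path: str,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the argument list and environment for inference/eval.py on one GPU."""
    # A misspelled environment variable is silently ignored, so set it explicitly here.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    argv = [
        "python", "inference/eval.py",
        "--anno_path", anno_path,
        "--nusc_path", nusc_path,
        "--save_path", save_path,
        "--model_path", model_path,
        "--data_path", data_path,
        "--task", task,  # only "qa" is shown above; other values are an assumption
    ]
    return argv, env
```

Each result can be passed to `subprocess.run(argv, env=env, check=True)` from the repository root; the paths remain placeholders to fill in.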
Related Projects
Our code is based on the excellent work Lumina-mGPT.
Citation
If you find this project helpful, please consider citing the following paper:
@article{doe,
title={Doe-1: Closed-Loop Autonomous Driving with Large World Model},
author={Zheng, Wenzhao and Xia, Zetian and Huang, Yuanhui and Zuo, Sicheng and Zhou, Jie and Lu, Jiwen},
journal={arXiv preprint arXiv:2412.09627},
year={2024}
}