Awesome
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
<div style="text-align:center; font-size: 18px;"> <p> <a href="https://scholar.google.com/citations?user=NmwjI0AAAAAJ&hl=en">Zheng Zhu*</a>, <a href="https://scholar.google.com.hk/citations?user=5IJ0Yg4AAAAJ&hl=zh-CN">Xiaofeng Wang*</a>, <a href="https://scholar.google.co.jp/citations?user=aocj89kAAAAJ&hl=es">Wangbo Zhao*</a>, <a href="https://scholar.google.com/citations?user=pE9gTMQAAAAJ&hl=zh-CN">Chen Min*</a>, <a href="https://scholar.google.com/citations?user=AGPz8C4AAAAJ">Nianchen Deng*</a>, <a href="https://scholar.google.com.hk/citations?hl=zh-CN&user=w9fTWKQAAAAJ">Min Dou*</a>, <a href="https://scholar.google.com/citations?user=35UcX9sAAAAJ&hl=en">Yuqi Wang*</a>, <a href="https://scholar.google.com.hk/citations?hl=zh-CN&user=K0PpvLkAAAAJ">Botian Shi<sup>#</sup></a>, <a href="https://scholar.google.com/citations?user=i2II0XIAAAAJ&hl=en">Kai Wang<sup>#</sup></a>, <a href="https://scholar.google.com/citations?user=aTA2wL4AAAAJ&hl=en">Chi Zhang<sup>#</sup></a>, <a href="https://scholar.google.com.hk/citations?hl=zh-CN&user=jF4dPZwAAAAJ">Yang You<sup>#</sup></a>, <a href="https://scholar.google.com/citations?user=qxWfV6cAAAAJ&hl=en">Zhaoxiang Zhang<sup>#</sup></a>, <a href="">Dawei Zhao<sup>#</sup></a>, <a href="https://scholar.google.com/citations?user=hvxSnzoAAAAJ&hl=zh-CN">Liang Xiao<sup>#</sup></a>, <a href="https://scholar.google.com.sg/citations?hl=en&user=zdhRJCkAAAAJ&view_op=list_works&gmla=AJsN-F4PURIx5GMQHVpprJJBjTsNC62YCHjxGsKOwVhrkZ1aJsLgBiuKPBbAgbdcE5_KNw3OnLQgOVSjlqmS6gc-6ti0M2K5o-klHgoOywFCbdaaGnpis130zvgoZFJkVfmoNKpo8Krp">Jian Zhao<sup>#</sup></a>, <a href="https://scholar.google.com/citations?user=TN8uDQoAAAAJ&hl=en">Jiwen Lu<sup>#</sup></a>, <a href="">Guan Huang<sup>#</sup></a> </div> <div style="text-align:center; font-size: 18px;"> <p> (* denotes equal contributions, <sup>#</sup> denotes corresponding authors) </div> <p align="center"> <img src="asset/videogen.gif" width="320px"/> <img src="asset/videogen2.gif" width="320px"/> </p> <p 
align="center"> <img src="asset/drivedreamer.gif" width="190px"/> <img src="asset/drive-wm.gif" width="450px"/> </p> <p align="center"> <img src="asset/drive.gif" width="640px"/> </p> <p align="center"> <img src="asset/unisim.gif" width="240px"/> <img src="asset/unipi.gif" width="245px"/> <img src="asset/robodreamer.gif" width="145px"/> </p> <p align="center"> (Source: <a href="https://openai.com/sora">Sora</a>, <a href="https://drivedreamer.github.io/">DriveDreamer</a>, <a href="https://drivedreamer2.github.io/">DriveDreamer-2</a>, <a href="https://drive-wm.github.io/">Drive-WM</a>, <a href="https://universal-simulator.github.io/unisim/">UniSim</a>, <a href="https://universal-policy.github.io/unipi/">UniPi</a>, <a href="https://robovideo.github.io/">RoboDreamer</a>) </p> <!-- - [News] <span style="color:red;"> **We are planning to update the survey soon to encompass the latest work. If you have any suggestions, please feel free to contact us.**</span> - [News] The Chinese translation is available on [Zhihu](https://zhuanlan.zhihu.com/p/661860981). Special thanks to [Dai-Wenxun](https://github.com/Dai-Wenxun) for this. -->

This is the official repository for the technical report:
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond.
📌 Introduction
In this report, we survey recent advances in world model research, covering both philosophical perspectives and detailed technical discussion. We analyze the literature on world models for video generation, autonomous driving, and autonomous agents, and examine their applications in media production, artistic expression, end-to-end driving, games, and robotics. We also assess the current challenges and limitations of world models and outline directions for future research, with the aim of guiding and stimulating further progress in the field.
Papers and Toolboxes for Video Generation World Models
Methods | Task | GitHub |
---|---|---|
Open-Sora-Plan | T2V Generation | |
Open-Sora | T2V Generation | |
Sora | T2V Generation & Editing | - |
IRC-GAN | T2V Generation | - |
TGANs-C | T2V Generation | - |
TFGANs | T2V Generation | - |
StoryGAN | T2V Generation | |
TiVGAN | T2V Generation | - |
GODIVA | T2V Generation | |
VideoGPT | C2V Generation | |
StoryDALL-E | C2V Generation | - |
CogVideo | T2V Generation | |
Imagen Video | T2V Generation | - |
MAGViT | C2V Generation | |
MAGViT-V2 | C2V Generation | |
VideoPoet | T2V Generation | - |
SVD | T2V Generation | |
WorldDreamer | T2V Generation | |
Latte | T2V Generation | |
StreamingT2V | T2V Generation | |
Papers and Toolboxes for Autonomous Driving World Models
Methods | Task | GitHub |
---|---|---|
Iso-Dream | End-to-end Driving | - |
MILE | End-to-end Driving | |
SEM2 | End-to-end Driving | - |
TrafficBots | End-to-end Driving | - |
Think2Drive | End-to-end Driving | - |
GAIA-1 | Neural Driving Simulator (2D) | - |
Tesla | Neural Driving Simulator | - |
DriveDreamer | Neural Driving Simulator (2D) | |
ADriver-I | Neural Driving Simulator (2D) | - |
DrivingDiffusion | Neural Driving Simulator (2D) | - |
Panacea | Neural Driving Simulator (2D) | |
Drive-WM | Neural Driving Simulator (2D) & End-to-end Driving | |
WoVoGen | Neural Driving Simulator (2D) | - |
DriveDreamer-2 | Neural Driving Simulator (2D) | |
GenAD | Neural Driving Simulator (2D) | |
SubjectDrive | Neural Driving Simulator (2D) | - |
Copilot4D | Neural Driving Simulator (3D) | - |
OccWorld | Neural Driving Simulator (3D) | |
MUVO | Neural Driving Simulator (3D) | - |
LidarDM | Neural Driving Simulator (3D) | - |
UniWorld | Neural Driving Simulator (3D) & 4D Pre-training | - |
ViDAR | Neural Driving Simulator (3D) & 4D Pre-training | |
DriveWorld | Neural Driving Simulator (3D) & 4D Pre-training | - |
Papers and Toolboxes for Autonomous Agents World Models
Methods | Task | GitHub |
---|---|---|
PlaNet | Robotics | |
World Models | Game Agent | |
RobotDreamPolicy | Robotics | - |
Plan2Explore | Robotics | |
DreamerV1 | Robotics | |
SimPLe | Game Agent | |
Dreaming | Robotics | - |
DreamerV2 | Game Agent | |
LEXA | Robotics | |
PathDreamer | Indoor Navigation | |
DreamerPro | Robotics | |
DreamingV2 | Robotics | - |
TransDreamer | Game Agent & Robotics | |
IRIS | Game Agent | |
JEPA | Framework | - |
Dr.G | Robotics | |
SWIM | Robotics | - |
DreamerV3 | Game Agent & Robotics | |
HarmonyDream | Game Agent & Robotics | - |
DayDreamer | Robotics | |
TWM | Game Agent | |
STORM | Game Agent | |
MC-JEPA | Optical Flow Prediction | - |
A-JEPA | Audio Classification | - |
I-JEPA | Image Semantics | |
SafeDreamer | Robotics | |
Genie | Generative Interactive Environment | - |
V-JEPA | Video Prediction | |
RoboDreamer | Robotics | - |
UniSim | Generative Interactive Environment | - |
Citation
If you find our survey useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
@article{generalworldmodelsurvey,
title={Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond},
author={Zheng Zhu and Xiaofeng Wang and Wangbo Zhao and Chen Min and Nianchen Deng and Min Dou and Yuqi Wang and Botian Shi and Kai Wang and Chi Zhang and Yang You and Zhaoxiang Zhang and Dawei Zhao and Liang Xiao and Jian Zhao and Jiwen Lu and Guan Huang},
journal={arXiv preprint arXiv:2405.03520},
year={2024}
}
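
For readers citing the survey from a LaTeX document, a minimal sketch follows. It assumes the entry above has been saved to a file named `references.bib` (a hypothetical file name, not part of this repository) and uses the standard `\cite`, `\bibliographystyle`, and `\bibliography` commands with the `plain` style; adapt it to whatever bibliography setup your template already uses.

```latex
% Minimal sketch: references.bib is an assumed file name holding the BibTeX entry above.
\documentclass{article}
\begin{document}

% The citation key matches the key of the BibTeX entry.
General world models are surveyed in~\cite{generalworldmodelsurvey}.

\bibliographystyle{plain}   % any standard BibTeX style works here
\bibliography{references}   % refers to references.bib, without the extension
\end{document}
```

Compiling this typically requires one LaTeX pass, one `bibtex` pass, and two further LaTeX passes so that the citation resolves.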