Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

<div style="text-align:center; font-size: 18px;">
<p>
<a href="https://scholar.google.com/citations?user=NmwjI0AAAAAJ&hl=en">Zheng Zhu*</a>,
<a href="https://scholar.google.com.hk/citations?user=5IJ0Yg4AAAAJ&hl=zh-CN">Xiaofeng Wang*</a>,
<a href="https://scholar.google.co.jp/citations?user=aocj89kAAAAJ&hl=es">Wangbo Zhao*</a>,
<a href="https://scholar.google.com/citations?user=pE9gTMQAAAAJ&hl=zh-CN">Chen Min*</a>,
<a href="https://scholar.google.com/citations?user=AGPz8C4AAAAJ">Nianchen Deng*</a>,
<a href="https://scholar.google.com.hk/citations?hl=zh-CN&user=w9fTWKQAAAAJ">Min Dou*</a>,
<a href="https://scholar.google.com/citations?user=35UcX9sAAAAJ&hl=en">Yuqi Wang*</a>,
<a href="https://scholar.google.com.hk/citations?hl=zh-CN&user=K0PpvLkAAAAJ">Botian Shi<sup>#</sup></a>,
<a href="https://scholar.google.com/citations?user=i2II0XIAAAAJ&hl=en">Kai Wang<sup>#</sup></a>,
<a href="https://scholar.google.com/citations?user=aTA2wL4AAAAJ&hl=en">Chi Zhang<sup>#</sup></a>,
<a href="https://scholar.google.com.hk/citations?hl=zh-CN&user=jF4dPZwAAAAJ">Yang You<sup>#</sup></a>,
<a href="https://scholar.google.com/citations?user=qxWfV6cAAAAJ&hl=en">Zhaoxiang Zhang<sup>#</sup></a>,
<a href="">Dawei Zhao<sup>#</sup></a>,
<a href="https://scholar.google.com/citations?user=hvxSnzoAAAAJ&hl=zh-CN">Liang Xiao<sup>#</sup></a>,
<a href="https://scholar.google.com.sg/citations?hl=en&user=zdhRJCkAAAAJ&view_op=list_works&gmla=AJsN-F4PURIx5GMQHVpprJJBjTsNC62YCHjxGsKOwVhrkZ1aJsLgBiuKPBbAgbdcE5_KNw3OnLQgOVSjlqmS6gc-6ti0M2K5o-klHgoOywFCbdaaGnpis130zvgoZFJkVfmoNKpo8Krp">Jian Zhao<sup>#</sup></a>,
<a href="https://scholar.google.com/citations?user=TN8uDQoAAAAJ&hl=en">Jiwen Lu<sup>#</sup></a>,
<a href="">Guan Huang<sup>#</sup></a>
</p>
</div>
<div style="text-align:center; font-size: 18px;">
<p> (* denotes equal contributions, <sup>#</sup> denotes corresponding authors) </p>
</div>
<p align="center"> <img src="asset/videogen.gif" width="320px"/> <img src="asset/videogen2.gif" width="320px"/> </p>
<p align="center"> <img src="asset/drivedreamer.gif" width="190px"/> <img src="asset/drive-wm.gif" width="450px"/> </p>
<p align="center"> <img src="asset/drive.gif" width="640px"/> </p>
<p align="center"> <img src="asset/unisim.gif" width="240px"/> <img src="asset/unipi.gif" width="245px"/> <img src="asset/robodreamer.gif" width="145px"/> </p>
<p align="center"> (Source: <a href="https://openai.com/sora">Sora</a>, <a href="https://drivedreamer.github.io/">DriveDreamer</a>, <a href="https://drivedreamer2.github.io/">DriveDreamer-2</a>, <a href="https://drive-wm.github.io/">Drive-WM</a>, <a href="https://universal-simulator.github.io/unisim/">UniSim</a>, <a href="https://universal-policy.github.io/unipi/">UniPi</a>, <a href="https://robovideo.github.io/">RoboDreamer</a>) </p>
<!-- - [News] <span style="color:red;"> **We are planning to update the survey soon to encompass the latest work. If you have any suggestions, please feel free to contact us.**</span> - [News] The Chinese translation is available on [Zhihu](https://zhuanlan.zhihu.com/p/661860981). Special thanks to [Dai-Wenxun](https://github.com/Dai-Wenxun) for this. -->

This is the official repository for the technical report:

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond.

📌 Introduction

In our report, we survey recent advances in world model research, covering both the philosophical perspectives behind world models and detailed technical discussions. Our analysis examines the literature on world models for video generation, autonomous driving, and autonomous agents, along with their applications in media production, artistic expression, end-to-end driving, games, and robots. We assess the existing challenges and limitations of world models and discuss prospective directions for future research, with the aim of guiding and encouraging further progress in world models.

Framework

Papers and Toolboxes for Video Generation World Models

VideoGen

| Methods | Task | GitHub |
| --- | --- | --- |
| Open-Sora-Plan | T2V Generation | Star |
| Open-Sora | T2V Generation | Star |
| Sora | T2V Generation & Editing | - |
| IRC-GAN | T2V Generation | - |
| TGANs-C | T2V Generation | - |
| TFGANs | T2V Generation | - |
| StoryGAN | T2V Generation | Star |
| TiVGAN | T2V Generation | - |
| GODIVA | T2V Generation | Star |
| VideoGPT | C2V Generation | Star |
| StoryDALL-E | C2V Generation | - |
| CogVideo | T2V Generation | Star |
| Imagen Video | T2V Generation | - |
| MAGViT | C2V Generation | Star |
| MAGViT-V2 | C2V Generation | Star |
| VideoPoet | T2V Generation | - |
| SVD | T2V Generation | Star |
| WorldDreamer | T2V Generation | Star |
| Latte | T2V Generation | Star |
| StreamingT2V | T2V Generation | Star |

Papers and Toolboxes for Autonomous Driving World Models

Drive

| Methods | Task | GitHub |
| --- | --- | --- |
| Iso-Dream | End-to-end Driving | - |
| MILE | End-to-end Driving | Star |
| SEM2 | End-to-end Driving | - |
| TrafficBots | End-to-end Driving | - |
| Think2Drive | End-to-end Driving | - |
| GAIA-1 | Neural Driving Simulator (2D) | - |
| Tesla | Neural Driving Simulator | - |
| DriveDreamer | Neural Driving Simulator (2D) | Star |
| ADriver-I | Neural Driving Simulator (2D) | - |
| DrivingDiffusion | Neural Driving Simulator (2D) | - |
| Panacea | Neural Driving Simulator (2D) | Star |
| Drive-WM | Neural Driving Simulator (2D) & End-to-end Driving | Star |
| WoVoGen | Neural Driving Simulator (2D) | - |
| DriveDreamer-2 | Neural Driving Simulator (2D) | Star |
| GenAD | Neural Driving Simulator (2D) | Star |
| SubjectDrive | Neural Driving Simulator (2D) | - |
| Copilot4D | Neural Driving Simulator (3D) | - |
| OccWorld | Neural Driving Simulator (3D) | Star |
| MUVO | Neural Driving Simulator (3D) | - |
| LidarDM | Neural Driving Simulator (3D) | - |
| UniWorld | Neural Driving Simulator (3D) & 4D Pre-training | - |
| ViDAR | Neural Driving Simulator (3D) & 4D Pre-training | Star |
| DriveWorld | Neural Driving Simulator (3D) & 4D Pre-training | - |

Papers and Toolboxes for Autonomous Agents World Models

Agent

| Methods | Task | GitHub |
| --- | --- | --- |
| PlaNet | Robotics | Star |
| World Models | Game Agent | Star |
| RobotDreamPolicy | Robotics | - |
| Plan2Explore | Robotics | Star |
| DreamerV1 | Robotics | Star |
| SimPLe | Game Agent | Star |
| Dreaming | Robotics | - |
| DreamerV2 | Game Agent | Star |
| LEXA | Robotics | Star |
| PathDreamer | Indoor Navigation | Star |
| DreamerPro | Robotics | Star |
| DreamingV2 | Robotics | - |
| TransDreamer | Game Agent & Robotics | Star |
| IRIS | Game Agent | Star |
| JEPA | Framework | - |
| Dr.G | Robotics | Star |
| SWIM | Robotics | - |
| DreamerV3 | Game Agent & Robotics | Star |
| HarmonyDream | Game Agent & Robotics | - |
| DayDreamer | Robotics | Star |
| TWM | Game Agent | Star |
| STORM | Game Agent | Star |
| MC-JEPA | Optical Flow Prediction | - |
| A-JEPA | Audio Classification | - |
| I-JEPA | Image Semantics | Star |
| SafeDreamer | Robotics | Star |
| Genie | Generative Interactive Environment | - |
| V-JEPA | Video Prediction | Star |
| RoboDreamer | Robotics | - |
| UniSim | Generative Interactive Environment | - |

Contact

If you find our survey useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@article{generalworldmodelsurvey,
  title={Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond},
  author={Zheng Zhu and Xiaofeng Wang and Wangbo Zhao and Chen Min and Nianchen Deng and Min Dou and Yuqi Wang and Botian Shi and Kai Wang and Chi Zhang and Yang You and Zhaoxiang Zhang and Dawei Zhao and Liang Xiao and Jian Zhao and Jiwen Lu and Guan Huang}, 
  journal={arXiv preprint arXiv:2405.03520},
  year={2024}
}