Awesome-World-Models

This repository is a collection of research papers on World Models. It aims to provide a useful resource for those interested in this field.

World models are a class of AI models that learn a simplified internal representation of the external world. They predict future states of the environment from current observations and past experience, which lets an agent evaluate the consequences of its actions before taking them and so make informed decisions.
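As a rough illustration of the idea above, here is a minimal sketch of a world model in plain Python. It is a toy, not the method of any paper listed here: the `TabularWorldModel` class, the `plan` helper, and the 1-D corridor environment are all hypothetical names invented for this example. The model counts observed transitions, predicts the most frequent next state, and lets an agent plan by searching over imagined rollouts instead of acting in the real environment.

```python
from collections import Counter, defaultdict


class TabularWorldModel:
    """Toy world model: counts observed (state, action) -> next_state transitions."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, state, action, next_state):
        # Record one observed transition from past experience.
        self.counts[(state, action)][next_state] += 1

    def predict(self, state, action):
        # Return the most frequently observed next state, or None if unseen.
        seen = self.counts.get((state, action))
        if not seen:
            return None
        return seen.most_common(1)[0][0]

    def rollout(self, state, actions):
        # Imagine a trajectory by chaining predictions (no real environment steps).
        trajectory = [state]
        for action in actions:
            state = self.predict(state, action)
            if state is None:
                break
            trajectory.append(state)
        return trajectory


def plan(model, state, goal, actions, horizon=5):
    """Breadth-first search over imagined transitions; returns an action list or None."""
    if state == goal:
        return []
    frontier = [(state, [])]
    visited = {state}
    for _ in range(horizon):
        next_frontier = []
        for s, path in frontier:
            for a in actions:
                s2 = model.predict(s, a)
                if s2 is None or s2 in visited:
                    continue
                if s2 == goal:
                    return path + [a]
                visited.add(s2)
                next_frontier.append((s2, path + [a]))
        frontier = next_frontier
    return None


# Hypothetical 1-D corridor with positions 0..4 (walls at both ends).
model = TabularWorldModel()
for pos in range(5):
    model.update(pos, "right", min(pos + 1, 4))
    model.update(pos, "left", max(pos - 1, 0))

imagined = model.rollout(0, ["right", "right", "left"])
actions_to_goal = plan(model, state=0, goal=3, actions=["left", "right"])
print(imagined, actions_to_goal)
```

The papers below replace the lookup table with learned neural dynamics (often in a latent space), but the loop is the same in spirit: fit a predictive model from experience, then use its predictions to choose actions.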

World Model Papers

Learning to Model the World with Language. arXiv 2023. paper

Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan.

Language helps agents predict the future.
Unifying (Machine) Vision via Counterfactual World Modeling. arXiv 2023. paper

Bear, Daniel M., Kevin Feigelis, Honglin Chen, Wanhee Lee, Rahul Venkatesh, Klemen Kotar, Alex Durango, and Daniel L. K. Yamins.
World Models. NeurIPS 2018. paper demo

Ha, David, and Jürgen Schmidhuber.
A Control-Centric Benchmark for Video Prediction. ICLR 2023. paper

Tian, Stephen, Chelsea Finn, and Jiajun Wu.
Transformers are Sample-Efficient World Models. ICLR 2023. paper

Micheli, Vincent, Eloi Alonso, and François Fleuret.
Towards Efficient World Models. ICML 2023 Workshops. paper

Alonso, Eloi, Vincent Micheli, and François Fleuret.
Learning Latent Dynamics for Planning from Pixels. ICML 2019 (PMLR). paper

Hafner, Danijar, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, and James Davidson.

Video Model Papers

MAGVIT: Masked Generative Video Transformer. CVPR 2023. paper demo code

Yu, Lijun, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann et al.

3D VQ + MaskGIT = 37 fps sampling on a V100
Diffusion Models for Video Prediction and Infilling. TMLR 2022. paper code

Tobias Höppe, Arash Mehrjou, Stefan Bauer, Didrik Nielsen, Andrea Dittadi.
Unsupervised Learning for Physical Interaction through Video Prediction. NeurIPS 2016. paper

Finn, Chelsea, Ian Goodfellow, and Sergey Levine.
Unsupervised Learning of Video Representations using LSTMs. ICML 2015. paper

Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov.

Action Model Papers

Decision Transformer: Reinforcement Learning via Sequence Modeling. NeurIPS 2021. paper

Chen, Lili, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Misha Laskin, Pieter Abbeel, Aravind Srinivas, and Igor Mordatch.
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion. RSS 2023. paper demo

Chi, Cheng, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel, and Shuran Song.