Awesome

A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges

<a href='https://arxiv.org/pdf/2403.10249.pdf'><img src='https://img.shields.io/badge/arXiv-2403.10249-b31b1b.svg'></a>

</div>

🏃 Coming soon: Add one-sentence intro to each paper.

[Arxiv] ♦ [PDF]

</div>

🌟 News

✨ [2024/02/06] Created this repository to maintain a list of papers on LLM-based agents for game playing. More papers are coming soon!

2024/08

[2024/08/07] Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks. [paper] [code]

2024/07

[2024/07/21] VideoGameBunny: Towards vision assistants for video games. [paper] [code]
[2024/07/05] Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents. [paper]
[2024/07/02] Cradle: Empowering Foundation Agents Towards General Computer Control. [paper] [project]

2024/06

[2024/06/20] Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models. [paper]

2024/05

[2024/05/23] Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration. [paper] [project]
[2024/05/11] Prompt-Gaming: A Pilot Study on LLM-Evaluating Agent in a Meaningful Energy Game. [paper]

2024/04

[2024/04/17] AgentKit: Flow Engineering with Graphs, not Coding. [paper] [code]
[2024/04/16] Self-playing Adversarial Language Game Enhances LLM Reasoning. [paper] [code]

2024/03

[2024/03/23] Evaluate LLMs in real time with Street Fighter III. [code]
[2024/03/19] Embodied LLM Agents Learn to Cooperate in Organized Teams. [paper]
[2024/03/18] Can LLM-Augmented Autonomous Agents Cooperate?, An Evaluation of Their Cooperative Capabilities through Melting Pot. [paper]
[2024/03/18] EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents. [paper]
[2024/03/18] MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control. [paper] [code]
[2024/03/14] Scaling Instructable Agents Across Many Simulated Worlds. [paper]
[2024/03/13] Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation. [paper]
[2024/03/13] SOTOPIA-$\pi$: Interactive Learning of Socially Intelligent Language Agents. [paper] [code]
[2024/03/08] Will GPT-4 Run DOOM? [paper] [code]
[2024/03/05] Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study. [paper] [project]
[2024/03/01] Playing NetHack with LLMs: Potential & Limitations as Zero-Shot Agents. [paper] [code]

2024/02

[2024/02/29] RL-GPT: Integrating Reinforcement Learning and Code-as-policy. [paper]
[2024/02/27] Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization. [paper] [code]
[2024/02/21] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain. [paper] [code]
[2024/02/20] What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents. [paper] [code]
[2024/02/07] S-Agents: Self-organizing Agents in Open-ended Environments. [paper]
[2024/02/04] Enhance Reasoning for Large Language Models in the Game Werewolf. [paper]
[2024/02/02] PokéLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models. [paper] [code]

2024/01

[2024/01/31] SwarmBrain: Embodied agent for real-time strategy game StarCraft II via large language models. [paper]
[2024/01/19] CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents. [paper] [code]
[2024/01/17] Searching bug instances in gameplay video repositories. [paper] [data]
[2024/01/04] PokerGPT: An End-to-End Lightweight Solver for Multi-Player Texas Hold'em via Large Language Model. [paper]

2023/12

[2023/12/29] Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game. [paper]
[2023/12/23] LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination. [paper] [project]
[2023/12/19] Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach. [paper]
[2023/12/14] Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft. [paper]
[2023/12/12] MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception. [paper] [code]
[2023/12/08] Apollo's Oracle: Retrieval-Augmented Reasoning in Multi-Agent Debates. [paper] [code]
[2023/12/08] GlitchBench: Can large multimodal models detect video game glitches? [paper] [code]
[2023/12/07] A Framework for Exploring Player Perceptions of LLM-Generated Dialogue in Commercial Video Games. [paper] [website]
[2023/12/05] Creative Agents: Empowering Agents with Imagination for Creative Tasks. [paper] [code]
[2023/12/04] Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games. [paper]
[2023/12/02] Building Open-Ended Embodied Agent via Language-Policy Bidirectional Adaptation. [paper]
[2023/12/01] Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games. [paper]

2023/11

[2023/11/28] War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars. [paper] [code]
[2023/11/26] See and Think: Embodied Agent in Virtual Environment. [paper] [code]
[2023/11/20] DesignGPT: Multi-Agent Collaboration in Design. [paper]
[2023/11/14] MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration. [paper] [code]
[2023/11/10] Jarvis-1: Open-World Multi-Task Agents with Memory-Augmented Multimodal Language Models. [paper] [code]
[2023/11/08] ADaPT: As-Needed Decomposition and Planning with Language Models. [paper] [code]

2023/10

[2023/10/31] Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models. [paper] [code]
[2023/10/29] Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game. [paper]
[2023/10/23] LLM-Based Agent Society Investigation: Collaboration and Confrontation in Avalon Gameplay. [paper]
[2023/10/20] Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds. [paper] [code]
[2023/10/18] SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents. [paper] [code]
[2023/10/16] Character-LLM: A Trainable Agent for Role-Playing. [paper] [code]
[2023/10/13] LLaMA Rider: Spurring Large Language Models to Explore the Open World. [paper]
[2023/10/12] GameGPT: Multi-agent Collaborative Framework for Game Development. [paper]
[2023/10/12] Groot: Learning to Follow Instructions by Watching Gameplay Videos. [paper] [code]
[2023/10/12] Octopus: Embodied Vision-Language Programmer from Environmental Feedback. [paper] [code]
[2023/10/10] Metaagents: Simulating Interactions of Human Behaviors for LLM-Based Task-Oriented Coordination via Collaborative Generative Agents. [paper]
[2023/10/09] Humanoid Agents: Platform for Simulating Human-like Generative Agents. [paper] [code]
[2023/10/08] AvalonBench: Evaluating LLMs Playing the Game of Avalon. [paper] [code]
[2023/10/06] Cautious Curiosity: A Novel Approach to a Human-Like Gameplay Agent. [paper] [code]
[2023/10/05] LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models. [paper] [code]
[2023/10/03] Lyfe Agents: Generative agents for low-cost real-time social interactions. [paper]
[2023/10/03] Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond. [paper] [code]
[2023/10/02] SmartPlay: A Benchmark for LLMs as Intelligent Agents. [paper] [code]
[2023/10/02] Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation. [paper]
[2023/10/01] RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models. [paper] [code]

2023/09

[2023/09/29] AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback. [paper] [code]
[2023/09/29] Motif: Intrinsic Motivation from Artificial Intelligence Feedback. [paper] [code]
[2023/09/29] LLM-Deliberation: Evaluating LLMs with Interactive Multi-Agent Negotiation Games. [paper] [code]
[2023/09/29] Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4. [paper] [code]
[2023/09/29] Autoagents: A Framework for Automatic Agent Generation. [paper] [code]
[2023/09/21] True Knowledge Comes from Practice: Aligning Large Language Models with Embodied Environments via Reinforcement Learning. [paper] [code]
[2023/09/18] MindAgent: Emergent Gaming Interaction. [paper] [code]
[2023/09/14] Agents: An Open-source Framework for Autonomous Language Agents. [paper] [code]
[2023/09/09] Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf. [paper]

2023/08

[2023/08/23] Are ChatGPT and GPT-4 Good Poker Players?--A Pre-Flop Analysis. [paper]
[2023/08/22] Proagent: Constructing Proactive Cooperative AI Using Large Language Models. [paper] [code]
[2023/08/21] Agentverse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents. [paper] [code]
[2023/08/19] GameEval: Evaluating LLMs on Conversational Games. [paper] [code]
[2023/08/16] Autogen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework. [paper]
[2023/08/15] CALYPSO: LLMs as Dungeon Master's Assistants. [paper]
[2023/08/08] AgentSims: An Open-Source Sandbox for Large Language Model Evaluation. [paper]
[2023/08/01] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. [paper] [code]

2023/07

[2023/07/24] Tachikuma: Understanding Complex Interactions with Multi-Character and Novel Objects by Large Language Models. [paper]
[2023/07/21] Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors. [paper] [project]
[2023/07/12] Sayplan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning. [paper]
[2023/07/05] Building Cooperative Embodied Agents Modularly with Large Language Models. [paper] [code]
[2023/07/04] TaPA: Embodied Task Planning with Large Language Models. [paper] [code]

2023/06

[2023/06/20] SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling. [paper] [code]
[2023/06/15] ChessGPT: Bridging Policy Learning and Language Modeling. [paper] [code]
[2023/06/02] OMNI: Open-endedness via Models of human Notions of Interestingness. [paper] [code]
[2023/06/01] STEVE-1: A Generative Model for Text-to-Behavior in Minecraft. [paper] [code]

2023/05

[2023/05] COTTAGE: Coherent Text Adventure Games Generation. [paper] [code]
[2023/05/30] AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation. [paper]
[2023/05/26] Playing repeated games with Large Language Models. [paper]
[2023/05/25] Voyager: An Open-Ended Embodied Agent with Large Language Models. [paper] [code]
[2023/05/25] Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory. [paper] [code]
[2023/05/24] SPRING: Studying Papers and Reasoning to Play Games. [paper] [code]
[2023/05/23] Improving Factuality and Reasoning in Language Models through Multiagent Debate. [paper] [code]
[2023/05/17] Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback. [paper] [code]
[2023/05/09] Tidybot: Personalized Robot Assistance with Large Language Models. [paper] [code]
[2023/05/01] ArK: Augmented Reality with Knowledge Interactive Emergent Ability. [paper]

2023/04

[2023/04/07] Generative Agents: Interactive Simulacra of Human Behavior. [paper] [code]
[2023/04/06] Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions. [paper] [code]
[2023/04] Personalized Quest and Dialogue Generation in Role-Playing Games: A Knowledge Graph- and Language Model-based Approach. [paper] [code]

2023/03

[2023/03/31] CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society. [paper] [code]
[2023/03/29] Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks. [paper] [code]
[2023/03/06] PaLM-E: An Embodied Multimodal Language Model. [paper]

2023/02

[2023/02/13] Guiding Pretraining in Reinforcement Learning with Large Language Models. [paper] [code]
[2023/02/12] MarioGPT: Open-Ended Text2Level Generation through Large Language Models. [paper] [code]
[2023/02/03] Describe, Explain, Plan and Select: Interactive Planning with LLMs Enables Open-World Multi-Task Agents. [paper] [code]

2023/01

[2023/01/28] Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling. [paper]
[2023/01/21] Open-World Multi-Task Control through Goal-Aware Representation Learning and Adaptive Horizon Prediction. [paper] [code]

2022

[2022/11/22] Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning. [paper]
[2022/11/21] Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models. [paper]
[2022/10/24] Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task. [paper] [code]
[2022/10/05] Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors. [paper] [code]
[2022/08/08] Social Simulacra: Creating Populated Prototypes for Social Computing Systems. [paper]
[2022/07/12] Inner Monologue: Embodied Reasoning through Planning with Language Models. [paper]
[2022/06/23] Video pretraining (VPT): Learning to Act by Watching Unlabeled Online Videos. [paper]
[2022/06/07] Minedojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge. [paper] [code]

Citation

If you find this repository useful, please cite our paper:

@misc{xu2024survey,
      title={A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges},
      author={Xinrun Xu and Yuxin Wang and Chaoyi Xu and Ziluo Ding and Jiechuan Jiang and Zhiming Ding and Börje F. Karlsson},
      year={2024},
      eprint={2403.10249},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

Contact

Börje F. Karlsson @tellarin: borje@baai.ac.cn
Xinrun Xu @XinrunXu: xuxinrun20@mails.ucas.ac.cn