🐨CoALA: Awesome Language Agents
A compilation of language agents using the Cognitive Architectures for Language Agents (🐨CoALA) framework.
- CoALA Paper (16 pages of main content): https://arxiv.org/abs/2309.02427
- CoALA Tweet (6 threads): https://twitter.com/ShunyuYao12/status/1699396834983362690
- CoALA BibTeX file with 300+ related citations: CoALA.bib
- CoALA BibTeX citation, if you find our work/resources useful:
@misc{sumers2023cognitive,
title={Cognitive Architectures for Language Agents},
author={Theodore Sumers and Shunyu Yao and Karthik Narasimhan and Thomas L. Griffiths},
year={2023},
eprint={2309.02427},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
🐨CoALA Overview
CoALA neatly specifies a language agent starting with its action space, which has 2 parts:
- External actions to interact with external environments (grounding)
- Internal actions to interact with internal memories (reasoning, retrieval, learning)
  - A language agent has a short-term working memory and several (optional) long-term memories (episodic for experience, semantic for knowledge, procedural for code/LLM)
  - Reasoning = update working memory (with LLM)
  - Retrieval = read long-term memory
  - Learning = write long-term memory
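The action space and memories above can be sketched as a small data structure. This is a minimal illustration, not code from the CoALA paper: the names `ActionType`, `Memory`, `reason`, `retrieve`, and `learn` are hypothetical, the keyword match stands in for whatever retrieval method an agent actually uses, and `llm` is any callable you supply.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class ActionType(Enum):
    """The four action types named above (hypothetical encoding)."""
    GROUNDING = "grounding"   # external: interact with the environment
    REASONING = "reasoning"   # internal: update working memory
    RETRIEVAL = "retrieval"   # internal: read long-term memory
    LEARNING = "learning"     # internal: write long-term memory

    @property
    def is_external(self) -> bool:
        # Only grounding touches the external environment.
        return self is ActionType.GROUNDING


@dataclass
class Memory:
    """One short-term working memory plus optional long-term memories."""
    working: dict = field(default_factory=dict)           # short-term
    episodic: List[str] = field(default_factory=list)     # experience
    semantic: List[str] = field(default_factory=list)     # knowledge
    procedural: List[str] = field(default_factory=list)   # code/LLM (placeholder)


def reason(memory: Memory, llm: Callable[[dict], str]) -> str:
    """Reasoning: update working memory using an LLM-like callable."""
    thought = llm(memory.working)
    memory.working["thought"] = thought
    return thought


def retrieve(memory: Memory, keyword: str) -> List[str]:
    """Retrieval: read long-term (here, semantic) memory.

    Storing the hits in working memory is an assumption of this sketch.
    """
    hits = [fact for fact in memory.semantic if keyword in fact]
    memory.working["retrieved"] = hits
    return hits


def learn(memory: Memory, experience: str) -> None:
    """Learning: write to long-term (here, episodic) memory."""
    memory.episodic.append(experience)
```

For example, `retrieve(Memory(semantic=["koalas eat eucalyptus"]), "koala")` returns the single matching fact and leaves it in working memory for a later reasoning step.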
Then how does a language agent choose which action to take? Its actions are structured into decision-making cycles, and each cycle has two stages:
- Planning: The agent applies reasoning/retrieval actions to (iteratively) propose and evaluate actions, then selects a learning/grounding action.
- Execution: The selected learning/grounding action is executed to affect the internal memory or external world.
To understand more, read Section 4 of our paper.
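One decision-making cycle can be sketched as a single function. This is a simplified illustration under stated assumptions, not the paper's implementation: `decision_cycle`, `propose`, `evaluate`, and `execute` are hypothetical names, working memory is a plain dict, and planning makes one propose-and-evaluate pass rather than iterating.

```python
from typing import Callable, List


def decision_cycle(
    propose: Callable[[dict], List[str]],
    evaluate: Callable[[dict, str], float],
    execute: Callable[[str], str],
    working_memory: dict,
) -> str:
    """Run one planning + execution cycle and return the selected action."""
    # Planning: propose candidate actions, score each one, select the best.
    candidates = propose(working_memory)
    if not candidates:
        raise ValueError("propose() returned no candidate actions")
    scores = {action: evaluate(working_memory, action) for action in candidates}
    selected = max(scores, key=scores.get)

    # Execution: carry out the selected learning/grounding action and
    # record the result so the next cycle can reason over it.
    observation = execute(selected)
    working_memory["last_action"] = selected
    working_memory["observation"] = observation
    return selected
```

A full agent would call this in a loop, with `propose` and `evaluate` backed by reasoning and retrieval actions and `execute` dispatching to either a memory write or an environment step.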
Papers
Below is a subset of papers scraped from CoALA.bib, plus papers added via pull requests; the action space labels may be incorrect. Dates are based on arXiv v1. The list does not cover all language agent work. We plan to add more work soon (pull requests welcome) and to add labels for highly cited work.
- (2021-10) AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts (reasoning)
- (2021-10) SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark (environment)
- (2022-01) Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents (grounding)
- (2022-03) PromptChainer: Chaining Large Language Model Prompts through Visual Programming (grounding)
- (2022-03) ScienceWorld: Is your Agent Smarter than a 5th Grader? (environment)
- (2022-04) Do As I Can, Not As I Say: Grounding Language in Robotic Affordances (grounding)
- (2022-04) Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language (grounding)
- (2022-07) WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents (environment)
- (2022-09) ProgPrompt: Generating Situated Robot Task Plans using Large Language Models (grounding)
- (2022-10) Decomposed Prompting: A Modular Approach for Solving Complex Tasks (reasoning)
- (2022-10) Mind's Eye: Grounded Language Model Reasoning through Simulation (grounding)
- (2022-10) ReAct: Synergizing Reasoning and Acting in Language Models (grounding, reasoning)
- (2022-11) Large Language Models Are Human-Level Prompt Engineers (reasoning)
- (2022-12) LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models (grounding)
- (2022-12) Don’t Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments (grounding)
- (2023-02) Chain of Hindsight Aligns Language Models with Feedback (learning)
- (2023-02) Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents (grounding, reasoning)
- (2023-02) Toolformer: Language Models Can Teach Themselves to Use Tools (grounding)
- (2023-03) Foundation Models for Decision Making: Problems, Methods, and Opportunities (survey)
- (2023-03) HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face (grounding)
- (2023-03) PaLM-E: An Embodied Multimodal Language Model (grounding)
- (2023-03) Reflexion: Language Agents with Verbal Reinforcement Learning (grounding, reasoning, learning)
- (2023-03) Self-Refine: Iterative Refinement with Self-Feedback (reasoning)
- (2023-03) Self-planning Code Generation with Large Language Models (reasoning)
- (2023-04) Generative Agents: Interactive Simulacra of Human Behavior (grounding, reasoning, retrieval, learning)
- (2023-04) Emergent autonomous scientific research capabilities of large language models (grounding, reasoning)
- (2023-04) LLM+P: Empowering Large Language Models with Optimal Planning Proficiency (grounding, reasoning)
- (2023-04) REFINER: Reasoning Feedback on Intermediate Representations (reasoning)
- (2023-04) Teaching Large Language Models to Self-Debug (reasoning)
- (2023-04) GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information (grounding, reasoning)
- (2023-05) CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing (grounding, reasoning, retrieval)
- (2023-05) Augmenting Autotelic Agents with Large Language Models (grounding, reasoning, retrieval, learning)
- (2023-05) ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models (grounding, reasoning)
- (2023-05) ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings (grounding, reasoning)
- (2023-05) Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding (reasoning)
- (2023-05) Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate (grounding, reasoning)
- (2023-05) Improving Factuality and Reasoning in Language Models through Multiagent Debate (grounding, reasoning)
- (2023-05) AdaPlanner: Adaptive Planning from Feedback with Language Models (grounding, retrieval, learning)
- (2023-05) Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models (reasoning)
- (2023-05) ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models (grounding, reasoning)
- (2023-05) SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks (grounding, reasoning)
- (2023-05) Tree of Thoughts: Deliberate Problem Solving with Large Language Models (reasoning)
- (2023-05) Voyager: An Open-Ended Embodied Agent with Large Language Models (grounding, reasoning, retrieval, learning)
- (2023-06) InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback (grounding, reasoning)
- (2023-06) ToolQA: A Dataset for LLM Question Answering with External Tools (grounding)
- (2023-06) Mind2Web: Towards a Generalist Agent for the Web (environment)
- (2023-06) RestGPT: Connecting Large Language Models with Real-World RESTful APIs (grounding, reasoning)
- (2023-06) ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases (grounding, reasoning)
- (2023-07) A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis (grounding, reasoning)
- (2023-07) RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control (grounding)
- (2023-07) RoCo: Dialectic Multi-Robot Collaboration with Large Language Models (grounding)
- (2023-07) Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners (grounding)
- (2023-07) S$^3$: Social-network Simulation System with Large Language Model-Empowered Agents (grounding, reasoning)
- (2023-07) ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (grounding, reasoning, retrieval)
- (2023-07) Understanding the Benefits and Challenges of Using Large Language Model-based Conversational Agents for Mental Well-being Support (grounding)
- (2023-07) Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration (grounding, reasoning)
- (2023-07) WebArena: A Realistic Web Environment for Building Autonomous Agents (environment)
- (2023-08) AgentBench: Evaluating LLMs as Agents (environment)
- (2023-08) AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents (environment)
- (2023-08) AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework (grounding, reasoning)
- (2023-08) CGMI: Configurable General Multi-Agent Interaction Framework (grounding, reasoning)
- (2023-08) ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate (grounding, reasoning)
- (2023-08) Cumulative Reasoning with Large Language Models (reasoning)
- (2023-08) ExpeL: LLM Agents Are Experiential Learners (grounding, reasoning, retrieval, learning)
- (2023-08) GPT-in-the-Loop: Adaptive Decision-Making for Multiagent Systems (grounding, reasoning)
- (2023-08) Gentopia: A Collaborative Platform for Tool-Augmented LLMs (environment)
- (2023-08) MetaGPT: Meta Programming for Multi-Agent Collaborative Framework (grounding, reasoning)
- (2023-08) ProAgent: Building Proactive Cooperative AI with Large Language Models (grounding, reasoning)
- (2023-08) Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization (grounding, reasoning, learning)
- (2023-08) SAPIEN: Affective Virtual Agents Powered by Large Language Models (grounding, reasoning)
- (2023-08) Synergistic Integration of Large Language Models and Cognitive Architectures for Robust AI: An Exploratory Analysis (grounding, reasoning, retrieval, learning)
- (2023-09) ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving (grounding, reasoning, learning)
- (2023-09) Identifying the Risks of LM Agents with an LM-Emulated Sandbox (environment)
- (2023-09) Suspicion Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 (grounding, reasoning)
- (2024-01) Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives (reasoning, reflection)
- (2024-02) Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization (reasoning, reflection, learning)
- (2024-03) LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning (planning, reasoning)
- (2024-04) Empowering Biomedical Discovery with AI Agents (AI scientist, biomedical research)
- (2024-05) TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models (reasoning, retrieval)
(More to be added soon. Pull requests welcome.)
Resources
(More to be added soon. Pull requests welcome.)