Home

Awesome

<div align="center"> <img src="pics/logo1.png" style="width: 20%;height: 10%"> <h1> AgentSquare: Automatic LLM Agent Search In Modular Design Space </h1> </div> <div align="center">

Code License Python 3.8+

</div> <div align="center"> <!-- <a href="#model">Model</a> โ€ข --> ๐ŸŒ <a href="https://tsinghua-fib-lab.github.io/AgentSquare_website">Website</a> | ๐Ÿ“ƒ <a href="https://arxiv.org/abs/2410.06153">Paper</a> | </div>

AgentSquare

The official implementation for paper AgentSquare: Automatic LLM Agent Search in Modular Design Space with code, prompts and results.

<p float="left"> <img src="pics/demo-v2.gif"> </p>

intro

๐ŸŽ‰ News

๐ŸŒŽ Setup

  1. Set up OpenAI API key and store in environment.
export OPENAI_API_KEY=<YOUR KEY HERE>
  1. Install dependencies
git clone https://github.com/tsinghua-fib-lab/AgentSquare.git
conda create -n agentsquare python=3.9.12
conda activate agentsquare
cd AgentSquare
pip install -r requirements.txt

๐Ÿš€ Quick Start: Demo with ALFWorld

https://github.com/user-attachments/assets/23090869-8c60-4ee8-98ec-75dd6f4255a0

An exemplar script combining different agent modules to solve the task of ALFworld:

export ALFWORLD_DATA=<Your path>/AgentSquare/tasks/alfworld
cd tasks/alfworld
sh run.sh or 
python3 alfworld_run.py \
    --planning deps\
    --reasoning cot\
    --tooluse none\
    --memory dilu\
    --model gpt-3.5-turbo-0125 \

๐Ÿ”Ž Run Other Tasks

Install dependencies

cd tasks
pip install -r requirements.txt
<details> <summary> Webshop </summary>

Install webshop environment following instructions here and launch the WebShop webpage.

cd tasks/webshop
sh run.sh
</details> <details> <summary> M3Tooleval </summary>
cd tasks/m3tooleval
sh run.sh
</details> <details> <summary> Sciworld </summary>

Install Sciworld environment following instructions here .

cd tasks/sciworld/agentboard
python3 eval_main_sci.py \
    --cfg-path ../eval_configs/main_results_all_tasks.yaml     --tasks scienceworld     --wandb     --log_path ../results/gpt-4o-2024-08-06    --project_name evaluate-gpt-4o-2024-08-06     --baseline_dir ../data/baseline_results \
    --model gpt-4o-2024-08-06 \
    --planning none \
    --reasoning cot \
    --tooluse none \
    --memory none \
</details>

๐ŸŒŸ Modular Design Challenge

We kindly invite you to participate in the modular design challenge by standardizing your LLM agents with our recommended I/O interfaces. Let's work together to offer a platform for fully exploiting the potential of successful agent designs and consolidating the collective efforts of LLM agent research community!

Contribute New Modules

For guidance on standardizing the I/O interfaces of the four types of agent modules, please refer to module pools, which provides some existing modules, along with a complete interface description available in module interface description. Click here for a detailed procedure. You can submit your standardized modules through this link. The .py file format is preferred, examples can be seen in the modules folder. We will check your submission timely, once approved we will cite and acknowledge your works in this repository.

๐Ÿ’ก How to Add Your Own Task

You can refer to the workflow.py to integrate it with your encapsulated tasks, just like in tasks/alfworld.

Citations

Please considering citing our paper and staring this repo if you use AgentSquare and find it useful, thanks! Feel free to contact fenglixu@tsinghua.edu.cn or open an issue if you have any question.

@article{shang2024agentsquare,
  title={AgentSquare: Automatic LLM Agent Search in Modular Design Space},
  author={Shang, Yu and Li, Yu and Zhao, Keyu and Ma, Likai and Liu, Jiahe and Xu, Fengli and Li, Yong},
  journal={arXiv preprint arXiv:2410.06153},
  year={2024}
}