<div align="center"> <img src="pics/logo1.png" style="width: 20%;height: 10%"> <h1> AgentSquare: Automatic LLM Agent Search in Modular Design Space </h1> </div> <div align="center"></div> <div align="center"> <!-- <a href="#model">Model</a> • --> <a href="https://tsinghua-fib-lab.github.io/AgentSquare_website">Website</a> | <a href="https://arxiv.org/abs/2410.06153">Paper</a> </div>
AgentSquare
The official implementation of the paper "AgentSquare: Automatic LLM Agent Search in Modular Design Space", with code, prompts, and results.
<p float="left"> <img src="pics/demo-v2.gif"> </p>
News
- [2024.11.07] 🔥 Added demos of AgentSquare.
- [2024.10.10] 🔥 Released the source code and our newly searched modules.
- [2024.10.08] 🔥 Released the full paper "AgentSquare: Automatic LLM Agent Search in Modular Design Space"!
Setup
- Set your OpenAI API key as an environment variable.
export OPENAI_API_KEY=<YOUR KEY HERE>
- Install dependencies
git clone https://github.com/tsinghua-fib-lab/AgentSquare.git
conda create -n agentsquare python=3.9.12
conda activate agentsquare
cd AgentSquare
pip install -r requirements.txt
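Before running any task, it can help to confirm that the API key is actually visible to Python. The snippet below is a minimal sketch and not part of the repository; the helper name `missing_env_vars` is hypothetical.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `missing_env_vars` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Check the variable exported in the setup step above.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Environment looks ready.")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.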
Quick Start: Demo with ALFWorld
https://github.com/user-attachments/assets/23090869-8c60-4ee8-98ec-75dd6f4255a0
An example script that combines different agent modules to solve ALFWorld tasks:
export ALFWORLD_DATA=<Your path>/AgentSquare/tasks/alfworld
cd tasks/alfworld
sh run.sh
or
python3 alfworld_run.py \
    --planning deps \
    --reasoning cot \
    --tooluse none \
    --memory dilu \
    --model gpt-3.5-turbo-0125
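Each of the four module flags in the command above selects one component of the agent, and `none` disables that component. The following is a minimal sketch of how a runner script could accept the same flags with `argparse`. It is not the repository's `alfworld_run.py`; the function names and the default values are assumptions made for illustration.

```python
import argparse

# The four module slots named by the flags in the command above.
MODULE_SLOTS = ("planning", "reasoning", "tooluse", "memory")


def build_parser():
    """Build a parser for a hypothetical runner (illustration only)."""
    parser = argparse.ArgumentParser(description="Hypothetical agent runner")
    for slot in MODULE_SLOTS:
        # Defaulting to "none" is an assumption, not the repository's behavior.
        parser.add_argument(f"--{slot}", default="none",
                            help=f"name of the {slot} module, or 'none'")
    # The default model here is also an assumption.
    parser.add_argument("--model", default="gpt-3.5-turbo-0125",
                        help="name of the LLM backend")
    return parser


def parse_agent_config(argv):
    """Parse a list of command-line arguments into a plain dict."""
    return vars(build_parser().parse_args(argv))


config = parse_agent_config([
    "--planning", "deps",
    "--reasoning", "cot",
    "--tooluse", "none",
    "--memory", "dilu",
])
print(config)
```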
Run Other Tasks
Install dependencies
cd tasks
pip install -r requirements.txt
<details>
<summary> Webshop </summary>
Install the WebShop environment following the instructions here and launch the WebShop webpage.
cd tasks/webshop
sh run.sh
</details>
<details>
<summary> M3Tooleval </summary>
cd tasks/m3tooleval
sh run.sh
</details>
<details>
<summary> Sciworld </summary>
Install the Sciworld environment following the instructions here.
cd tasks/sciworld/agentboard
python3 eval_main_sci.py \
    --cfg-path ../eval_configs/main_results_all_tasks.yaml \
    --tasks scienceworld \
    --wandb \
    --log_path ../results/gpt-4o-2024-08-06 \
    --project_name evaluate-gpt-4o-2024-08-06 \
    --baseline_dir ../data/baseline_results \
    --model gpt-4o-2024-08-06 \
    --planning none \
    --reasoning cot \
    --tooluse none \
    --memory none
</details>
Modular Design Challenge
We kindly invite you to participate in the modular design challenge by standardizing your LLM agents with our recommended I/O interfaces. Let's work together to build a platform that fully exploits the potential of successful agent designs and consolidates the collective efforts of the LLM agent research community!
Contribute New Modules
For guidance on standardizing the I/O interfaces of the four types of agent modules, please refer to module pools, which provides some existing modules, and to module interface description for a complete description of the interfaces. Click here for a detailed procedure. You can submit your standardized modules through this link. The .py file format is preferred; examples can be found in the modules folder. We will review submissions promptly and, once a submission is approved, cite and acknowledge your work in this repository.
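One way to see why standardized interfaces matter: once every module of a given type is interchangeable, the set of candidate agents is the Cartesian product of the module pools. The sketch below illustrates this under stated assumptions. The pool contents reuse only the module names that appear in the Quick Start command (`deps`, `cot`, `dilu`, `none`), and `POOLS` and `enumerate_designs` are hypothetical names, not the repository's API.

```python
from itertools import product

# Hypothetical pools, limited to module names that appear in this README.
# The real module pools in the repository are larger.
POOLS = {
    "planning": ["none", "deps"],
    "reasoning": ["cot"],
    "tooluse": ["none"],
    "memory": ["none", "dilu"],
}


def enumerate_designs(pools):
    """List every combination that picks one module name per slot."""
    slots = list(pools)
    return [dict(zip(slots, combo))
            for combo in product(*(pools[slot] for slot in slots))]


designs = enumerate_designs(POOLS)
print(len(designs))  # 2 * 1 * 1 * 2 = 4
for design in designs:
    print(design)
```

Under this view, each newly contributed module multiplies the number of agent designs that can be composed from the existing pools.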
💡 How to Add Your Own Task
You can refer to workflow.py to integrate it with your own encapsulated tasks, as is done in tasks/alfworld.
Citations
Please consider citing our paper and starring this repo if you use AgentSquare and find it useful. Thanks! Feel free to contact fenglixu@tsinghua.edu.cn or open an issue if you have any questions.
@article{shang2024agentsquare,
title={AgentSquare: Automatic LLM Agent Search in Modular Design Space},
author={Shang, Yu and Li, Yu and Zhao, Keyu and Ma, Likai and Liu, Jiahe and Xu, Fengli and Li, Yong},
journal={arXiv preprint arXiv:2410.06153},
year={2024}
}