AgentPro

🎆 [New 0517] Our paper has been accepted to the ACL 2024 Main Conference.

🎆 [New 0511] Agent-Pro was presented at the ICLR 2024 LLMAgents Workshop in Vienna.

🎆 [New 0326] Our work has been covered by Jiangmen Ventures (将门创投): https://mp.weixin.qq.com/s/gD4pZc6pvX8f_62uiPJacg

🎆 [New 0301] Agent-Pro has been accepted to the ICLR 2024 LLMAgents Workshop as a poster paper (https://llmagents.github.io/).

🎆 [New 0227] Our work has been covered by QbitAI (量子位): QbitAI Article.

<video src="https://github.com/zwq2018/Agent-Pro/assets/44236100/95fcde18-ffbb-48da-8917-81e1af74b0c3" width="640" height="480" autoplay loop></video>

AgentPro, built upon RLCard, connects seamlessly to large language models such as GPT, LLaMA, and Qwen. These interfaces combine RLCard's game environments with strong language models, enabling applications that bridge natural language processing and reinforcement learning.

See our paper: Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization, Wenqi Zhang, Ke Tang, Hai Wu, Mengna Wang, Yongliang Shen, Guiyang Hou, Zeqi Tan, Peng Li, Yueting Zhuang, Weiming Lu

Installation

Ensure that you have Python 3.6+ and pip installed. Additionally, confirm that your Python environment includes the PyTorch, OpenAI, and RLCard libraries before installing AgentPro.
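If you want to check your environment first, a quick sanity check such as the following (a minimal sketch, not part of AgentPro) verifies the Python version and that the required libraries can be imported:

# Minimal environment check (illustrative only).
import sys

assert sys.version_info >= (3, 6), "AgentPro requires Python 3.6+"

for pkg in ("torch", "openai", "rlcard"):
    try:
        __import__(pkg)
        print(f"{pkg}: OK")
    except ImportError:
        print(f"{pkg}: missing -- install it before installing AgentPro")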

1. Install PyTorch

You can follow the official PyTorch installation guide, or choose your preferred version and install it yourself:

pip3 install torch

2. Install RLCard

You can visit the official RLCard website at https://github.com/datamllab/rlcard to access RLCard-related files and find more information about the library.

The installation command is the same as on the official website:

pip3 install rlcard

3. Install AgentPro

First, clone the code from GitHub:

git clone https://github.com/zwq2018/Agent-Pro.git

Then install it with:

cd Agent-Pro
pip3 install .
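To verify that the installation succeeded, you can try importing the package (the package name AgentPro matches the imports used in the examples below):

python3 -c "import AgentPro"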

Available Environments

RLCard provides a complexity estimate for each game along several dimensions. InfoSet Number: the number of information sets; InfoSet Size: the average number of states in a single information set; Action Size: the size of the action space. Name: the string that should be passed to rlcard.make to create the game environment. RLCard also links to the documentation and a random-agent example for each game; a minimal sketch follows the table below.

Game                         InfoSet Number   InfoSet Size   Action Size   Name
Blackjack (wiki)             10^3             10^1           10^0          blackjack
Limit Texas Hold'em (wiki)   10^14            10^3           10^0          limit-holdem
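As a quick way to try one of these environments, the following minimal sketch (adapted from RLCard's own random-agent example, not specific to AgentPro) creates an environment by its Name and plays one episode with random agents:

import rlcard
from rlcard.agents import RandomAgent

# Create the environment by the Name listed in the table above.
env = rlcard.make("blackjack")

# One built-in random agent per player.
env.set_agents([RandomAgent(num_actions=env.num_actions)
                for _ in range(env.num_players)])

# Play one episode and print the payoffs.
trajectories, payoffs = env.run(is_training=False)
print(payoffs)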

Code Run Instructions

Blackjack

If you intend to reproduce the Blackjack results from the paper, please use the following code snippet:

from play_blackjack_game import play

if __name__ == "__main__":
    number_of_game = 2                           # number of hands to play
    model = 'Qwen'                               # which LLM to use
    game_style = 'ReAct'                         # prompting / agent style
    storage_name = "Qwen Play ReAct Blackjack"   # name under which results are stored

    play(number_of_game, model, game_style, storage_name)

Before running, you also need to fill in your API key in the YOUR KEY field of API.py. Taking GPT-4 as an example, you can adjust the model parameters here:

import openai

class GPT4API:
    def __init__(self) -> None:
        openai.api_key = "YOUR KEY"  # fill in your OpenAI API key

    def response(self, mes):
        # mes: a list of chat messages in the OpenAI format
        response = openai.ChatCompletion.create(
            model='MODEL NAME',      # e.g. "gpt-4"
            messages=mes,
            top_p=0.95,
            temperature=1,
        )
        return response.get("choices")[0]["message"]["content"]
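As a rough usage sketch (the message content here is purely illustrative), the class above is called with a list of chat messages in the OpenAI format:

api = GPT4API()
messages = [
    {"role": "system", "content": "You are a Blackjack player."},
    {"role": "user", "content": "The dealer shows a 10 and you hold 16. Hit or stand?"},
]
print(api.response(messages))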

Limit Texas Hold'em

If you intend to reproduce the Limit Texas Hold'em results from the paper, please use the following code snippet:

from AgentPro import reproduce

self_model = "I should be radical."   # self-model prompt given to the agent
mode = 0
key = ""                              # your API key
reproduce(self_model, mode, key)

To integrate an LLM into a custom game, you need to create an LLM_Agent. Here is an example:

from AgentPro import LimitTexasHoldemAgent

index_player = 3               # player index assigned to the LLM agent
LLM_model = "gpt-3.5-turbo"
key = ""                       # your API key
config = {                     # toggles for the agent's reasoning components
    "is_self_model": True,
    "is_believe": True,
    "is_analogy": True,
    "is_summarize": True
}
LLM_agent = LimitTexasHoldemAgent(index_player=index_player,
                                  LLM_model=LLM_model,
                                  key=key,
                                  config=config)
LLM_agent.init_self_model("I should be radical.")

After creating all the agents, you'll need to set up a Limit Texas Hold'em game environment using a method similar to the following:

import rlcard
from rlcard.agents import RandomAgent
from AgentPro import AgentEnv, LimitTexasHoldemAgent

version = "test"                  # run label passed to AgentEnv
num_players = 3
index_player = 2                  # seat of the LLM agent in the agents list below
LLM_model = "gpt-3.5-turbo"
key = ""                          # your API key
config = {
    "is_self_model": True,
    "is_believe": True,
    "is_analogy": True,
    "is_summarize": True
}
self_model = "I should be radical."

# Create the RLCard Limit Texas Hold'em environment.
env = rlcard.make("limit-holdem", config={
    "game_num_players": num_players,
})

# AgentEnv wraps the agents and manages the game records.
game = AgentEnv(version)
LLM_agent = LimitTexasHoldemAgent(index_player=index_player,
                                  LLM_model=LLM_model,
                                  key=key,
                                  config=config)
LLM_agent.init_self_model(self_model)
random_agent = RandomAgent(num_actions=env.num_actions)

agents = [random_agent, random_agent, LLM_agent]
game.init(agents)

env.set_agents([game] * len(agents))

# Play one hand.
t, p = env.run(is_training=False)
t, p = game.reorder_tp(t, p)      # reorder trajectories and payoffs

game.save_result(p)               # record the payoffs
game.update_card(t, p)

gi = game.generate_game_result(t, p)

game.summarize(gi)                # summarize the finished hand
game.save_game_result(gi)
game.reset_game()                 # reset before the next hand
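To play several hands in a row, you could repeat the run/summarize/reset cycle above, for example (a sketch reusing the same names as in the example):

for _ in range(10):  # the number of hands here is arbitrary
    t, p = env.run(is_training=False)
    t, p = game.reorder_tp(t, p)
    game.save_result(p)
    game.update_card(t, p)
    gi = game.generate_game_result(t, p)
    game.summarize(gi)
    game.save_game_result(gi)
    game.reset_game()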

Library Structure

The main modules, as they appear in the examples above, are:

play_blackjack_game.py: entry point for the Blackjack experiments (the play function).
API.py: wrappers around the LLM APIs (e.g. GPT4API), where you fill in your key.
AgentPro: the agent package, providing LimitTexasHoldemAgent, AgentEnv, and reproduce for Limit Texas Hold'em.

Citation

@inproceedings{zhang-etal-2024-agent,
    title = "Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization",
    author = "Zhang, Wenqi  and
      Tang, Ke  and
      Wu, Hai  and
      Wang, Mengna  and
      Shen, Yongliang  and
      Hou, Guiyang  and
      Tan, Zeqi  and
      Li, Peng  and
      Zhuang, Yueting  and
      Lu, Weiming",
    booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.acl-long.292",
    pages = "5348--5375"
}
@inproceedings{zhang-etal-2024-self-contrast,
    title = "Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives",
    author = "Zhang, Wenqi  and
      Shen, Yongliang  and
      Wu, Linjuan  and
      Peng, Qiuying  and
      Wang, Jun  and
      Zhuang, Yueting  and
      Lu, Weiming",
    booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.acl-long.197",
    pages = "3602--3622"
}