<div align="center"> <img src="imgs/logo.jpeg" alt="Logo" width="200"> </div> <h2 align="center">:video_game: SpyGame: An Interactive Multi-Agent Framework</h2>

Implementation of our paper:

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models

:ferris_wheel: Welcome! Feel free to try our demo <a href="https://cfce8ddf8ea07aa8d1.gradio.live/">here</a>.

:book: Overview

<p align="center"> <img src='imgs/spygame.jpg' width=600> </p>

:zap: Quickstart

To get started, follow these steps:

  1. Clone the GitHub Repository: Begin by cloning the repository using the command:

    git clone https://github.com/Skytliang/SpyGame.git
    
  2. Set Up Python Environment: SpyGame requires Python 3.9 or higher. You can create and activate a suitable environment with the following commands, replacing SpyGame_conda_env with your preferred environment name:

    conda create -n SpyGame_conda_env python=3.9
    conda activate SpyGame_conda_env
    
  3. Install Dependencies: Move into the SpyGame directory and install the necessary dependencies by running:

    cd SpyGame
    pip3 install -r requirements.txt
    
  4. Set OpenAI API Key: Add your own OpenAI API key to SpyGame/utils/gpt3_apikeys.json.

  5. Build Your Benchmark: Run SpyGame with the following command. The complete game record will be saved in SpyGame/benchmark/host_agent/guest_agent:

    sh run_spygame.sh
    
  6. Try our demo: If you just want to try SpyGame without installing anything, use the demo linked at the top of this page.
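
Before running step 5, it can help to confirm that steps 2 and 4 are in place. The snippet below is a minimal sketch, not part of the repository: `preflight` is a hypothetical helper name, and it only checks that the interpreter meets the 3.9 requirement and that utils/gpt3_apikeys.json exists and parses as JSON. It makes no assumption about the fields inside that file.

```python
import json
import sys
from pathlib import Path


def preflight(repo_root, min_version=(3, 9)):
    """Return a list of problems found; an empty list means the setup looks ready.

    `preflight` is a hypothetical helper, not something shipped with SpyGame.
    """
    problems = []

    # Step 2: the README asks for Python 3.9 or higher.
    if sys.version_info < min_version:
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted}+ required, found {sys.version.split()[0]}")

    # Step 4: the key file must exist and be valid JSON.
    # Its internal schema is not checked because it is not documented here.
    key_file = Path(repo_root) / "utils" / "gpt3_apikeys.json"
    if not key_file.is_file():
        problems.append(f"missing key file: {key_file}")
    else:
        try:
            json.loads(key_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{key_file} is not valid JSON: {exc}")

    return problems


if __name__ == "__main__":
    # Run from inside the cloned SpyGame directory.
    for problem in preflight("."):
        print("problem:", problem)
```

Run from inside the SpyGame directory, the script prints nothing when both checks pass.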

Case Study

Note: For the sake of fairness, we randomly shuffle the speaking order of all agents. Please refer to our paper for more details.

In this case, Player 1 <img src='imgs/number/1.png' height=20>, Player 2 <img src='imgs/number/2.png' height=20> and Player 4 <img src='imgs/number/4.png' height=20> are villager players <img src='imgs/angel.png' height=20> who share the keyword "BERT". Player 3 <img src='imgs/number/3.png' height=20> is the spy player <img src='imgs/devil.png' height=20> with the keyword "GPT". All four agents are GPT-4.
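
The setup of this case can be sketched in a few lines of Python. This is an illustrative sketch only, not the repository's implementation: `setup_game` and its parameters are hypothetical names, and the defaults simply mirror the case above (four players, Player 3 as the spy, keywords "BERT" and "GPT"). The speaking order is shuffled, as described in the note above.

```python
import random


def setup_game(num_players=4, spy=3, villager_word="BERT", spy_word="GPT", seed=None):
    """Assign a keyword to each player and draw a random speaking order.

    Hypothetical helper for illustration; the names and defaults are
    assumptions that mirror the case study, not the SpyGame API.
    """
    rng = random.Random(seed)
    players = list(range(1, num_players + 1))
    if spy not in players:
        raise ValueError(f"spy must be one of {players}")

    # Every villager gets the same keyword; the spy gets a different one.
    words = {p: (spy_word if p == spy else villager_word) for p in players}

    # Shuffle a copy so the speaking order does not depend on player number.
    speaking_order = players[:]
    rng.shuffle(speaking_order)
    return words, speaking_order


if __name__ == "__main__":
    words, order = setup_game()
    for player in order:
        print(f"Player {player} speaks with keyword {words[player]!r}")
```

Passing a fixed `seed` makes the speaking order reproducible, which is convenient when comparing runs.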

Speaking phase in the first round

Voting phase in the first round

Speaking phase in the second round

Voting phase in the second round

Citation

@misc{liang2023spygame,
      title={Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models}, 
      author={Tian Liang and Zhiwei He and Jen-tse Huang and Wenxuan Wang and Wenxiang Jiao and Rui Wang and Yujiu Yang and Zhaopeng Tu and Shuming Shi and Xing Wang},
      year={2023},
      eprint={2310.20499},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}