<div align="center"> <img src="imgs/logo2.png" alt="Logo" width="200"> </div> <h2 align="center">⚖️ MAD: Multi-Agent Debate</h2>

:fire: This work explores the debating capability of LLMs by proposing the MAD framework, which stands for Multi-Agent Debate.

"Truth emerges from the clash of adverse ideas."<br> "真理越辩越明。"

<!-- "Good Luck!" -- wxjiao ---> <!-- "Good Luck!" -- zwhe99 ---> <!-- "Good Luck!" -- xing --->

Brief Introduction

The cognitive behavior of large language models (LLMs) has attracted considerable attention recently. For example, self-reflection, a concept that usually refers to a person's introspection and examination of their own thoughts, has also proven effective when applied to LLMs on challenging NLP tasks. However, we point out that self-reflection can easily fall into the degeneration-of-thought (DoT) issue in the following scenarios: (1) the model's perception of the problem is biased or distorted; (2) once an answer is formed, the model is rigid and resistant to change; and (3) a single model reflecting on itself receives no external feedback.

<div align="center"> <img width="45%" alt="MAD" src="imgs/image.png" /> <p class="image-caption">Figure 1: Comparison between debate and reflection.</p> </div>

In this project, we embark on a journey to explore the potential of a debating interaction framework among LLMs. With MAD, the agents' tit-for-tat interaction means that (1) the distorted thinking of one agent can be corrected by the other :grinning:; (2) the resistance to change of one agent will be complemented by the other :smile:; and (3) each agent provides external feedback for the other :laughing:.

As a result, MAD is less likely to suffer from the DoT issue and can exploit more of the potential of LLMs. Experiments show that MAD brings significant and consistent improvements on the Counterintuitive QA and Commonsense MT tasks.

JOIN US on this journey of exploring the interaction and debating capabilities of LLMs. :rocket::rocket::rocket:

Framework

<div align="center"> <img width="90%" alt="MAD" src="imgs/framework.png" /> <p class="image-caption">Figure 2: Framework of Multi-Agent Debate. Here we designate the devil (<img src="imgs/devil.png" width="25" />) as the affirmative side while the angel (<img src="imgs/angel.png" width="25" />) as the negative side. We want the angel to correct the devil’s mistakes..</p> </div>

Run

Preparation

pip3 install -r requirements.txt

Run MAD

sh debate4tran.sh 

Run Interactive

If you just want to give it a quick try, you can run the interactive script on your PC.

python3 interactive.py

Or simply try our demo for translation here.

Main Results

Counterintuitive QA

<div align="center"> <img width="35%" alt="CounterintuitiveQA" src="imgs/CounterintuitiveQA.png" /> <p class="image-caption">Table 1: Reasoning accuracy on Counter-Intuitive AR.</p> </div>
Case 1

When Alice walks up the hill, her speed is 1 m/s, and when she goes down the hill, her speed is 3 m/s. When Alice walks up and then down the hill, what is her average speed? (1.5 m/s)
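The intuitive but wrong answer is the arithmetic mean, 2 m/s. The correct average speed is total distance over total time, i.e. the harmonic mean of the two speeds (with one-way distance $d$):

$$
\bar{v} \;=\; \frac{\text{total distance}}{\text{total time}} \;=\; \frac{2d}{\dfrac{d}{1} + \dfrac{d}{3}} \;=\; \frac{2d}{\tfrac{4}{3}d} \;=\; 1.5\ \mathrm{m/s}
$$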

MAD
<div align="center"> <img width="40%" alt="MAD" src="https://github.com/Skytliang/Multi-Agents-Debate/blob/main/imgs/mad_qa_case1.gif" /> <p class="image-caption">Figure 3: An Animation to Show the Process of MAD.</p> </div> <details> <summary><b>Debate process</b></summary> </details> <details> <summary><b>Case 2</b></summary> We have 1 ton apples which contain 90% water. After a few days, those apples only have 80% water. What is the weight of those apples now? (0.5ton)
MAD
</details>
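A quick check of Case 2 above: the dry matter in the apples is fixed at 0.1 ton (10% of 1 ton). After a few days it accounts for 20% of the new weight $w$, so

$$
0.1\ \mathrm{ton} = 0.2\,w \quad\Longrightarrow\quad w = 0.5\ \mathrm{ton}.
$$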

Commonsense Machine Translation

<div align="center"> <img width="50%" alt="CommonMT" src="imgs/CommonMT.png" /> <p class="image-caption">Table 2: Translation performance on Common MT.</p> </div>
Case 1

Given the Chinese sentence "吃掉敌人一个师。", please provide its translation in English.

MAD
<p align="center"> <img src="imgs/translation-case1.png" width="450" /> </p> <details> <summary><b>Case 2</b></summary> Given the Chinese sentence "他从后门搞到了不少名酒。", please provide its translation in English.
MAD
<p align="center"> <img src="imgs/translation-case2.png" width="750" /> </p> </details>


Citation

@article{liang2023encouraging,
  title={Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate},
  author={Liang, Tian and He, Zhiwei and Jiao, Wenxiang and Wang, Xing and Wang, Yan and Wang, Rui and Yang, Yujiu and Tu, Zhaopeng and Shi, Shuming},
  journal={arXiv preprint arXiv:2305.19118},
  year={2023}
}