ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models

This repo provides the source code and data for our paper: ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models (arXiv 2023).

@InProceedings{Chen-ChatCot-2023,
      title = {ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models}, 
      author = {Zhipeng Chen and Kun Zhou and Beichen Zhang and Zheng Gong and Wayne Xin Zhao and Ji-Rong Wen},
      year = {2023},
      eprint = {2305.14323},
      archivePrefix = {arXiv},
      primaryClass = {cs.CL}
}

Data & Code

The hotpotqa/ folder is organized similarly to the math/ folder.

Usage

Prepare

Use the following commands to clone the repository and install the required Python packages with pip:

git clone https://github.com/RUCAIBox/ChatCoT.git
cd ChatCoT
pip install -r requirements.txt

Inference

You can run ChatCoT on a sub-task of the MATH dataset with run_turbo_chatcot.sh:

cd math
bash scripts/run_turbo_chatcot.sh

Replace YOUR_API_KEY in the code with your OpenAI API key. Note that ChatCoT runs with multiprocessing, so you should prepare a list of API keys for the code to run correctly.
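
One way to pair a list of API keys with parallel workers is sketched below. This is a minimal illustration, not the repository's actual code: `API_KEYS`, `assign_keys`, `worker`, and `run_parallel` are hypothetical names, the real scripts may distribute keys differently, and no request is sent to any API here.

```python
from itertools import cycle
from multiprocessing import Pool

# Hypothetical placeholder keys; replace with your own OpenAI API keys.
API_KEYS = ["YOUR_API_KEY_1", "YOUR_API_KEY_2"]


def assign_keys(problems, keys):
    """Pair each problem with an API key, cycling through the keys round-robin."""
    if not keys:
        raise ValueError("at least one API key is required")
    return list(zip(problems, cycle(keys)))


def worker(task):
    """Stand-in for one inference call (assumption: one problem per task)."""
    problem, key = task
    # A real worker would authenticate with `key` and query the chat model here.
    # This sketch only returns a marker string and makes no network request.
    return f"processed {problem}"


def run_parallel(problems, keys):
    """Run the worker over all problems, using one process per API key."""
    tasks = assign_keys(problems, keys)
    with Pool(processes=len(keys)) as pool:
        return pool.map(worker, tasks)
```

Call `run_parallel(["p0", "p1", "p2"], API_KEYS)` from under an `if __name__ == "__main__":` guard, which multiprocessing requires on platforms that spawn new interpreter processes. Using one process per key keeps the number of concurrent requests per key small, which is a design assumption of this sketch rather than a documented requirement.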

Evaluate

You can evaluate the results by running eval.sh:

cd math
bash scripts/eval.sh

Results

Main Results

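| Methods | Algebra | CP | PC | PA | Geometry | IA | NT |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CoT | 48.10 | 31.43 | 21.06 | 56.60 | 22.34 | 18.27 | 29.07 |
| CoT w/ Tool | 35.89 | 22.57 | 9.34 | 40.53 | 13.57 | 9.41 | 19.44 |
| CoT w/ Retri | <u>52.74</u> | <u>32.70</u> | <u>18.86</u> | <u>58.44</u> | <u>29.23</u> | 19.93 | <u>31.67</u> |
| ChatCoT | 56.11 | 34.18 | 23.81 | 59.24 | 29.85 | <u>19.49</u> | 32.59 |

| Methods | HotpotQA |
| --- | --- |
| CoT | 37.99 |
| CoT w/ Tool | 31.42 |
| ChatCoT w/o Feedback | <u>53.79</u> |
| ChatCoT | 59.16 |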

Ablation Study

| Methods | PC | Geo | NT |
| --- | --- | --- | --- |
| ChatCoT | 23.81 | 29.85 | 32.59 |
| ChatCoT w/o TK | 23.26 | 29.23 | 30.56 |
| ChatCoT w/o RATK | 19.96 | 27.35 | 30.93 |
| ChatCoT w/o MRF | 21.61 | 24.22 | 32.22 |

Results of the ablation study. TK, RATK, and MRF denote the tool knowledge, retrieval-augmented task knowledge, and multi-turn reasoning format provided in the early turns of the conversation, respectively; each "w/o" row removes one of them.

Combining CoT Improvement Strategies

| Methods | CP | NT |
| --- | --- | --- |
| CoT | 31.43 | 29.07 |
| CoT + SC | 35.23 | 34.44 |
| ChatCoT | 34.18 | 32.59 |
| ChatCoT + SC | 40.08 | 38.33 |
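
SC here refers to self-consistency, which samples several reasoning paths for the same question and keeps the most frequent final answer. The voting step can be sketched as follows, assuming the final answers have already been extracted as strings. `majority_vote` is a hypothetical helper for illustration, not a function from this repository, and the sampled answers are made up.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Counter keeps insertion order, and most_common lists equal counts
    # in first-seen order, so the earliest answer wins a tie.
    return Counter(answers).most_common(1)[0][0]


# Five hypothetical final answers sampled for one question.
samples = ["12", "15", "12", "12", "9"]
print(majority_vote(samples))  # prints 12
```

In practice the answers would usually be normalized (for example, stripping whitespace or unifying number formats) before voting, so that equivalent answers are counted together.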