Cumulative Reasoning With Large Language Models

Homepage: https://cumulative-reasoning.github.io.

Introduction

Official implementation of paper "Cumulative Reasoning with Large Language Models" (https://arxiv.org/abs/2308.04371).

Installation

```bash
conda create -n cr python=3.10
conda activate cr
pip install -r requirements.txt
```

For more usage help, please refer to the README.md in each subdirectory.

CR Agent: Solving MATH Problems with Code Environment

Please see the ./CR-Agent folder for the output logs and prompts on the MATH dataset. We have released the code for CR Agent v0.1 (a minimalist implementation based on ToRA).

Experimental Results

In this experiment, we employed GPT-4-1106-preview with a Python code environment and no additional tools such as external memory or retrieval systems. The setup was deliberately minimalist: a single reasoning context session, managed by simply accumulating and concatenating the context string, and a single LLM with no separate verifier LLM. The implementation uses plain Python strings throughout, without specialized frameworks such as LangChain or Guidance.
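The single-session setup described above can be sketched in a few lines. This is an illustrative reduction, not the repository's actual code: `llm` is a placeholder callable (prompt in, text out), and `toy_llm` is a hypothetical stand-in used only to make the sketch runnable.

```python
def cumulative_reasoning(question, llm, max_steps=4):
    """Minimalist CR loop: one context string, grown by concatenation.

    No verifier LLM, no external memory -- the accumulated context
    string itself is the only state, as in the setup described above.
    """
    context = f"Problem: {question}\n"
    for _ in range(max_steps):
        step = llm(context + "Propose the next reasoning step or the final answer:")
        context += step + "\n"        # accumulate the new step into the context
        if "Final answer:" in step:   # stop once the model commits to an answer
            break
    return context

# Hypothetical toy model, for illustration only: answers 2 + 2 in two steps.
def toy_llm(prompt):
    if "4" not in prompt:
        return "Step: 2 + 2 equals 4."
    return "Final answer: 4"

print(cumulative_reasoning("What is 2 + 2?", toy_llm))
```

In the real implementation the placeholder `llm` is a call to GPT-4-1106-preview, and each step may also execute Python code in the sandboxed environment.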

The outcomes of this experimental setup revealed noteworthy results:

Category-wise Scores

| Method | Algebra | Counting & Probability | Geometry | Intermediate Algebra | Number Theory | Prealgebra | Precalculus |
|---|---|---|---|---|---|---|---|
| PAL (PoT) | 65.3 | 57.9 | 31.7 | 30.9 | 66.1 | 73.2 | 23.2 |
| ToRA | 71.8 | 68.4 | 48.8 | 49.5 | 66.1 | 67.1 | 44.6 |
| CR Agent | **86.3** | **71.1** | **53.7** | **51.5** | **88.7** | **86.6** | **51.8** |

Difficulty Level Scores

| Method | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 |
|---|---|---|---|---|---|
| PAL (PoT) | 88.4 | 65.6 | 60.0 | 45.3 | 31.3 |
| ToRA | 74.4 | 75.6 | 69.5 | 53.9 | 46.3 |
| CR Agent | **90.7** | **90.0** | **81.9** | **66.4** | **52.2** |

CR Agent achieves the best score in every category and at every difficulty level, clearly indicating its superiority in this experimental setup.

These tables provide a comprehensive view of the performance of each method across various categories and difficulty levels in the MATH dataset. The CR Agent shows marked improvements in most categories and levels, illustrating its robustness and effectiveness in solving complex mathematical problems, even within the constraints of a simplified experimental setup.

CR Agent Assistant v0.1 based on Meta Prompting

See ./CR-Agent-Assistant/cr-agent-assistant-v0.1.md for a minimalist implementation based on the OpenAI Assistant API.

See https://chat.openai.com/g/g-L3a4ZCIHx-cr-agent-v0-1 for an online demo.

Meta Prompting (General Definition): Meta Prompting is a prompting technique inspired by type theory, emphasizing the structure and syntax of examples rather than their detailed content. It's an approach where the focus is on presenting the outline or framework of a problem or topic, offering a scaffold that can be filled with specific details as needed. This technique is particularly useful in situations where understanding the form and pattern of a problem or solution is more crucial than the specific content.
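As a concrete illustration of this structure-over-content idea, a meta prompt can be written as a reusable skeleton that fixes the sections and output syntax while leaving the content slots open. The template below is a hypothetical example, not a prompt from the paper:

```python
# A meta prompt specifies the *shape* of the reasoning, not worked examples.
# Both placeholders ({domain}, {problem}) are filled per task.
META_PROMPT_TEMPLATE = """\
You are solving a {domain} problem.
Follow this structure exactly:
1. Restate the problem in your own words.
2. List the key definitions and constraints.
3. Derive the solution step by step.
4. Final Answer: <answer>

Problem: {problem}
"""

prompt = META_PROMPT_TEMPLATE.format(
    domain="number theory",
    problem="Find the last digit of 7^2024.",
)
print(prompt)
```

The same skeleton works for any problem in the domain, which is what distinguishes meta prompting from few-shot prompting with content-specific exemplars.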

Revisiting Game of 24

We have implemented the CR Agent using pure Meta Prompting: the AI Agent directly writes a Python program to solve the Game of 24 and processes all samples in one response, making it roughly n times faster than previous methods that handle the n samples one at a time. Please see https://github.com/meta-prompting/meta-prompting for details.

<center>

MP-CR-Agent-XML v0.2 Success Rate: 100%, Time usage: 0.08s per sample.

</center>
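The kind of program the agent writes can be sketched as a brute-force search over operand orders, operators, and parenthesizations. This is an illustrative solver under our own assumptions, not the agent's generated code:

```python
from itertools import permutations, product

def solve24(nums, target=24, eps=1e-6):
    """Search for an arithmetic expression over the four numbers equal to 24."""
    ops = ["+", "-", "*", "/"]
    for a, b, c, d in permutations(map(float, nums)):
        for o1, o2, o3 in product(ops, repeat=3):
            # The five parenthesizations cover every binary-tree shape
            # over four operands.
            exprs = [
                f"(({a}{o1}{b}){o2}{c}){o3}{d}",
                f"({a}{o1}({b}{o2}{c})){o3}{d}",
                f"({a}{o1}{b}){o2}({c}{o3}{d})",
                f"{a}{o1}(({b}{o2}{c}){o3}{d})",
                f"{a}{o1}({b}{o2}({c}{o3}{d}))",
            ]
            for expr in exprs:
                try:
                    if abs(eval(expr) - target) < eps:
                        return expr
                except ZeroDivisionError:
                    continue
    return None

print(solve24([1, 2, 3, 4]))  # e.g. a product reaching 24
```

Because one such program solves every instance, the agent can emit it once and process the whole benchmark in a single response, which is the source of the speedup above.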

Acknowledgement

This repo is mainly based on Guidance, HuggingFace, Tree of Thoughts and ToRA. Thanks for their wonderful work!

Citations

Please cite the paper and star this repo if you use Cumulative Reasoning (CR) and find it interesting/useful, thanks! Feel free to contact zhangyif21@mails.tsinghua.edu.cn | yangjq21@mails.tsinghua.edu.cn or open an issue if you have any questions.

```bibtex
@article{zhang2023cumulative,
  title={Cumulative Reasoning With Large Language Models},
  author={Zhang, Yifan and Yang, Jingqin and Yuan, Yang and Yao, Andrew Chi-Chih},
  journal={arXiv preprint arXiv:2308.04371},
  year={2023}
}
```