Progressive-Hint Prompting Improves Reasoning in Large Language Models
<div align="center"> <img src="resources/img.png"> </div> <p align="center"> Figure 1: Progressive-Hint Prompting (PHP) interacts with LLM. </p>
PHP: Simple and Effective for improving LLM reasoning ability.<br> Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li and Yu Li<br> Preprint Paper
This repository contains the official PyTorch implementation of PHP. PHP is a simple, effective, and powerful method, as shown in Figure 1. It can be easily combined with previous work such as Chain-of-Thought and Self-Consistency, since these methods are orthogonal to it.
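As a rough illustration of the interaction in Figure 1, the sketch below follows the procedure described in the paper: ask the question, then re-ask with previous answers appended as hints until two consecutive answers agree. The function name `progressive_hint_prompting`, the `ask_llm` callable, and the `max_rounds` cap are hypothetical names chosen for this example; the actual logic in `main_clean.py` may differ.

```python
from typing import Callable, List, Optional


def progressive_hint_prompting(
    question: str,
    ask_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns the extracted answer
    max_rounds: int = 10,           # assumption: a safety cap on the number of interactions
) -> Optional[str]:
    """Re-ask the question with earlier answers as hints until the answer is stable."""
    hints: List[str] = []
    previous: Optional[str] = None
    for _ in range(max_rounds):
        if hints:
            # Hint wording follows the example shown in the paper's figure.
            prompt = f"{question} (Hint: The answer is near to {', '.join(hints)})."
        else:
            prompt = question  # first round: base prompt without hints
        answer = ask_llm(prompt)
        if answer == previous:
            return answer  # two consecutive answers match: stop
        hints.append(answer)
        previous = answer
    return previous  # no convergence within max_rounds: return the last answer
```

Here `ask_llm` would wrap whatever model call and answer extraction you use; passing it in as a callable keeps the loop independent of a specific API.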
News
- [06/05/2023]: With PHP and Self-Consistency (K=40), we achieve a new SOTA on the GSM8K dataset: 96.5!
- [05/11/2023]: We updated the code and data for the MATH dataset.
- [05/02/2023]: We achieve SOTA performance of 53.9% on the MATH dataset, currently the most challenging reasoning dataset! We will update our paper and the code soon. Thank you for your attention!
- [04/25/2023]: We updated the dataset!
- [04/21/2023]: We uploaded the code!
Papers with Code Leaderboard
We achieve SOTA performance on the AQuA, SVAMP, GSM8K, and MATH datasets, as shown in the SVAMP Leaderboard, GSM8K Leaderboard, and MATH Leaderboard (Date: May 05, 2023).
<div align="center"> <img src="resources/leaderboard.png"> </div> <p align="center"> We achieve the SOTA performance on GSM8K dataset. </p>
<div align="center"> <img src="resources/leaderboard_math.png"> </div> <p align="center"> We achieve the SOTA performance on MATH dataset. </p>
Installation
pip install jsonlines
pip install openai
Usage
The code needs to be configured with your account's secret key, which is available on the website.
Set openai.api_key to its value:
import openai
openai.api_key = "sk-..."
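To avoid hard-coding the key in source files, one option is to read it from an environment variable. The sketch below is only a suggestion and is not part of this repository: the helper `load_api_key` is a hypothetical name, and `OPENAI_API_KEY` is assumed here as the variable name because it is the common convention.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the secret key stored in the given environment variable.

    Hypothetical helper for illustration; raises if the variable is unset or empty.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key


# Usage (requires the openai package from the Installation section):
# import openai
# openai.api_key = load_api_key()
```

With this approach the key is set once in the shell, for example with `export OPENAI_API_KEY="sk-..."`, and never appears in committed code.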
Run
We run main_clean.py with the following command:
python main_clean.py --prompt_dir [base prompt] --eng [openAI model] --seed [seed number] --hint [PHP prompt] --dataset [datasetname] --num_test -1 --q1 [ori: standard or CoT, complex: complex CoT] --q2 [ori: standard or CoT, complex: complex CoT] --sample [sample number] --temp [0.0 for greedy, 0.7 for sc]
Alternatively, use the script in the bash directory:
bash bash/cot.sh
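The `--sample` and `--temp` flags above relate to Self-Consistency (sc): several answers are sampled at a non-zero temperature and the most frequent one is kept. The sketch below shows only that voting step under simple assumptions; `majority_vote` is a hypothetical helper, not a function from this repository, and answers are compared as plain strings.

```python
from collections import Counter
from typing import List, Optional


def majority_vote(answers: List[str]) -> Optional[str]:
    """Return the most frequent non-empty answer, or None if there is none.

    Assumption: answers are already extracted from the model output and are
    compared as whitespace-stripped strings.
    """
    cleaned = [a.strip() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    # most_common(1) returns a list with one (answer, count) pair.
    return Counter(cleaned).most_common(1)[0][0]
```

With greedy decoding (`--temp 0.0`) a single sample is enough, so voting only matters when multiple samples are drawn.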
Result
We only show Complex CoT with GPT-3.5-Turbo and GPT-4 here. For more experimental results (such as text-davinci-002 and text-davinci-003), please refer to our paper.
<div align="center"> <img src="resources/table_8.png"> </div> <p align="center"> Table 8: Progressive-Hint Prompting (PHP) with GPT-3.5-Turbo and GPT-4. PHP works better when the model is more powerful. </p>
Citation
@article{zheng2023progressive,
title={Progressive-hint prompting improves reasoning in large language models},
author={Zheng, Chuanyang and Liu, Zhengying and Xie, Enze and Li, Zhenguo and Li, Yu},
journal={arXiv preprint arXiv:2304.09797},
year={2023}
}