
Progressive-Hint Prompting Improves Reasoning in Large Language Models

Chuanyang Zheng Zhengying Liu Enze Xie Zhenguo Li Yu Li


Abstract

The performance of Large Language Models (LLMs) on reasoning tasks depends heavily on prompt design, and Chain-of-Thought (CoT) and self-consistency are critical methods for improving it. However, these methods do not fully exploit the answers the LLM has already generated to guide its subsequent responses. This paper proposes a new prompting method, Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints that progressively guide the model toward the correct answer. PHP is orthogonal to CoT and self-consistency, so it is easy to combine with state-of-the-art techniques to further improve performance. We conducted extensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (89.1% -> 91.9%), GSM8K (92% -> 95.5%), AQuA (76.4% -> 79.9%) and MATH (50.3% -> 53.9%).
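The interaction loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `ask` callable stands in for any LLM call and is hypothetical, the hint wording and the stopping rule (stop once two consecutive answers agree) are assumptions made for this sketch, and `max_rounds` is an added safety cap.

```python
from typing import Callable, List, Tuple


def build_prompt(question: str, hints: List[str]) -> str:
    """Append previously generated answers to the question as a hint."""
    if not hints:
        # First round: no earlier answers exist, so ask the plain question.
        return question
    # Assumed hint wording; the exact phrasing is a design choice.
    return f"{question} (Hint: The answer is near to {', '.join(hints)})."


def progressive_hint(
    question: str,
    ask: Callable[[str], str],  # hypothetical LLM wrapper: prompt -> answer
    max_rounds: int = 5,        # assumed cap so the loop always terminates
) -> Tuple[str, List[str]]:
    """Re-ask the question with earlier answers as hints until the answer repeats."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    history: List[str] = []
    for _ in range(max_rounds):
        answer = ask(build_prompt(question, history)).strip()
        if history and answer == history[-1]:
            # Two consecutive rounds agree: treat the answer as settled.
            history.append(answer)
            return answer, history
        history.append(answer)
    # Cap reached without agreement: fall back to the most recent answer.
    return history[-1], history
```

Because the loop only changes how the question is posed, `ask` can itself wrap a CoT prompt or a self-consistency vote, which is one way to read the claim that PHP is orthogonal to those techniques.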

Code Repositories

chuanyang-Zheng/Progressive-Hint (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
math-word-problem-solving-on-math | PHP (GPT-4 model) | Accuracy: 53.9
math-word-problem-solving-on-svamp | GPT-4 (PHP) | Execution Accuracy: 91.9
