
Abstract

The strong performance of large language models (LLMs) has prompted extensive discussion of their application to code generation. Recent research suggests continuous program refinement through visible tests to improve the code generation accuracy of LLMs. However, these methods suffer from LLMs' inefficiency and limited reasoning capacity. In this work, we propose an LLM programming workflow (LPW) designed to improve both initial code generation and subsequent refinements within a structured two-phase workflow. Specifically, the solution generation phase formulates a solution plan, which is then verified through visible tests to specify the intended natural language solution. Subsequently, the code implementation phase drafts initial code according to the solution plan and its verification. If the generated code fails the visible tests, the plan verification serves as the intended solution to consistently inform the refinement process for correcting bugs. Compared to state-of-the-art methods across various existing LLMs, LPW significantly improves Pass@1 accuracy by up to 16.4% on well-established text-to-code generation benchmarks. LPW also sets new state-of-the-art Pass@1 accuracy, achieving 98.2% on HumanEval, 84.8% on MBPP, 59.3% on LiveCode, 62.6% on APPS, and 34.7% on CodeContest, using GPT-4o as the backbone. Our code is publicly available at: https://github.com/you68681/lpw
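The two-phase control flow described above can be illustrated with a minimal sketch. This is not the released LPW implementation: the function names (`lpw_sketch`, `run_visible_tests`), the prompt prefixes, the `max_refinements` budget, and the stub model are all hypothetical, and the model is assumed to be any callable that maps a prompt string to a completion string.

```python
from typing import Callable, List, Tuple

LLM = Callable[[str], str]      # hypothetical: prompt in, completion out
Test = Tuple[tuple, object]     # (positional arguments, expected result)


def run_visible_tests(code: str, entry: str, tests: List[Test]) -> List[str]:
    """Run the candidate code on the visible tests; return failure messages."""
    namespace: dict = {}
    try:
        exec(code, namespace)   # illustration only; sandbox untrusted code in practice
        func = namespace[entry]
    except Exception as exc:
        return [f"code did not load: {exc!r}"]
    failures = []
    for args, expected in tests:
        try:
            got = func(*args)
        except Exception as exc:
            failures.append(f"{entry}{args} raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{entry}{args} returned {got!r}, expected {expected!r}")
    return failures


def lpw_sketch(problem: str, entry: str, tests: List[Test], llm: LLM,
               max_refinements: int = 3) -> Tuple[str, bool]:
    # Phase 1 (solution generation): write a plan, then verify it by
    # walking through the visible tests in natural language.
    plan = llm(f"PLAN\n{problem}")
    verification = llm(f"VERIFY\n{plan}\nVisible tests: {tests}")

    # Phase 2 (code implementation): draft code from the plan and its verification.
    code = llm(f"CODE\n{plan}\n{verification}")
    for attempt in range(max_refinements + 1):
        failures = run_visible_tests(code, entry, tests)
        if not failures:
            return code, True
        if attempt < max_refinements:
            # The plan verification is reused as the intended solution
            # that guides each refinement.
            code = llm(f"REFINE\n{code}\n{verification}\nFailures: {failures}")
    return code, False


def stub_llm(prompt: str) -> str:
    """Stand-in model: drafts buggy code first, then repairs it on refinement."""
    if prompt.startswith("PLAN"):
        return "Return the sum of the two inputs."
    if prompt.startswith("VERIFY"):
        return "add(1, 2): 1 + 2 = 3, which matches the expected output."
    if prompt.startswith("CODE"):
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


final_code, passed = lpw_sketch("Add two integers.", "add", [((1, 2), 3)], stub_llm)
print(passed)
print(final_code)
```

In this sketch the verification text is produced once and passed unchanged into every refinement prompt, mirroring the abstract's statement that the plan verification consistently informs bug fixing; how the actual system structures its prompts and checks the plan is not specified here.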
