Debug Like a Human: A Large Language Model Debugger Via Verifying Runtime Execution Step-by-step

Findings of the Association for Computational Linguistics: ACL 2024 (2024)

Abstract
Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs. However, these works consider the generated programs as an indivisible entity, which falls short for LLMs in debugging the programs, especially when the programs contain complex logic flows and data operations. In contrast, when human developers debug programs, they typically set breakpoints and selectively examine runtime execution information. The execution flow and the intermediate variables play a crucial role in the debugging process, yet they are underutilized in the existing literature on code generation. In this study, we introduce the Large Language Model Debugger (LDB), a novel debugging framework that enables LLMs to refine their generated programs with runtime execution information. Specifically, LDB segments the programs into basic blocks and tracks the values of intermediate variables after each block throughout the runtime execution. This allows LLMs to concentrate on simpler code units within the overall execution flow, verify their correctness against the task description block by block, and efficiently pinpoint any potential errors. Experiments demonstrate that LDB consistently enhances the baseline performance by up to 9.8% across the HumanEval, MBPP, and TransCoder benchmarks, achieving new state-of-the-art performance in code debugging for various LLM selections.
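The mechanism described in the abstract, segmenting a program into smaller execution units and inspecting intermediate variable values as it runs, can be illustrated with a small sketch. The code below is not the authors' LDB implementation; it is a minimal, assumed Python example that uses the standard sys.settrace hook to snapshot local variables line by line (a simple stand-in for true basic-block boundaries), producing the kind of runtime trace that an LLM could verify against the task description block by block. The function names (trace_execution, running_sum) are illustrative only.

```python
# Minimal sketch (assumed, not the paper's implementation): record the runtime
# state of a candidate program so a language model can inspect it step by step.
import sys
from types import FrameType
from typing import Any, Dict, List, Tuple


def trace_execution(source: str, entry: str, args: Tuple) -> List[Dict[str, Any]]:
    """Execute `source`, call `entry(*args)`, and snapshot local variables per line."""
    trace: List[Dict[str, Any]] = []

    def tracer(frame: FrameType, event: str, arg: Any):
        # Snapshot locals on every executed line of the traced function.
        # A fuller implementation would group lines into basic blocks,
        # e.g. by building a control-flow graph from the AST or bytecode.
        if event == "line" and frame.f_code.co_name == entry:
            trace.append({"line": frame.f_lineno, "locals": dict(frame.f_locals)})
        return tracer

    namespace: Dict[str, Any] = {}
    exec(source, namespace)          # define the candidate function
    sys.settrace(tracer)
    try:
        namespace[entry](*args)      # run it on a failing unit-test input
    finally:
        sys.settrace(None)
    return trace


if __name__ == "__main__":
    buggy = (
        "def running_sum(xs):\n"
        "    total = 0\n"
        "    for x in xs:\n"
        "        total = x        # bug: should be `total += x`\n"
        "    return total\n"
    )
    for snapshot in trace_execution(buggy, "running_sum", ([1, 2, 3],)):
        print(snapshot)
```

Running this prints the value of `total` after each executed line, making visible exactly where the intermediate state diverges from the intended specification; that per-step execution information is the signal the paper proposes feeding back to the LLM instead of treating the program as an indivisible whole.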