Journal Article, DOI: 10.48550/arxiv.2404.02575
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Hyungjoo Chae,Yeonghyeon Kim,Seungone Kim,Kai Tzu-iunn Ong,Beong-woo Kwak,Moohyeon Kim,Seonghwan Kim,Taeyoon Kwon,Jiwan Chung,Youngjae Yu,Jinyoung Yeo
TL;DR: Think-and-Execute framework improves algorithmic reasoning in LLMs by decomposing the reasoning process into task-level logic and instance-specific code execution.
Abstract: Algorithmic reasoning refers to the ability to understand the complex patterns behind a problem and decompose them into a sequence of reasoning steps towards the solution. This makes algorithmic reasoning a challenge for large language models (LLMs), even though they have demonstrated promising performance in other reasoning tasks. Within this context, some recent studies use programming languages (e.g., Python) to express the logic needed to solve a given instance/question (e.g., Program-of-Thought), motivated by their strict and precise syntax. However, it is non-trivial to write executable code that expresses the correct logic on the fly within a single inference call. Also, code generated for one instance cannot be reused for others, even if they come from the same task and require identical logic to solve. This paper presents Think-and-Execute, a novel framework that decomposes the reasoning process of language models into two steps. (1) In Think, we discover a task-level logic that is shared across all instances of a given task and express that logic as pseudocode; (2) In Execute, we tailor the generated pseudocode to each instance and simulate the execution of the code. With extensive experiments on seven algorithmic reasoning tasks, we demonstrate the effectiveness of Think-and-Execute. Our approach improves LMs' reasoning more than several strong baselines that perform instance-specific reasoning (e.g., CoT and PoT), suggesting that discovering task-level logic is helpful. We also show that pseudocode can guide the reasoning of LMs better than natural language, even though LMs are trained to follow natural language instructions.
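The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `llm` callable, the prompt wording, and the function names `think`, `execute`, and `think_and_execute` are all hypothetical, chosen only to show that Think runs once per task while Execute runs once per instance.

```python
from typing import Callable, Dict, List

# Hypothetical model interface: takes a prompt string, returns a completion.
LLM = Callable[[str], str]


def think(llm: LLM, task_description: str, examples: List[str]) -> str:
    """Think step: request task-level pseudocode shared by all instances."""
    prompt = (
        "Task: " + task_description + "\n"
        "Example instances:\n" + "\n".join(examples) + "\n"
        "Write pseudocode that solves any instance of this task."
    )
    return llm(prompt)


def execute(llm: LLM, pseudocode: str, instance: str) -> str:
    """Execute step: ask the model to simulate the pseudocode on one instance."""
    prompt = (
        "Pseudocode:\n" + pseudocode + "\n"
        "Input: " + instance + "\n"
        "Simulate the execution step by step and state the final output."
    )
    return llm(prompt)


def think_and_execute(
    llm: LLM, task_description: str, examples: List[str], instances: List[str]
) -> Dict[str, str]:
    """Generate the pseudocode once, then reuse it for every instance."""
    pseudocode = think(llm, task_description, examples)  # one call per task
    return {inst: execute(llm, pseudocode, inst) for inst in instances}
```

With n instances this makes n + 1 model calls in total, and the same pseudocode is reused across all of them, in contrast to writing a fresh program for each instance.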
Citations
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague,Fangcong Yin,Juan Diego Rodriguez,Dongwei Jiang,Manya Wadhwa,Prasann Singhal,Xinyu Zhao,Xi Ye,Kyle Mahowald,Gregory Christopher Durrett
TL;DR: Chain-of-thought (CoT) via prompting significantly improves performance on math and symbolic reasoning tasks, but offers minimal benefits on other tasks, suggesting selective application and a need to explore new paradigms beyond prompt-based CoT.
Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach
Yuting Wan,Chaozheng Wang,Yi Dong,Wenxuan Wang,Shuqing Li,Yintong Huo,Michael R. Lyu
- 24 Jun 2024
TL;DR: Automatically generating UI code from screenshots is a time-consuming process. DCGen is a divide-and-conquer-based approach that effectively mitigates issues in generating UI code by focusing on smaller visual segments.
Bike Frames: Understanding the Implicit Portrayal of Cyclists in the News
TL;DR: In this paper, the authors explore how cyclists are perceived in news headlines and compare these perceptions with those in motorcyclist-related headlines, grounding the findings in a related activity, for both male- and female-related posts.
Meta-Designing Quantum Experiments with Language Models
Sören Arlt,H. M. Duan,F.-Y. Li,Songbo Xie,Yuehan Wu,Mario Krenn
- 04 Jun 2024
TL;DR: A language model trained on synthetic data generates meta-solutions for designing quantum experiments, producing interpretable code for entire classes of quantum systems and uncovering general design rules for infinitely large classes of quantum states.
Learning to Reason via Program Generation, Emulation, and Search
Nathaniel Weir,Muhammad Khalifa,Linlu Qiu,Orion Weller,P. Clark
- 25 May 2024
TL;DR: CoGEX extends program synthesis capabilities of LMs to tasks involving commonsense reasoning, moral decision-making, and sarcasm understanding by generating pseudo-programs and searching over them.
References
Proceedings Article
Chain of Thought Prompting Elicits Reasoning in Large Language Models
Jason Wei,Xuezhi Wang,D. Schuurmans,Maarten Bosma,Ed H. Chi,Fei Xia,Quoc Le,Denny Zhou
- 28 Jan 2022
TL;DR: Experiments on three large language models show that chain-of-thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks.
Proceedings Article
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima,Shixiang Gu,Machel Reid,Yutaka Matsuo,Yusuke Iwasawa
- 24 May 2022
TL;DR: Experimental results demonstrate that Zero-shot-CoT, using the same single prompt template, significantly outperforms standard zero-shot LLM prompting on diverse benchmark reasoning tasks, including arithmetic, symbolic reasoning, and other logical reasoning tasks, without any hand-crafted few-shot examples.
Posted Content
HellaSwag: Can a Machine Really Finish Your Sentence?
TL;DR: HellaSwag is a commonsense NLP dataset where a series of discriminators iteratively select an adversarial set of machine-generated wrong answers. The key insight is to scale up the length and complexity of the dataset examples towards a critical 'Goldilocks' zone where generated text is ridiculous to humans, yet often misclassified by state-of-the-art models.
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
TL;DR: The authors propose Program of Thoughts (PoT) prompting to disentangle computation from reasoning: language models (mainly Codex) express the reasoning process as a program.
PAL: Program-aided Language Models
Luyu Gao,Aman Madaan,Shuyan Zhou,Uri Alon,Pengfei Liu,Yiming Yang,Jamie Callan,Graham Neubig
TL;DR: Program-Aided Language Models (PAL) use an LLM to read natural language problems and generate programs as the intermediate reasoning steps, but offload the solution step to a runtime such as a Python interpreter.
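The division of labor in this summary (the model writes a program, a Python runtime computes the result) can be sketched as follows. This is an illustration under stated assumptions, not PAL's actual code: the `llm` callable, the prompt text, the convention of storing the result in a variable named `answer`, and the canned `stub_llm` are all hypothetical.

```python
from typing import Callable


def solve_with_program(llm: Callable[[str], str], question: str) -> object:
    """Ask the model for a program, then let the interpreter compute the answer."""
    program = llm(
        "Write Python code that stores the result in a variable named `answer`.\n"
        "Question: " + question
    )
    namespace: dict = {}
    # Assumption for illustration only: running model output with exec is
    # unsafe for untrusted text, so a real system would sandbox this step.
    exec(program, namespace)
    return namespace["answer"]


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model; returns a fixed program."""
    return "apples = 5\nbought = 3\nanswer = apples + bought"


print(solve_with_program(stub_llm, "I had 5 apples and bought 3 more. How many?"))  # prints 8
```

The model never performs the addition itself; it only has to express the logic correctly, and the arithmetic is delegated to the interpreter.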