TOGA: A Neural Method for Test Oracle Generation
21 May 2022
TL;DR: TOGA is a transformer-based neural approach that infers exceptional and assertion test oracles from the context of the focal method. It can handle ambiguous or missing documentation, and even units with a missing implementation.
Abstract: Testing is widely recognized as an important stage of the software development lifecycle. Effective software testing can provide benefits such as bug finding, preventing regressions, and documentation. In terms of documentation, unit tests express a unit's intended functionality, as conceived by the developer. A test oracle, typically expressed as a condition, documents the intended behavior of a unit under a given test prefix. Synthesizing a functional test oracle is a challenging problem, as it must capture the intended functionality rather than the implemented functionality. In this paper, we propose TOGA (a neural method for Test Oracle GenerAtion), a unified transformer-based neural approach to infer both exceptional and assertion test oracles based on the context of the focal method. Our approach can handle units with ambiguous or missing documentation, and even units with a missing implementation. We evaluate our approach on both oracle inference accuracy and functional bug-finding. Our technique improves accuracy by 33% over existing oracle inference approaches, achieving 96% overall accuracy on a held-out test dataset. Furthermore, we show that when integrated with an automated test generation tool (EvoSuite), our approach finds 57 real-world bugs in large-scale Java programs, including 30 bugs that are not found by any other automated testing method in our evaluation.
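To make the abstract's terminology concrete, the sketch below shows a test prefix followed by each of the two oracle kinds it distinguishes. The `BoundedStack` class and all method names are hypothetical, invented purely for illustration; they are not taken from the paper or from EvoSuite output, and plain Java checks stand in for JUnit assertions so the example stays self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OracleSketch {

    /** Hypothetical unit under test: a stack that refuses to grow past its capacity. */
    static final class BoundedStack {
        private final Deque<Integer> items = new ArrayDeque<>();
        private final int capacity;

        BoundedStack(int capacity) {
            this.capacity = capacity;
        }

        void push(int value) {
            if (items.size() >= capacity) {
                throw new IllegalStateException("stack is full");
            }
            items.push(value);
        }

        int peek() {
            return items.peek();
        }
    }

    /** Assertion oracle: after the prefix runs, a condition on the result must hold. */
    public static boolean assertionOracleHolds() {
        // Test prefix: drives the unit into the state of interest.
        BoundedStack stack = new BoundedStack(2);
        stack.push(7);
        // Oracle: the intended behavior is that peek() returns the last pushed value.
        return stack.peek() == 7;
    }

    /** Exceptional oracle: the prefix is expected to end in an exception. */
    public static boolean exceptionalOracleHolds() {
        // Test prefix: fill the stack to capacity.
        BoundedStack stack = new BoundedStack(1);
        stack.push(1);
        try {
            stack.push(2);  // intended behavior: pushing past capacity must fail
            return false;   // no exception was thrown, so the oracle is violated
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("assertion oracle holds: " + assertionOracleHolds());
        System.out.println("exceptional oracle holds: " + exceptionalOracleHolds());
    }
}
```

In the abstract's framing, the prefix is what a test generator can produce automatically, while the final check (the `== 7` condition, or the expectation that an exception is thrown) is the part an oracle inference approach would need to supply from the focal method's context.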
Citations
An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation
Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
TL;DR: TestPilot is an LLM-based tool that automatically generates unit tests for JavaScript functions without requiring additional training or manual effort. It achieves high statement and branch coverage and generates tests that are dissimilar from existing tests.
55
Effective test generation using pre-trained Large Language Models and mutation testing
Arghavan Moradi Dakhel, Amin Nikanjam, Vahid Majdinasab, Foutse Khomh, Michel C. Desmarais
TL;DR: MuTAP improves the effectiveness of test cases generated by LLMs by leveraging mutation testing. It generates effective test cases in the absence of natural language descriptions of the programs under test (PUTs). MuTAP achieves a Mutation Score (MS) of 93.57% on synthetic buggy code, outperforming all other approaches in the evaluation.
12
Learning Deep Semantics for Test Completion
Pengyu Nie, Rahul Banerjee, Junyi Jessy Li, Raymond J. Mooney, Milos Gligoric
- 01 May 2023
TL;DR: Test completion is a novel task that uses deep learning to assist developers in writing tests. The key insight is that predicting the next statement in a test method requires reasoning about code execution. TeCo, a deep learning model, uses code semantics data to achieve high accuracy in test completion.
Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases
Ze Tang, Jidong Ge, Shangqing Liu, Tingwei Zhu, Tongwen Xu, LiGuo Huang, Bin Luo
- 11 Sep 2023
TL;DR: The paper proposes an approach to adapt language models for code completion in specific domains without fine-tuning. The proposed model, kNM-LM, integrates domain knowledge into language models and can automatically adapt to different language models and domains.
11
PyDex: Repairing Bugs in Introductory Python Assignments using LLMs
Jialu Zhang, José Cambronero, Sumit Gulwani, Vu Le, Ružica Piskač, Gustavo Soares, Gust Verbruggen
TL;DR: PyDex is an automated program repair (APR) system that uses a large language model to fix bugs in introductory Python assignments. It combines multi-modal prompts, iterative querying, test-case-based selection of few-shots, and program chunking to fix both syntactic and semantic mistakes.
11
References
CUTE: a concolic unit testing engine for C
Koushik Sen, Darko Marinov, Gul Agha
- 01 Sep 2005
TL;DR: The authors address the problem of automating unit testing with memory graphs as inputs, and develop a method to represent and track constraints that capture the behavior of a symbolic execution of a unit with memory graphs as inputs.
Defects4J: a database of existing faults to enable controlled testing studies for Java programs
René Just, Darioush Jalali, Michael D. Ernst
- 21 Jul 2014
TL;DR: Defects4J is a database and extensible framework providing real bugs to enable reproducible studies in software testing research. It provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies.
CodeBERT: A Pre-Trained Model for Programming and Natural Languages
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou
- 19 Feb 2020
TL;DR: CodeBERT is a pre-trained model for natural language code search and code documentation generation. It is trained with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators.
The Daikon system for dynamic detection of likely invariants
Michael D. Ernst, Jeff H. Perkins, Philip J. Guo, Stephen McCamant, Carlos Pacheco, Matthew S. Tschantz, Chen Xiao
TL;DR: Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. A likely invariant is a property that holds at a certain point or points in a program.
1.2K
Pex: white box test generation for .NET
Nikolai Tillmann, Jonathan de Halleux
- 09 Apr 2008
TL;DR: Pex automatically produces a small test suite with high code coverage for a .NET program by performing a systematic program analysis using dynamic symbolic execution, similar to path-bounded model-checking, to determine test inputs for Parameterized Unit Tests.