Coding conventions

Papers published on a yearly basis

Papers

Journal Article • 10.1017/S0305000900006449

The child language data exchange system.

Brian MacWhinney¹, Catherine E. Snow²

Carnegie Mellon University¹, Harvard University²

01 Jun 1985, Journal of Child Language

TL;DR: The formation of the CHILDES, the governance of the system, the nature of the database, the shape of the coding conventions, and the types of computer programs being developed are detailed.

Abstract: The study of language acquisition underwent a major revolution in the late 1950s as a result of the dissemination of technology permitting high-quality tape-recording of children in the family setting. This new technology led to major breakthroughs in the quality of both data and theory. The field is now at the threshold of a possible second major breakthrough stimulated by the dissemination of personal computing. Researchers are now able to transcribe tape-recorded data into computer files. With this new medium it is easy to conduct global searches for word combinations across collections of files. It is also possible to enter new codings of the basic text line. Because of the speed and accuracy with which computer files can be copied, it is now much easier to share data between researchers. To foster this sharing of computerized data, a group of child language researchers has established the Child Language Data Exchange System (CHILDES). This article details the formation of the CHILDES, the governance of the system, the nature of the database, the shape of the coding conventions, and the types of computer programs being developed.

906 citations

Proceedings Article • 10.1145/2786805.2786849

Suggesting accurate method and class names

Miltiadis Allamanis¹, Earl T. Barr², Christian Bird³, Charles Sutton¹

University of Edinburgh¹, University College London², Microsoft³

30 Aug 2015

TL;DR: Introduces a neural probabilistic language model for source code designed specifically for the method naming problem, along with a variant that is, to the authors' knowledge, the first that can propose neologisms: names that have not appeared in the training corpus.

Abstract: Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state of the art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering.

517 citations

Proceedings Article • 10.1145/2635868.2635883

Learning natural coding conventions

Miltiadis Allamanis¹, Earl T. Barr², Christian Bird³, Charles Sutton¹

University of Edinburgh¹, University College London², Microsoft³

11 Nov 2014

TL;DR: Presents NATURALIZE, a framework that learns the style of a codebase and suggests revisions to improve stylistic consistency, building on recent work in applying statistical natural language processing to source code.

Abstract: Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and hence maintainability. When collaborating, programmers strive to obey a project’s coding conventions. However, one third of reviews of changes contain feedback about coding conventions, indicating that programmers do not always follow them and that project members care deeply about adherence. Unfortunately, programmers are often unaware of coding conventions because inferring them requires a global view, one that aggregates the many local decisions programmers make and identifies emergent consensus on style. We present NATURALIZE, a framework that learns the style of a codebase, and suggests revisions to improve stylistic consistency. NATURALIZE builds on recent work in applying statistical natural language processing to source code. We apply NATURALIZE to suggest natural identifier names and formatting conventions. We present four tools focused on ensuring natural code during development and release management, including code review. NATURALIZE achieves 94% accuracy in its top suggestions for identifier names. We used NATURALIZE to generate 18 patches for 5 open source projects: 14 were accepted.

422 citations

Proceedings Article • 10.1145/2635868.2635883

Learning Natural Coding Conventions

Miltiadis Allamanis¹, Earl T. Barr², Christian Bird³, Charles Sutton¹

University of Edinburgh¹, University College London², Microsoft³

17 Feb 2014, arXiv: Software Engineering

TL;DR: NATURALIZE is a framework that learns the style of a codebase and suggests revisions to improve stylistic consistency; it can even transfer knowledge about coding conventions across projects.

Abstract: Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and hence maintainability. When collaborating, programmers strive to obey a project's coding conventions. However, one third of reviews of changes contain feedback about coding conventions, indicating that programmers do not always follow them and that project members care deeply about adherence. Unfortunately, programmers are often unaware of coding conventions because inferring them requires a global view, one that aggregates the many local decisions programmers make and identifies emergent consensus on style. We present NATURALIZE, a framework that learns the style of a codebase, and suggests revisions to improve stylistic consistency. NATURALIZE builds on recent work in applying statistical natural language processing to source code. We apply NATURALIZE to suggest natural identifier names and formatting conventions. We present four tools focused on ensuring natural code during development and release management, including code review. NATURALIZE achieves 94% accuracy in its top suggestions for identifier names and can even transfer knowledge about conventions across projects, leveraging a corpus of 10,968 open source projects. We used NATURALIZE to generate 18 patches for 5 open source projects: 14 were accepted.

240 citations

Proceedings Article • 10.1145/2568225.2568252

Understanding understanding source code with functional magnetic resonance imaging

Janet Siegmund¹, Christian Kästner², Sven Apel¹, Chris Parnin³, Anja Bethmann⁴, Thomas Leich, Gunter Saake⁵, André Brechmann⁴

University of Passau¹, Carnegie Mellon University², Georgia Institute of Technology³, Leibniz Institute for Neurobiology⁴, Otto-von-Guericke University Magdeburg⁵

31 May 2014

TL;DR: Explores whether functional magnetic resonance imaging (fMRI), which is well established in cognitive neuroscience, can soundly measure program comprehension, and finds a clear, distinct activation pattern of five brain regions related to working memory, attention, and language processing.

Abstract: Program comprehension is an important cognitive process that inherently eludes direct measurement. Thus, researchers are struggling with providing suitable programming languages, tools, or coding conventions to support developers in their everyday work. In this paper, we explore whether functional magnetic resonance imaging (fMRI), which is well established in cognitive neuroscience, is feasible to soundly measure program comprehension. In a controlled experiment, we observed 17 participants inside an fMRI scanner while they were comprehending short source-code snippets, which we contrasted with locating syntax errors. We found a clear, distinct activation pattern of five brain regions, which are related to working memory, attention, and language processing, all processes that fit well to our understanding of program comprehension. Our results encourage us and, hopefully, other researchers to use fMRI in future studies to measure program comprehension and, in the long run, answer questions, such as: Can we predict whether someone will be an excellent programmer? How effective are new languages and tools for program understanding? How should we train programmers?

225 citations

Year	Papers
2021	3
2020	8
2019	6
2018	8
2017	2
2016	5
