Journal Article, DOI 10.1145/3611643.3616280
How Practitioners Expect Code Completion?
Chaozheng Wang, Junhao Hu, Cuiyun Gao, Yu Jin, Tao Xie, Hailiang Huang, Zhenyu Lei, Yuetang Deng
- 30 Nov 2023
TL;DR: The study interviews 15 professionals and surveys 599 practitioners about their expectations of code completion, compares those expectations with a literature review of code completion papers published in major venues from 2012 to 2022, and highlights directions in which researchers could invest effort to meet practitioner expectations.
Abstract: Code completion has become a common practice for programmers during their daily programming activities. It automatically predicts the next tokens or statements that the programmers may use. Code completion aims to substantially save keystrokes and improve the programming efficiency for programmers. Although there exists substantial research on code completion, it is still unclear what practitioner expectations are on code completion and whether these expectations are met by the existing research. To address these questions, we perform a study by first interviewing 15 professionals and then surveying 599 practitioners from 18 IT companies about their expectations on code completion. We then compare the practitioner expectations with the existing research by conducting a literature review of papers on code completion published in major publication venues from 2012 to 2022. Based on the comparison, we highlight the directions desirable for researchers to invest efforts toward developing code completion techniques for meeting practitioner expectations.
Citations
RLCoder: Reinforcement Learning for Repository-Level Code Completion
Yanlin Wang, Yanli Wang, Daya Guo, Jiachi Chen, Ruikai Zhang, Yuchi Ma, Zibin Zheng
- 28 Jul 2024
TL;DR: RLCoder proposes a reinforcement learning framework for repository-level code completion, enabling the retriever to learn from unlabeled data and iteratively improve its ability to retrieve relevant content, outperforming state-of-the-art methods by 12.2% EM improvement.
Exploring Multi-Lingual Bias of Large Code Models in Code Generation
Chaozheng Wang, Zongjie Li, Cuiyun Gao, Wenxuan Wang, Ting Peng, Hailiang Huang, Yuetang Deng, Shuai Wang, Michael R. Lyu
- 30 Apr 2024
TL;DR: Multi-lingual bias exists in large code models, impacting code generation performance across different languages and natural languages.
On the Concerns of Developers When Using GitHub Copilot
Xiyu Zhou, Peng Liang, Beiqi Zhang, Zengyang Li, Aakash Ahmad, Mojtaba Shahin, Muhammad Waseem
TL;DR: An empirical study digs into the main challenges users encounter when implementing Copilot in practical development, the possible impact of Copilot on the coding process, aspects in which Copilot can be further enhanced, and potential new features desired by Copilot users.
Using AI-Based Coding Assistants in Practice: State of Affairs, Perceptions, and Ways Forward
Agnia Sergeyuk, Yaroslav Golubev, Timofey Bryksin, Iftekhar Ahmed
- 11 Jun 2024
TL;DR: Usage of AI-based coding assistants varies across software development activities and stages, with writing tests and natural-language artifacts being the least enjoyable activities. Currently, assistants are mainly used for generating tests and test data, comments, and docstrings. Fixable issues remain that need to be addressed to improve assistant usage.
References
•Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
- 12 Jun 2017
TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms that dispenses with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT is a language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
•Proceedings Article
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Thomas Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Samuel McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
- 28 May 2020
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network
TL;DR: A Machine Learning practitioner seeking guidance for implementing the new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well.
Code completion with statistical language models
Veselin Raychev, Martin Vechev, Eran Yahav
- 09 Jun 2014
TL;DR: The main idea is to reduce the problem of code completion to a natural-language processing problem of predicting the probabilities of sentences. The authors design a simple and scalable static analysis that extracts sequences of method calls from a large codebase and indexes them into a statistical language model.
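
The idea summarized in the last entry, treating sequences of method calls as sentences and ranking completions by probability, can be illustrated with a toy example. The sketch below is a minimal illustration only: it swaps in a plain bigram model estimated by relative frequency, and the call sequences, function names, and parameter `k` are all hypothetical. It does not reproduce the static analysis or the language models used in that paper.

```python
from collections import Counter, defaultdict

# Hypothetical method-call sequences, standing in for what a static
# analysis might extract from a codebase (illustrative data only).
CALL_SEQUENCES = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "close"],
]


def train_bigram_model(sequences):
    """Count how often each call directly follows another call."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def next_call_probabilities(counts, prev):
    """Estimate P(next | prev) by relative frequency; empty if prev is unseen."""
    followers = counts.get(prev)
    if not followers:
        return {}
    total = sum(followers.values())
    return {call: n / total for call, n in followers.items()}


def complete(counts, prev, k=1):
    """Suggest the k most probable calls to follow `prev`."""
    probs = next_call_probabilities(counts, prev)
    return sorted(probs, key=probs.get, reverse=True)[:k]


model = train_bigram_model(CALL_SEQUENCES)
print(complete(model, "open", k=2))  # prints ['read', 'write']
```

With this toy data, `read` follows `open` in two of three sequences, so it ranks first. A practical system would need smoothing for unseen call pairs and a longer context than a single preceding call.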