Automatic Code Documentation Generation Using GPT-3

doi:10.1145/3551349.3559548

Open Access•Proceedings Article•10.1145/3551349.3559548

- 06 Sep 2022

78

TL;DR: Codex is a GPT-3 based model pre-trained on both natural and programming languages. It outperforms existing techniques even with basic settings such as one-shot learning, and achieves an overall BLEU score of 20.6 across six programming languages.
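To make the one-shot setting concrete, the following is a minimal sketch of how a one-shot documentation prompt might be assembled: one worked (code, documentation) pair followed by the target code with its documentation left open for the model to complete. The function name `build_one_shot_prompt`, the field labels, and the sample snippets are hypothetical illustrations; the format the paper actually uses is the one shown in its Figure 2, which is not reproduced here.

```python
def build_one_shot_prompt(example_code: str, example_doc: str,
                          target_code: str, language: str = "Python") -> str:
    """Assemble a one-shot prompt from one worked example and one query.

    The layout below is an assumed, illustrative format, not the paper's.
    """
    parts = [
        f"Code ({language}):",
        example_code.strip(),
        "Documentation:",
        example_doc.strip(),
        "",  # blank line separating the worked example from the query
        f"Code ({language}):",
        target_code.strip(),
        "Documentation:",  # left open for the model to complete
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    prompt = build_one_shot_prompt(
        example_code="def add(a, b):\n    return a + b",
        example_doc="Return the sum of a and b.",
        target_code="def is_even(n):\n    return n % 2 == 0",
    )
    print(prompt)
```

The resulting string would be sent to a code-capable language model as the completion prompt; the text the model generates after the final `Documentation:` label is the candidate documentation that a metric such as BLEU then compares against the reference.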

Figures

Table 2: Results on documentation generation (BLEU score)

Figure 3: Examples of documentation by Codex (1-shot)

Table 1: Statistics of CodeSearchNet [25]

Figure 2: Sample prompt format for one-shot learning

Figure 1: A schematic overview of our study

Citations

Journal Article•10.2139/ssrn.4593895

A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

Katikapalli Subramanyam Kalyan

- 04 Oct 2023

- Social Science Research Network

TL;DR: A comprehensive survey which summarizes the recent research progress in multiple dimensions related to GPT-3 family large language models and discusses the performances of GLLMs in various downstream tasks, specific domains and multiple languages.

89

•Proceedings Article•10.1145/3544548.3580817

“What It Wants Me To Say”: Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models

Michael Xieyang Liu, +6 more

- 13 Apr 2023

TL;DR: The authors propose grounded abstraction matching, which bridges the abstraction gap by translating the code back into a systematic and predictable naturalistic utterance. This improves end users' understanding of the scope and capabilities of the code-generating model and of the kind of language needed to use it effectively.

82

Journal Article•10.1016/j.nlp.2023.100048

A survey of GPT-3 family large language models including ChatGPT and GPT-4

Katikapalli Subramanyam Kalyan

- 01 Mar 2024

- Natural Language Processing Journal

TL;DR: A survey of GPT-3 family large language models including ChatGPT and GPT-4 summarizes recent research progress in large language models and provides future research directions.

80

Journal Article•10.48550/arxiv.2312.10868

From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape

Timothy R. McIntosh, +4 more

- 18 Dec 2023

- arXiv.org

TL;DR: The study highlighted the importance of incorporating ethical and human-centric methods in AI development, ensuring alignment with societal norms and welfare, and outlined a strategy for future AI research that focuses on a balanced and conscientious use of MoE, multimodality, and AGI in generative AI.

55

Journal Article•10.48550/arxiv.2308.11396

Towards an Understanding of Large Language Models in Software Engineering Tasks

Zibin Zheng, +6 more

- 22 Aug 2023

- arXiv.org

TL;DR: This paper is the first to comprehensively investigate and collate the research and products combining LLMs with software engineering, aiming to answer two questions: (1) What are the current integrations of LLMs with software engineering?

36

References

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.

94.2K

•Proceedings Article•10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

•Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, +9 more

- 26 Jul 2019

- arXiv: Computation and Language

TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

26.2K

•Proceedings Article

Language Models are Few-Shot Learners

Tom B. Brown, +30 more

- 28 May 2020

TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.

25.2K

•Proceedings Article

Sequence to Sequence Learning with Neural Networks

Ilya Sutskever, +2 more

- 08 Dec 2014

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

20.1K