Open Access •Posted Content
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu,Daya Guo,Shuo Ren,Junjie Huang,Alexey Svyatkovskiy,Ambrosio Blanco,Colin B. Clement,Dawn Drain,Daxin Jiang,Duyu Tang,Ge Li,Lidong Zhou,Linjun Shou,Long Zhou,Michele Tufano,Ming Gong,Ming Zhou,Nan Duan,Neel Sundaresan,Shao Kun Deng,Fu Shengyu,Shujie Liu +21 more
TL;DR: CodeXGLUE is a benchmark dataset to foster machine learning research for program understanding and generation. It includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison.
Abstract: Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems, including the BERT-style, GPT-style, and Encoder-Decoder models, to make it easy for researchers to use the platform. The availability of such data and baselines can help the development and validation of new methods that can be applied to various program understanding and generation problems.
Citations
•Posted Content
WILDS: A Benchmark of in-the-Wild Distribution Shifts
Pang Wei Koh,Shiori Sagawa,Henrik Marklund,Sang Michael Xie,Marvin Zhang,Akshay Balsubramani,Weihua Hu,Michihiro Yasunaga,Richard Lanas Phillips,Irena Gao,Tony Lee,Etienne David,Ian Stavness,Wei Guo,Berton A. Earnshaw,Imran S. Haque,Sara Beery,Jure Leskovec,Anshul Kundaje,Emma Pierson,Sergey Levine,Chelsea Finn,Percy Liang +22 more
TL;DR: WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications. The authors hope it will encourage the development of general-purpose methods that are anchored to real-world distribution shifts and that work well across different applications and problem settings.
1K
•Posted Content
Evaluating Large Language Models Trained on Code
Mark Chen,Jerry Tworek,Heewoo Jun,Qiming Yuan,Henrique Ponde de Oliveira Pinto,Jared Kaplan,Harrison Edwards,Yuri Burda,Nicholas Joseph,Greg Brockman,Alex Ray,Raul Puri,Gretchen Krueger,Michael Petrov,Heidy Khlaaf,Girish Sastry,Pamela Mishkin,Brooke Chan,Scott Gray,Nick Ryder,Mikhail Pavlov,Alethea Power,Lukasz Kaiser,Mohammad Bavarian,Clemens Winter,Philippe Tillet,Felipe Petroski Such,Dave Cummings,Matthias Plappert,Fotios Chantzis,Elizabeth A. Barnes,Ariel Herbert-Voss,William H. Guss,Alex Nichol,Alex Paino,Nikolas Tezak,Jie Tang,Igor Babuschkin,Suchir Balaji,Shantanu Jain,William Saunders,Christopher Hesse,Andrew N. Carr,Jan Leike,Joshua Achiam,Vedant Misra,Evan Morikawa,Alec Radford,Matthew M. Knight,Miles Brundage,Mira Murati,Katie Mayer,Peter Welinder,Bob McGrew,Dario Amodei,Samuel McCandlish,Ilya Sutskever,Wojciech Zaremba +57 more
TL;DR: Codex is a GPT language model fine-tuned on publicly available code from GitHub. The authors study its Python code-writing capabilities and show that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts.
1K
Unified Pre-training for Program Understanding and Generation
Wasi Uddin Ahmad,Saikat Chakraborty,Baishakhi Ray,Kai-Wei Chang +3 more
- 01 Jun 2021
TL;DR: Analysis reveals that PLBART learns program syntax, style, and logical flow that are crucial to program semantics, and thus excels even with limited annotations. It outperforms or rivals state-of-the-art models.
•Posted Content
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
TL;DR: CodeT5 is a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed by developer-assigned identifiers. It employs a unified framework that supports both code understanding and generation tasks and allows for multi-task learning.
607
•Posted Content
Generalizing to Unseen Domains: A Survey on Domain Generalization
TL;DR: Domain generalization (DG) deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain.
449
References
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
99K
•Proceedings Article
Attention is All you Need
Ashish Vaswani,Noam Shazeer,Niki Parmar,Jakob Uszkoreit,Llion Jones,Aidan N. Gomez,Lukasz Kaiser,Illia Polosukhin +7 more
- 12 Jun 2017
TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms that dispenses with recurrence and convolutions entirely, and it achieves state-of-the-art performance on English-to-French translation.
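For readers unfamiliar with the attention mechanism this summary refers to, the following is a minimal pure-Python sketch of scaled dot-product attention for small lists of vectors. The function names and the list-of-lists representation are assumptions chosen for illustration; it omits multi-head projections, masking, and batching, and is not the paper's implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query.

    Each output is a weighted average of the value vectors, where the
    weights are softmax(q . k / sqrt(d_k)) over all keys.
    Illustrative sketch only: no masking, heads, or learned projections.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

When all keys are identical, every key receives the same weight, so the output for any query is simply the mean of the value vectors.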
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
81.7K
ImageNet: A large-scale hierarchical image database
Jia Deng,Wei Dong,Richard Socher,Li-Jia Li,Kai Li,Li Fei-Fei +5 more
- 20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Bleu: a Method for Automatic Evaluation of Machine Translation
Kishore Papineni,Salim Roukos,Todd Ward,Wei-Jing Zhu +3 more
- 06 Jul 2002
TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
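As a rough illustration of how such an automatic metric can be computed, here is a minimal sketch of sentence-level BLEU against a single reference: clipped (modified) n-gram precisions combined by a geometric mean, multiplied by a brevity penalty. The function names are hypothetical, and the simplifications (one reference, uniform weights, no smoothing, pre-tokenized input) are assumptions; standard toolkits compute BLEU at corpus level and handle multiple references.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing (sketch)."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: candidates no longer than the reference are penalized.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)
```

For example, scoring the candidate "the the the the the the the" against the reference "the cat is on the mat" gives a unigram modified precision of 2/7, because "the" occurs only twice in the reference and the candidate's count is clipped accordingly.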