Journal Article, DOI: 10.1007/S10664-016-9429-5
Characterizing logging practices in Java-based open source software projects --- a replication study in Apache Software Foundation
Boyuan Chen,Zhen Ming Jiang +1 more
129
TL;DR: A replication study of 21 Java-based open source projects from three categories shows that all projects contain logging code, which is actively maintained. However, contrary to the original study, bug reports containing log messages take longer to resolve than bug reports without log messages.
Abstract: Log messages, which are generated at runtime by the debug statements that developers insert into the code, contain rich information about the runtime behavior of software systems. Log messages are widely used for system monitoring, problem diagnosis, and legal compliance. Yuan et al. performed the first empirical study on the logging practices in open source software systems. They studied the development history of four C/C++ server-side projects and derived ten interesting findings. In this paper, we have performed a replication study in order to assess whether their findings would be applicable to Java projects in the Apache Software Foundation. We examined 21 different Java-based open source projects from three different categories: server-side, client-side and supporting-component. Similar to the original study, our results show that all projects contain logging code, which is actively maintained. However, contrary to the original study, bug reports containing log messages take a longer time to resolve than bug reports without log messages. A significantly higher portion of log updates are for enhancing the quality of logs (e.g., formatting & style changes and spelling/grammar fixes) rather than co-changes with feature implementations (e.g., updating variable names).
Citations
Characterizing the natural language descriptions in software logging statements
Pinjia He,Zhuangbin Chen,Shilin He,Michael R. Lyu +3 more
- 03 Sep 2018
TL;DR: This paper systematically studies what developers log, with a focus on the usage of natural language descriptions in logging statements, and demonstrates the potential of automated description text generation for logging statements by obtaining up to a 49.04 BLEU-4 score and a 62.1 ROUGE-L score.
102
A Qualitative Study of the Benefits and Costs of Logging from Developers' Perspectives
TL;DR: A qualitative study that combines a survey of 66 developers and a case study of 223 logging-related issue reports draws a comprehensive picture of the benefits and costs of logging from developers’ perspectives.
89
Which Variables Should I Log
TL;DR: The approach first leverages a neural network with an RNN (recurrent neural network) layer and a self-attention layer to learn the proper representation of each program token, and then predicts whether each token should be logged through a unified binary classifier based on the learned representation.
76
A systematic literature review on automated log abstraction techniques
TL;DR: A quality model composed of seven desirable aspects of automated log-abstraction techniques is built to support software engineers in understanding the advantages and limitations of existing techniques and in choosing a technique suitable for their unique use cases.
75
Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics
Jieming Zhu,Shilin He,Pinjia He,Jinyang Liu,Michael R. Lyu +4 more
- 14 Aug 2020
TL;DR: Loghub provides 19 real-world log datasets collected from a wide range of software systems, including distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software.
65