StackOverflow and GitHub: Associations between Software Development and Crowdsourced Knowledge
Bogdan Vasilescu, Vladimir Filkov, Alexander Serebrenik +2 more
- 08 Sep 2013
- pp 188-195
TL;DR: This paper investigates the interplay between Stack Overflow activities and the development process, reflected by code changes committed to the largest social coding repository, GitHub, and shows that active GitHub committers ask fewer questions and provide more answers than others.
Abstract: Stack Overflow is a popular on-line programming question and answer community providing its participants with rapid access to knowledge and expertise of their peers, especially benefitting coders. Despite the popularity of Stack Overflow, its role in the work cycle of open-source developers is yet to be understood: on the one hand, participation in it has the potential to increase the knowledge of individual developers, thus improving and speeding up the development process. On the other hand, participation in Stack Overflow may interrupt the regular working rhythm of the developer, and hence possibly slow down the development process. In this paper we investigate the interplay between Stack Overflow activities and the development process, reflected by code changes committed to the largest social coding repository, GitHub. Our study shows that active GitHub committers ask fewer questions and provide more answers than others. Moreover, we observe that active Stack Overflow askers distribute their work in a less uniform way than developers that do not ask questions. Finally, we show that despite the interruptions incurred, the Stack Overflow activity rate correlates with the code changing activity in GitHub.
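As a rough illustration of what "activity rate correlates with code changing activity" could mean in practice, the sketch below computes a Spearman rank correlation between two daily count series. The abstract does not say which statistic the authors used, so the choice of Spearman, the function names, and the sample counts are all assumptions made for illustration only, not the paper's method or data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors.

    Assumes equal-length inputs that are not constant (a constant
    series has zero rank variance and would divide by zero).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical daily counts for one developer over a week (not real data).
commits_per_day = [0, 2, 5, 1, 0, 7, 3]
so_posts_per_day = [0, 1, 3, 1, 0, 4, 2]

rho = spearman(commits_per_day, so_posts_per_day)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is used here because activity counts of this kind tend to be bursty and heavy-tailed, which makes a statistic that depends only on ordering a reasonable default; a different measure may fit the actual study better.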
Figures

Fig. 4: The steps to generate a simulated time-series of StackOverflow activities. 
TABLE II: Mutual influence of StackOverflow activities and GitHub committing, for different committers (from least active Q1, to most active Q4). 
TABLE III: Mutual influence of StackOverflow activities and GitHub committing, for different answerers (from least active Q1, to most active Q4). Individuals in Q1 do not give answers. 
Fig. 1: Number of commits per day in the GitHub dataset between July 2011 and April 2012. 
TABLE I: Sizes of the original and intersection datasets. 
Fig. 2: Example T̃-graph.
Citations
A survey of the use of crowdsourcing in software engineering
TL;DR: A comprehensive survey of the use of crowdsourcing in software engineering, seeking to cover all literature on this topic, and exposing trends, open issues and opportunities for future research on Crowdsourced Software Engineering.
436
Gender and Tenure Diversity in GitHub Teams
Bogdan Vasilescu, Daryl Posnett, Baishakhi Ray, Mark van den Brand, Alexander Serebrenik, Premkumar Devanbu, Vladimir Filkov +6 more
- 18 Apr 2015
TL;DR: Using GitHub, the largest publicly available collection of OSS projects, it is shown that both gender and tenure diversity are positive and significant predictors of productivity, together explaining a sizable fraction of the data variability.
What’s in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform
TL;DR: A thorough study of the meaning, characteristics, and dynamic growth of GitHub stars is provided, along with a list of recommendations for open source project managers, GitHub users, and software engineering researchers.
247
Security and emotion: sentiment analysis of security discussions on GitHub
Daniel Pletea, Bogdan Vasilescu, Alexander Serebrenik +2 more
- 31 May 2014
TL;DR: The findings confirm the importance of properly training developers to address security concerns in their applications as well as the need to test applications thoroughly for security vulnerabilities in order to reduce frustration and improve overall project atmosphere.
What's in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform
TL;DR: In this paper, the authors provide a study on the meaning, characteristics, and dynamic growth of GitHub stars and propose four patterns to describe stars growth, which are derived after clustering the time series representing the number of stars of the studied repositories.
215
References
The origin of bursts and heavy tails in human dynamics
TL;DR: It is shown that the bursty nature of human behaviour is a consequence of a decision-based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed, with most tasks being rapidly executed, whereas a few experience very long waiting times.
2.4K
Social coding in GitHub: transparency and collaboration in an open software repository
Laura Dabbish, Colleen Stuart, Jason Tsay, James D. Herbsleb +3 more
- 11 Feb 2012
TL;DR: It is found that people make a surprisingly rich set of social inferences from the networked activity information in GitHub, such as inferring someone else's technical goals and vision when they edit code, or guessing which of several similar projects has the best chance of thriving in the long term.
Gamification: designing for motivation
TL;DR: Social Mediator is a forum exploring the ways that HCI research and principles interact---or might interact---with practices in the social media world.
754
Mining email social networks
Christian Bird, Alex Gourley, Prem Devanbu, Michael Gertz, Anand Swaminathan +4 more
- 22 May 2006
TL;DR: This paper begins with a discussion of the infrastructure (including a novel use of Scientific Workflow software) and then discusses the approach to mining the email archives, and presents some preliminary results from the data analysis.
Design lessons from the fastest q&a site in the west
Lena Mamykina, Bella Manoim, Manas Mittal, George Hripcsak, Björn Hartmann +4 more
- 07 May 2011
TL;DR: This paper analyzes a Question & Answer site for programmers, Stack Overflow, that dramatically improves on the utility and performance of Q&A systems for technical domains and argues that it is not primarily due to an a priori superior technical design, but also to the high visibility and daily involvement of the design team within the community they serve.