Analyzing the effects of test driven development in GitHub

Neil C. Borle, Meysam Feghhi, Eleni Stroulia, Russell Greiner, Abram Hindle

2017/11/01

Authors

Neil C. Borle, Meysam Feghhi, Eleni Stroulia, Russell Greiner, Abram Hindle

Venue

Empirical Software Engineering

Abstract

Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. To that end, we have conducted a comparative analysis of GitHub repositories that adopt TDD to a lesser or greater extent, in order to determine how TDD affects software development productivity and software quality. We classified GitHub repositories archived in 2015 in terms of how rigorously they practiced TDD, thus creating a TDD spectrum. We then matched and compared various subsets of these repositories on this TDD spectrum with control sets of equal size. The control sets were samples from all GitHub repositories that matched certain characteristics, and that contained at least one test file. We compared how the TDD sets differed from the control sets on the following characteristics: number of test files, average commit velocity, number of bug-referencing commits, number of issues recorded, usage of continuous integration, number of pull requests, and distribution of commits per author. We found that Java TDD projects were relatively rare. In addition, there were very few significant differences in any of the metrics we used to compare TDD-like and non-TDD projects; therefore, our results do not reveal any observable benefits from using TDD.

Bibtex

@article{borle2017EMSE-TDD,
 abstract = {Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. To that end, we have conducted a comparative analysis of GitHub repositories that adopt TDD to a lesser or greater extent, in order to determine how TDD affects software development productivity and software quality. We classified GitHub repositories archived in 2015 in terms of how rigorously they practiced TDD, thus creating a TDD spectrum. We then matched and compared various subsets of these repositories on this TDD spectrum with control sets of equal size. The control sets were samples from all GitHub repositories that matched certain characteristics, and that contained at least one test file. We compared how the TDD sets differed from the control sets on the following characteristics: number of test files, average commit velocity, number of bug-referencing commits, number of issues recorded, usage of continuous integration, number of pull requests, and distribution of commits per author. We found that Java TDD projects were relatively rare. In addition, there were very few significant differences in any of the metrics we used to compare TDD-like and non-TDD projects; therefore, our results do not reveal any observable benefits from using TDD.},
 accepted = {2017-11-01},
 author = {Neil C. Borle and Meysam Feghhi and Eleni Stroulia and Russell Greiner and Abram Hindle},
 authors = {Neil C. Borle, Meysam Feghhi, Eleni Stroulia, Russell Greiner, Abram Hindle},
 code = {borle2017EMSE-TDD},
 day = {25},
 doi = {10.1007/s10664-017-9576-3},
 funding = {NSERC Discovery},
 issn = {1573-7616},
 journal = {Empirical Software Engineering},
 journalid = {EMSE-D-17-00057R2},
 month = {Nov},
 pagerange = {1--28},
 pages = {1--28},
 published = {2017-11-25},
 role = {Instructor / Co-author},
 title = {Analyzing the effects of test driven development in GitHub},
 type = {article},
 url = {http://softwareprocess.ca/pubs/borle2017EMSE-TDD.pdf},
 venue = {Empirical Software Engineering},
 year = {2017}
}