In an attempt to find data backing TDD, I came across this paper by Microsoft Research, IBM, and North Carolina State University:
Realizing quality improvement through test driven development: results and experiences of four industrial teams
Below you can find some of the most interesting quotes from this paper.
Test-driven development (TDD) (Beck 2003) is an “opportunistic” (Curtis 1989) software development practice that has been used sporadically for decades (Larman and Basili 2003; Gelperin and Hetzel 1987). With this practice, a software engineer cycles minute-by-minute between writing failing unit tests and writing implementation code to pass those tests. […] However, little empirical evidence supports or refutes the utility of this practice in an industrial context.
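To make the cycle described above concrete, here is a minimal sketch in Python's `unittest`. It is not from the paper; the `slugify` function and the helper names are hypothetical and chosen only for illustration. The same test is run first against a stub (it fails) and then against an implementation written to satisfy it (it passes).

```python
import unittest


def make_test_case(slugify):
    """Build a test case bound to a given implementation (hypothetical helper)."""

    class SlugifyTest(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

    return SlugifyTest


def passes(test_case):
    """Run the test case quietly and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


# Step 1 ("red"): the test exists, the implementation does not yet.
def slugify_stub(text):
    raise NotImplementedError


# Step 2 ("green"): just enough code to make the test pass.
def slugify(text):
    return "-".join(text.lower().split())


if __name__ == "__main__":
    print("stub passes:", passes(make_test_case(slugify_stub)))
    print("implementation passes:", passes(make_test_case(slugify)))
```

In day-to-day work the stub and the implementation would be the same function at two points in time; they are kept side by side here only so both steps can be run in one file.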
we do investigate TDD within the prevailing development processes at IBM and Microsoft and not within the context of XP.
[…] maintenance fixes and “small” code changes may be nearly 40 times more error prone than new development (Humphrey 1989), and often, new faults are injected during the debugging and maintenance phases. The ease of running the automated test cases after changes are made should also enable smooth integration of new functionality into the code base and therefore reduce the likelihood that fixes and maintenance changes introduce new defects. The TDD test cases are essentially a high-granularity, low-level, regression test.
Erdogmus et al. (2005) performed a controlled investigation regarding test-first and test-last programming using 24 undergraduate computer science students. They observed that TDD improved programmer productivity but did not, on average, help the engineers to achieve a higher quality product. Their conclusions also brought out a valid point that the effectiveness of the test-first technique depends on the ability to encourage programmers to enhance their code with test assets.
Müller and Tichy (2001) investigated XP in a university context using 11 students. From a testing perspective they observed that, in the final review of the course, 87% of the students stated that the execution of the test cases strengthened their confidence in their code.
Janzen and Saiedian (2006) conducted an experiment with undergraduate students in a software engineering course. Students in three groups completed semester-long programming projects using either an iterative test-first (TDD), iterative test-last, or linear test-last approach. Results from this study indicate that TDD can be an effective software design approach improving both code-centric aspects such as object decomposition, test coverage, and external quality, as well as developer-centric aspects, which includes productivity and confidence.
Unit testing followed as a post-coding activity. In all cases, the unit test process was not formal and was not disciplined. More often than not, there were resource and schedule limitations that constrained the number of test cases developed and run.
With the TDD group, test cases were developed mostly up front as a means of reducing ambiguity and to validate the requirements, which for this team was a full detail standard specification. UML class and sequence diagrams were used to develop an initial design. This design activity was interspersed with the up-front unit test creations for developed classes. Complete unit testing was enforced, primarily via reminders and encouragements. We define complete testing as ensuring that the public interfaces and semantics of each method (the behavior of the method as defined by the specification) were tested utilizing the JUnit unit-testing framework. For each public class, there was an associated public test class; for each public method in the class there was an associated public test method in the corresponding unit test class. The target goal was to cover at least 80% of the developed classes by automated unit testing.
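The "one test class per public class, one test method per public method" convention is easy to check mechanically. The sketch below is my own illustration, not the team's tooling: the paper used JUnit, whereas this uses Python's `unittest`, and the `Account` class, the `test_<name>` naming rule, and `untested_public_methods` are all assumptions made for the example.

```python
import inspect
import unittest


class Account:
    """Hypothetical class under test (not from the paper)."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class AccountTest(unittest.TestCase):
    """Test class paired with Account: one test method per public method."""

    def test_deposit(self):
        account = Account()
        account.deposit(10)
        self.assertEqual(account.balance, 10)

    def test_withdraw(self):
        account = Account(balance=10)
        account.withdraw(4)
        self.assertEqual(account.balance, 6)


def untested_public_methods(cls, test_cls):
    """Return public methods of cls that lack a test_<name> method in test_cls."""
    public = [
        name
        for name, _ in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    ]
    return [name for name in public if not hasattr(test_cls, f"test_{name}")]


if __name__ == "__main__":
    print("untested:", untested_public_methods(Account, AccountTest))
    unittest.main()
```

A check like this only confirms that a matching test method exists; it says nothing about whether the test actually exercises the semantics of the method, which is the harder half of the paper's definition of complete testing.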
[…] to guarantee that all unit tests would be run by all members of the team, an automated build and test system was set up in both geographical locations. Daily, the build systems extracted all the code from the library, built the code, and ran all the unit tests. […] After each automated build and test run cycle, an email was sent to all members of the teams listing all the tests that successfully ran as well as any errors found. This automated build and test served as a daily integration and validation heartbeat for the team.
The TDD team at Microsoft did most of their development using a hybrid version of TDD. By hybrid we mean that these projects, as with almost all projects at Microsoft, had detailed requirements documents written. These detailed requirements documents drove the test and development effort. There were also design meetings and review sessions. This explains our reason to call this a hybrid-TDD approach, as agile teams typically do not have design review meetings.
Quality and Productivity Results
We measure the quality of the software products in terms of defect density computed as defects/thousand lines of code (KLOC).
“When software is being developed, a person makes an error that results in a physical fault (or defect) in a software element. When this element is executed, traversal of the fault or defect may put the element (or system) into an erroneous state. When this erroneous state results in an externally visible anomaly, we say that a failure has occurred” (IEEE 1988).
All the teams demonstrated a significant drop in defect density: 40% for the IBM team; 60–90% for the Microsoft teams.
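For readers who want to see how such percentages come out of the defects/KLOC measure, here is a small sketch. The function names and all the numbers are made up for illustration; they are not figures from the paper.

```python
def defect_density(defects, lines_of_code):
    """Defects per thousand lines of code (KLOC)."""
    return defects / (lines_of_code / 1000)


def relative_drop(baseline_density, tdd_density):
    """Fractional reduction in defect density relative to the baseline."""
    return (baseline_density - tdd_density) / baseline_density


# Hypothetical projects of equal size: one without TDD, one with.
baseline = defect_density(defects=50, lines_of_code=10_000)
with_tdd = defect_density(defects=30, lines_of_code=10_000)

if __name__ == "__main__":
    print(f"baseline: {baseline:.1f} defects/KLOC")
    print(f"with TDD: {with_tdd:.1f} defects/KLOC")
    print(f"drop:     {relative_drop(baseline, with_tdd):.0%}")
```

Normalising by KLOC is what makes the comparison possible when the two projects differ in size; comparing raw defect counts would not.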
Another interesting observation from the outcome measures in Table 3 is the increase in time to develop the features attributed to the usage of the TDD practice, as subjectively estimated by management. The increase in development time ranges from 15% to 35%. From an efficacy perspective this increase in development time is offset by the reduced maintenance costs due to the improvement in quality (Erdogmus and Williams 2003), an observation that was backed up by the product teams at Microsoft and IBM.
– Start TDD from the beginning of projects. Do not stop in the middle and claim it doesn’t work. Do not start TDD late in the project cycle when the design has already been decided and majority of the code has been written. TDD is best done incrementally and continuously.
– For a team new to TDD, introduce automated build test integration towards the second third of the development phase—not too early but not too late. If this is a “Greenfield” project, adding the automated build test towards the second third of the development schedule allows the team to adjust to and become familiar with TDD. Prior to the automated build test integration, each developer should run all the test cases on their own machine.
– Convince the development team to add new tests every time a problem is found, no matter when the problem is found. By doing so, the unit test suites improve during the development and test phases.
– Get the test team involved and knowledgeable about the TDD approach. The test team should not accept a new development release if the unit tests are failing.
– Hold a thorough review of an initial unit test plan, setting an ambitious goal of having the highest possible (agreed upon) code coverage targets.
– Constantly run the unit test cases in a daily automatic build (or continuous integration); test runs should become the heartbeat of the system as well as a means to track progress of the development. This also gives a level of confidence to the team when new features are added.
– Encourage fast unit test execution and efficient unit test design. Test execution speed is very important since when all the tests are integrated, the complete execution can become quite long for a reasonably-sized project and when using constant test executions. Test results are important early and often; they provide feedback on the current state of the system. Further, the faster the execution of the tests the more likely developers themselves will run the tests without waiting for the automated build test results. Such constant execution of tests by developers may also result in faster unit test additions and fixes.
– Share unit tests. Developers’ sharing their unit tests, as an essential practice of TDD, helps identify integration issues early on.
– Track the project using measurements. Count the number of test cases, code coverage, bugs found and fixed, source code count, test code count, and trend across time, to identify problems and to determine if TDD is working for you.
– Check morale of the team at the beginning and end of the project. Conduct periodical and informal surveys to gauge developers’ opinions on the TDD process and on their willingness to apply it in the future.
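The recommendation above to track the project using measurements can be sketched as a handful of periodic snapshots and a trend function. This is only one possible shape for such tracking; the `Snapshot` fields, the weekly cadence, and every number below are assumptions made for illustration, not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One hypothetical measurement point; field names are assumptions."""

    week: int
    test_cases: int
    coverage_pct: float
    bugs_found: int
    bugs_fixed: int
    source_loc: int
    test_loc: int

    @property
    def open_bugs(self):
        return self.bugs_found - self.bugs_fixed

    @property
    def test_to_source_ratio(self):
        return self.test_loc / self.source_loc


def trend(snapshots, metric):
    """Return the change in a metric between consecutive snapshots."""
    values = [getattr(s, metric) for s in snapshots]
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Made-up numbers, purely for illustration.
history = [
    Snapshot(week=1, test_cases=40, coverage_pct=55.0, bugs_found=5,
             bugs_fixed=2, source_loc=4000, test_loc=1000),
    Snapshot(week=2, test_cases=80, coverage_pct=70.0, bugs_found=9,
             bugs_fixed=7, source_loc=6000, test_loc=2400),
    Snapshot(week=3, test_cases=130, coverage_pct=82.0, bugs_found=12,
             bugs_fixed=11, source_loc=8000, test_loc=4000),
]

if __name__ == "__main__":
    for metric in ("test_cases", "coverage_pct", "open_bugs"):
        print(metric, trend(history, metric))
```

A flat or negative trend in test cases while source code keeps growing would be one signal that the team has drifted away from writing tests first.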