A Field Study of Technical Debt
In their haste to deliver software capabilities, developers sometimes engage in less-than-optimal coding practices. If not addressed, these shortcuts can ultimately yield unexpected rework costs that offset the benefits of rapid delivery. Technical debt conceptualizes the tradeoff between the short-term benefits of rapid delivery and long-term value. Taking shortcuts to expedite the delivery of features in the short term incurs technical debt, analogous to financial debt, that must be paid off later to optimize long-term success. Managing technical debt is an increasingly critical aspect of producing cost-effective, timely, and high-quality software products, especially in projects that apply agile methods.
A delicate balance is needed between the desire to release new software features rapidly to satisfy users and the desire to practice sound software engineering that reduces rework. Too often, however, discussion of technical debt focuses on coding issues when a broader perspective, one that incorporates software architectural concerns, is needed. This blog post, the first in a series, highlights the findings of a recent field study that assessed the state of the practice and current thinking regarding technical debt, and that guides the development of a technical debt timeline.
The Technical Debt Metaphor
The technical debt metaphor, first introduced by Ward Cunningham in 1992, refers to the degraded quality resulting from overly hasty delivery of software capabilities to users. As my colleague Ipek Ozkaya explained at the 2012 Agile Research Forum, "A little debt speeds up development, and can be beneficial as long as the debt is paid back promptly with a rewrite that reduces complexity and streamlines future enhancements."
At the SEI, our working definition is taken from Steve McConnell:
A design or construction approach that is expedient in the short term but that creates a technical context in which the same work will cost more to do later than it would cost to do now (including increased cost over time).
The SEI Architecture Practices team has been one of the pioneers in advancing the research agenda regarding technical debt. In addition to our ongoing research and industry work, we have also helped to organize the international Managing Technical Debt workshop series. Our early efforts have focused on providing software engineers visibility into technical debt from strategic and architectural perspectives. Our ongoing efforts focus on developing tools and practices for providing comprehensive technical debt detection and visualization for developers, architects and business stakeholders.
One question our work in this field has raised is whether there are practices that move this metaphor beyond a mere communication mechanism. The metaphor is attractive to practitioners because it communicates the idea that if quality problems are not addressed, things may get worse. Is there more to it than that?
Existing studies of technical debt have largely focused on code metrics and small surveys of developers. Practitioners currently define technical debt broadly as a "shortcut for expediency" and, more specifically, as bad code or inadequate refactoring. The initial definition, from Ward Cunningham, referred to the debt incurred because "first-time code" would ship with a limited understanding of the true nature of the problem. But is there more to technical debt than bad code?
This blog post reports on our survey of 1,831 participants, primarily software engineers and architects working on long-lived, software-intensive projects at three large organizations, and on follow-up interviews with seven of those software engineers.
Approach and Demographics
We piloted and then released a survey consisting of approximately 20 questions. You can find our survey instrument here. The seven follow-up interviews took 45 minutes each. We used coding, a well-developed qualitative research technique for categorizing concepts in text, to classify open-ended answers. Some details about our survey include the following:
- Respondents had, on average, six or more years of experience; one third had more than 15 years.
- Roles selected included developers (42 percent), system engineers (7 percent), QA/testers (7 percent), project leads/managers (32 percent), architects (7 percent) and other (6 percent).
- There were 39 separate business units represented among the three companies, covering a broad set of domains, from scientific computing, to command and control, to business information systems, to embedded software.
- Most projects were web systems (24 percent) or embedded systems (31 percent).
- Projects generally consisted of 10 to 20 people, although 32 percent had fewer than 9 staff (including contractors and business staff).
- The systems averaged 3 to 5 years old, but a significant number (29 percent) were more than 10 years old.
- The systems were typically between 100 KLOC and 1 MLOC in size.
- Most respondents used Scrum (33 percent) or incremental development methods (20 percent), but some were using self-admitted waterfall (15 percent) and some had no methodology (17 percent).
The remainder of this post details our three research questions, the motivations behind each, and what our team learned from the responses.
First Question: Usefulness of the Metaphor
Our first research question asked:
Is there a commonly shared definition of technical debt among professional software engineers?
Our team selected this question primarily because the practice of managing technical debt is still in its infancy. Too often, software developers will approach a manager and indicate that technical debt has been incurred and ask for money to fix it. Our results confirm the widely held belief that neither developers nor their managers share a clear understanding of exactly what is meant by the metaphor and what it means for their project. The exception is a shared understanding that poor architectural choices may generate technical debt.
We asked participants to rank statements using a five-point scale from strongly disagree to strongly agree. You can see some of those statements in the figure below: 79 percent agree or strongly agree that "lack of awareness is a problem" and 71 percent that "technical debt implies dealing with both principal and interest." These responses suggest that there is widespread agreement on high-level aspects of the technical debt metaphor, including some popular financial extensions of the metaphor, such as the notion of principal, interest, and the need for payback.
Perceptions of Technical Debt
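The principal-and-interest extension of the metaphor that respondents agreed with can be made concrete with a little arithmetic. The sketch below is only an illustration: the `DebtItem` record, its field names, the numbers, and the assumption that principal and interest stay constant over time are all hypothetical and do not come from the survey.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """Hypothetical record for one technical debt item (all fields are illustrative)."""
    name: str
    principal_days: float             # effort needed to fix the shortcut today
    interest_days_per_release: float  # extra effort the shortcut adds to every release


def cost_of_deferring(item: DebtItem, releases: int) -> float:
    """Total effort if the fix is postponed for `releases` releases.

    Interest is paid on every release that ships with the shortcut in place,
    and the principal is still owed at the end. Assumes both stay constant.
    """
    return item.interest_days_per_release * releases + item.principal_days


# Made-up numbers, used only to show the shape of the tradeoff.
item = DebtItem("non-modular first release", principal_days=10, interest_days_per_release=3)
fix_now = cost_of_deferring(item, 0)    # principal only
fix_later = cost_of_deferring(item, 4)  # principal plus four releases of interest
print(fix_now, fix_later)
```

In practice the interest on a debt item often grows rather than staying flat, which is one reason a shared, more precise vocabulary matters when developers ask for time to pay debt down.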
As the figure below demonstrates, some of the most commonly-occurring concepts (such as Awareness, Interest, and Time Pressure) on the open-ended questions were similarly high-level. For example, we assigned the concept Interest to the definition "extra effort in projects which is not required for purely technical reasons." These abstract concepts lack the detail for delineating the source of technical debt from the causes and consequences. Less common were answers pointing to the source such as "Code that has been incrementally developed over the years that is now so complicated ..." or "bugs and crash-downs."
Coding Frequency for Open-Ended Questions
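As a rough illustration of how coding frequencies like these can be tallied once a human coder has assigned concepts to each answer, consider the sketch below. Only the first answer and its Interest code come from our study; the other two entries, and the function name `code_frequencies`, are placeholders invented for the example.

```python
from collections import Counter


def code_frequencies(coded_answers):
    """Count how many answers each code was assigned to.

    `coded_answers` maps an answer's text to the list of codes a human coder
    gave it. An answer counts at most once per code, even if the code is repeated.
    """
    counts = Counter()
    for codes in coded_answers.values():
        counts.update(set(codes))
    return counts


coded_answers = {
    # Taken from the study: this definition was coded as Interest.
    "extra effort in projects which is not required for purely technical reasons": ["Interest"],
    # Placeholder entries, invented purely to exercise the tally.
    "illustrative answer A": ["Awareness", "Time Pressure"],
    "illustrative answer B": ["Time Pressure", "Time Pressure"],
}

frequencies = code_frequencies(coded_answers)
for code, count in frequencies.most_common():
    print(code, count)
```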
Our survey responses and follow-up interviews revealed that architecture was commonly seen as a major source of technical debt, which informed our second research question.
Second Question: Architecture Choices
Our second research question asked:
Are issues with architectural elements (such as module dependencies, external dependencies, external team dependencies, architecture decisions) among the most significant sources of technical debt?
Throughout our research into technical debt, we have seen multiple instances where the management of technical debt needed to extend beyond coding issues and focus on architecture issues. For example, one respondent mentioned that some initial hacking had resulted in the abuse of a communication protocol for diagnosis and monitoring, resulting in poor extensibility and high maintenance costs. Another common example is where less modular design in the first release due to time constraints affects subsequent releases. Additional functionality could not be added later without doing extensive refactoring. This refactoring impacted future timelines and introduced additional bugs.
This question was motivated by our previous research into technical debt, as well as the high percentage of responses to the previous question that answered with "architecture choice".
For the second research question, we asked participants to rank a randomly ordered list of 14 choices (shown in the image below) "with respect to the amount of debt (1=high, 14=low) they represent on this project." These choices reflect different possible sources, including code, requirements, and architecture, that emerged from a workshop series we have helped organize, detailed in this paper. The image below shows that architecture choice predominates here.
The image shows a stacked bar for each choice, and the total height of each bar reflects the number of survey respondents who selected that choice as either the first-, second-, or third-highest amount of debt on their project.
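The aggregation behind such a stacked bar can be sketched as follows. This is not our analysis script: the function name, the choice labels other than "architecture choice," and the three sample rankings are invented for illustration, and a real ranking would cover all 14 choices.

```python
from collections import defaultdict


def top_three_counts(rankings):
    """Count, per choice, how often it was ranked first, second, or third.

    `rankings` is a list with one dict per respondent, mapping each choice
    to its rank (1 = highest debt). Returns {choice: [n_first, n_second, n_third]}.
    """
    counts = defaultdict(lambda: [0, 0, 0])
    for ranking in rankings:
        for choice, rank in ranking.items():
            if 1 <= rank <= 3:
                counts[choice][rank - 1] += 1
    return dict(counts)


# Invented sample data: three respondents, four choices instead of 14.
rankings = [
    {"architecture choice": 1, "choice B": 2, "choice C": 3, "choice D": 4},
    {"architecture choice": 2, "choice B": 1, "choice C": 4, "choice D": 3},
    {"architecture choice": 1, "choice B": 3, "choice C": 2, "choice D": 4},
]

counts = top_three_counts(rankings)
# The total bar height is the sum of the three stacked segments.
bar_heights = {choice: sum(segments) for choice, segments in counts.items()}
print(counts)
print(bar_heights)
```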
Our examples offer cases different from "bad code," since decisions are taken earlier and involve more strategic design. For example: "The work that we're doing now to introduce a service layer and also building some clients using other technology is an example of, you know, decisions that could have been done at an earlier stage if we had had more time and had the funding and the resources to do them at the time instead of doing it now."
We see a similar architecture focus in our interviews. Five of the seven participants told stories about architecture choices made under a heavy emphasis on fast delivery of features and a limited budget. These choices were framed in terms of development deviating from an important architectural decision (in the form of a pattern or application framework) that was no longer followed.
One participant offered an example of the model-view-controller pattern: "In retrospect we put messaging/communication ... in the wrong place in the model view controller architecture which limited flexibility. The correct implementation would put it at the model layer (supporting communication interaction between models) rather than at the presentation layer. As a result modifying or adding new roles requires more work than it should."
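To make the participant's point more tangible, here is a minimal sketch of what placing communication at the model layer might look like. It is not the participant's system: the class names, the "status" topic, and the role names are all hypothetical, and the publish/subscribe bus is just one possible way to let models talk to each other.

```python
class MessageBus:
    """Minimal publish/subscribe channel owned by the model layer."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers.get(topic, []):
            handler(payload)


class RoleModel:
    """Model for one role; it exchanges messages with other models via the bus."""

    def __init__(self, name, bus):
        self.name = name
        self.inbox = []
        self._bus = bus
        bus.subscribe("status", self.inbox.append)

    def report(self, text):
        self._bus.publish("status", f"{self.name}: {text}")


class RoleView:
    """Presentation layer: renders model state and knows nothing about messaging."""

    def render(self, model):
        return f"[{model.name}] " + " | ".join(model.inbox)


bus = MessageBus()
operator = RoleModel("operator", bus)
auditor = RoleModel("auditor", bus)  # adding a role touches only the model layer
operator.report("ready")
print(RoleView().render(auditor))
```

Because the view never sends or receives messages, a new role can be introduced by adding a model, which is the flexibility the participant said was lost when messaging lived in the presentation layer.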
While architecture choices were the greatest source of technical debt, dealing with that debt was more problematic. This problem stems from the long life span of many of these projects and a drift from the original decisions, designs, and documentation. For example, "There were some problems in the infrastructure code where there was originally an architecture in place, but it wasn't necessarily followed consistently. ... So thought had been given to that, but in the implementation... shortcuts were taken and dependencies were not clean." One implication of this drift from original designs is a need for better monitoring of decisions and approaches.
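One simple form such monitoring could take is a conformance check that compares the dependencies actually present in the code against those the intended architecture allows. The sketch below assumes a hypothetical three-layer design and a hand-written dependency list; in a real project the dependencies would be extracted by an analysis tool rather than typed in.

```python
# Intended architecture (hypothetical): each layer may depend only on the layers listed.
ALLOWED = {
    "presentation": {"application"},
    "application": {"domain"},
    "domain": set(),
}


def find_violations(allowed, actual_dependencies):
    """Return the (source, target) pairs that the intended architecture does not permit."""
    return sorted(
        (source, target)
        for source, target in actual_dependencies
        if source != target and target not in allowed.get(source, set())
    )


# Dependencies as they exist in the code (invented for this example).
actual = [
    ("presentation", "application"),
    ("application", "domain"),
    ("presentation", "domain"),  # shortcut that skips a layer
    ("domain", "application"),   # upward dependency
]

violations = find_violations(ALLOWED, actual)
for source, target in violations:
    print(f"{source} -> {target} is not allowed by the intended architecture")
```

Running a check like this on every build would surface drift as it happens instead of years later, although deciding which violations are worth fixing remains a judgment call.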
Third Question: Managing Technical Debt
Our final research question asked:
Are there practices and tools for managing technical debt?
In asking this question, our team hoped to gain a greater understanding of what tools and practices organizations used in managing technical debt and whether those tools and practices were effective. A majority of the organizations that we interviewed rely primarily on code-level issue trackers.
Our survey revealed few systemic management practices: 65 percent of respondents had no defined technical debt management practice. Of the remaining respondents, 25 percent managed it at the team level. Although there is no explicit standard approach for managing technical debt, some management of technical debt takes place within existing processes. For example, 60 percent of respondents track technical debt as part of risk processes or backlog grooming.
We asked about tool use, and 41 percent do not use tools for managing technical debt (26 percent have no opinion; only 16 percent thought tools were giving appropriate detail). For our question concerning who is aware of technical debt, our respondents (most of whom are developers, architects, or program managers) said executives and business managers were largely unaware (42 percent), and only 10 percent said their business managers were actively managing technical debt.
Specific technical debt tools were rarely used to manage architectural issues, owing to the complexity of configuring them or interpreting results. We collated responses to an open-ended question on tools into the most-frequently cited tool categories, seen in the figure below.
Tool Use as a Percentage of Total Answers. None and Unknown Excluded.
Issue trackers, which include tools such as Redmine, Jira, and Team Foundation Server, were the most prevalent (28 percent). After that, no tool category exceeded 11 percent, including dependency analysis (e.g., SonarQube, Understand), code rule checking (e.g., Cppcheck, FindBugs, SonarQube), and code metrics (e.g., SLOCCount).
Our survey revealed that software practitioners agree on the usefulness of the metaphor, notwithstanding different interpretations of what constitutes technical debt in particular contexts. There is consensus on McConnell's definition of "a design or construction approach that is expedient in the short term."
Our data and analysis strongly support the conclusion that the leading sources of technical debt are architectural choices. Architectural choices and their design implications take many years to evolve and, consequently, are hard to plan and fund. It is vital to manage the drift between the preliminary understanding of the problem and the current understanding of the problem, since this drift will have important implications for the solution. This situation is what Cunningham means by "shipping first-time code is like going into debt."
Developers perceive management as unaware of technical debt issues, and they desire standard practices and tools to manage technical debt that do not currently exist.
We suggest that research in technical debt tooling focus on monitoring the gap between development and architecture, improving ongoing architecture analysis and conformance. Tooling is a necessary component of any technical debt management strategy. We are investigating use of the technical debt timeline as a way to map discovered technical debt issues to guide a management strategy. The next post in this series will explore this idea of a timeline in more detail.
Portions of this post were taken from our research paper, to be presented at the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE) in September. More details on the methodology used and analyses conducted can be found there.
This survey is part of a wider SEI effort on technical debt, including an ongoing research effort.
For other posts in our technical debt series, please click here.