Developing an Architecture-Focused Measurement Framework for Managing Technical Debt
Managing technical debt, which refers to the rework and degraded quality resulting from overly hasty delivery of software capabilities to users, is an increasingly critical aspect of producing cost-effective, timely, and high-quality software products. A delicate balance is needed between the desire to release new software capabilities rapidly to satisfy users and the desire to practice sound software engineering that reduces rework.
A previous post described the practice of strategically managing technical debt related to software architecture, which involves deliberately postponing implementation of some architectural design choices to accelerate delivery of the system today and then rearchitecting at a later time. This blog post extends our prior post by discussing how an architecture-focused analysis approach helps manage technical debt by enabling software engineers to decide the best time to rearchitect--in other words, to pay down the technical debt.
Our architecture-focused approach for managing technical debt is part of the SEI's ongoing research agenda on Agile architecting, which aims to improve the integration of architecture practices within Agile software development methods. This project is investigating which measures a software development team can apply to monitor changing qualities of software, such as degrading modifiability and extensibility, at each iteration of an iterative and incremental lifecycle such as Agile. We initially investigated a particular metric--propagation cost--that measures the percentage of system elements affected when a change is made to a randomly chosen element.
A high propagation cost is an indication of tight coupling: when a change is made in the system, many parts of the system will be affected. We focused on propagation cost because of the rich set of existing static analysis techniques that evaluate code and design quality by measuring software coupling and cohesion, such as whether there are cycles within parts of the software system, whether there is code duplication, and so on. Most existing static analysis techniques focus on code quality and code-level technical debt. For example, a high percentage of duplicate code and cycles in the code indicates a high level of technical debt. In contrast, we applied propagation cost at the architecture level to calculate the impact of dependencies among architectural elements, rather than calculating every dependency between different classes. The goal of this approach is to reduce complexity and provide insights, even when an implementation is not complete.
Our work explores the relationship between propagation cost and technical debt. In particular, we use propagation cost as one indication of increasing technical debt. We assess the potentially increasing rework--which is effectively the impact of paying back technical debt--based on monitoring increasing propagation cost of the system.
Reasoning about quality by modeling rework as a proxy for technical debt requires objective--and repeatable--representation of architectural properties (such as module dependencies and changing interfaces) for the model to work. We therefore modeled the dependencies of the architectural elements by means of a technique called design structure matrices (DSMs). DSMs can be used to visualize which elements use or depend on others at each iteration and to calculate propagation cost.
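The calculation can be illustrated with a minimal sketch. It assumes one common formulation of propagation cost: the density of the DSM's transitive closure, with each element counted as affecting itself. The function names and the three-element example are hypothetical and are not taken from the case study.

```python
def transitive_closure(dsm):
    """Return a reachability matrix for a square 0/1 dependency matrix.

    dsm[i][j] == 1 means element i uses (depends on) element j.
    Each element is treated as reaching itself (an assumption of this sketch).
    """
    n = len(dsm)
    reach = [[bool(dsm[i][j]) or i == j for j in range(n)] for i in range(n)]
    # Warshall's algorithm: allow paths that pass through element k.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def propagation_cost(dsm):
    """Fraction of (element, element) pairs linked directly or indirectly."""
    n = len(dsm)
    if n == 0:
        return 0.0
    reach = transitive_closure(dsm)
    return sum(sum(row) for row in reach) / (n * n)


# Hypothetical chain of three elements: A uses B, and B uses C.
chain = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
print(f"{propagation_cost(chain):.2f}")  # prints 0.67
```

In the chain, A reaches three elements, B reaches two, and C reaches only itself, so six of the nine cells in the closure are filled.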
Our research on the propagation cost metric examined a real-world case study regarding a building automation control system. The research team had at its disposal the software engineering artifacts of the project, including the software architecture, code, functional and quality attribute requirements, and project management plan. Using these artifacts, we generated the design structure matrix of the system at each of the project iterations and completed "what-if" studies. This what-if analysis focused on calculating the rework that accumulates when functionality and architectural tasks are allocated to iterations in different orders. We applied propagation cost measurement to calculate the rework and to assess what the team could have done differently in terms of delivering functionality at different times, and we calculated the overall impact on rework and lifecycle costs. Our goal was to demonstrate how different allocations of both functionality and critical architectural tasks to iterations can enable developers to respond to changes more quickly and use technical debt to their advantage by monitoring the accruing rework.
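A what-if comparison of this kind might be sketched as follows. The sketch only tracks propagation cost iteration by iteration for two alternative delivery plans; it does not reproduce the rework model used in the study. The element names, dependencies, and plans are all hypothetical, and propagation cost is again assumed to be the density of the reachability relation with each element counted as reaching itself.

```python
def propagation_cost(elements, deps):
    """Density of the reachability relation over the delivered elements.

    deps is an iterable of (user, used) pairs; pairs that mention an element
    that has not been delivered yet are ignored.
    """
    elems = list(elements)
    n = len(elems)
    if n == 0:
        return 0.0
    reach = {e: {e} for e in elems}
    for src, dst in deps:
        if src in reach and dst in reach:
            reach[src].add(dst)
    # Repeat until no element gains a new indirectly reachable element.
    changed = True
    while changed:
        changed = False
        for e in elems:
            extra = set()
            for d in reach[e]:
                extra |= reach[d]
            if not extra <= reach[e]:
                reach[e] |= extra
                changed = True
    return sum(len(r) for r in reach.values()) / (n * n)


def cost_per_iteration(plan, deps):
    """Propagation cost of the system as delivered after each iteration."""
    delivered, costs = [], []
    for batch in plan:
        delivered.extend(batch)
        costs.append(propagation_cost(delivered, deps))
    return costs


# Hypothetical dependencies and two hypothetical delivery orders.
deps = [("ui", "logic"), ("logic", "data"), ("report", "data")]
plan_a = [["ui", "logic"], ["data"], ["report"]]   # features first
plan_b = [["data"], ["logic", "report"], ["ui"]]   # infrastructure first

costs_a = cost_per_iteration(plan_a, deps)
costs_b = cost_per_iteration(plan_b, deps)
print(costs_a)
print(costs_b)
```

Both plans deliver the same elements, so they end at the same value; only the trajectory differs. With very few delivered elements the ratio is dominated by each element's self-entry, so the earliest values should be read with caution.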
The results of our studies showed that focusing on architectural dependencies, as well as using propagation cost as a proxy to indicate the level of changing complexity and rework, provided good insight into quantifying technical debt at the architecture level. This insight helps software architects, developers, and managers decide the best time to pay back technical debt or determine if technical debt is accumulating in the first place. To make these measurement and analysis techniques practical, however, they should be integrated seamlessly into engineers' integrated development environments. For example, tools should have the ability to group classes into module view architectural elements and specify design rules, such as one element can or cannot access another element. New generation tools, such as Lattix, Sonargraph, and Structure101, are starting to explore such issues, though there is still room for improvement.
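As an illustration of the kind of design rule mentioned above, the following sketch groups classes into module-view elements and reports class-level dependencies that cross a forbidden module boundary. It is not the API of Lattix, Sonargraph, or Structure101; the class names, module names, and the single rule are hypothetical.

```python
def find_violations(class_deps, module_of, forbidden):
    """Return class-level dependencies that break a module-level rule.

    class_deps: iterable of (user_class, used_class) pairs.
    module_of:  maps each class name to its architectural element.
    forbidden:  set of (user_module, used_module) pairs that are not allowed.
    """
    violations = []
    for src, dst in class_deps:
        if (module_of[src], module_of[dst]) in forbidden:
            violations.append((src, dst))
    return violations


# Hypothetical grouping of classes into three module-view elements.
module_of = {
    "LoginView": "ui",
    "UserService": "logic",
    "UserTable": "data",
}
class_deps = [
    ("LoginView", "UserService"),
    ("UserService", "UserTable"),
    ("LoginView", "UserTable"),   # the ui element reaches past the logic element
]
forbidden = {("ui", "data")}      # rule: ui cannot access data directly

violations = find_violations(class_deps, module_of, forbidden)
print(violations)  # prints [('LoginView', 'UserTable')]
```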
In some instances, architectural dependencies should also be integrated with architectural design decisions. For example, when using a mediator to decouple interfaces from the data model, the mediator communicates with all the interface and data model elements. When applying the propagation cost metric consistently across the system--including all dependencies to and from the mediator, which acts as the controller element--a high propagation cost emerges. This high cost indicates a greater risk of technical debt and change propagation, potentially requiring rework when new features must be added.
Although higher propagation costs are generally associated with higher risk, in this case introducing a mediator to decouple the data model and the interface may be a good architectural decision since it localizes the changes. From a reliability perspective, however, the controller is a single point of failure. So in this case high propagation cost may not necessarily be negative from a modifiability standpoint, but it is still a reliability risk. Our studies revealed that enhancing propagation cost measurements with architectural information provides more insightful analysis of the actual implications of technical debt and rework.
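The mediator example can be made concrete with a small sketch. It assumes two interface elements and two data model elements, assumes that the mediator's dependencies run in both directions, and again treats propagation cost as the density of the reachability relation with each element counted as reaching itself. All names are hypothetical.

```python
from collections import deque


def reachable(start, uses):
    """Elements reachable from start by following 'uses' edges (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in uses.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def propagation_cost(elements, uses):
    """Average fraction of the system reachable from each element."""
    n = len(elements)
    return sum(len(reachable(e, uses)) for e in elements) / (n * n)


def direct_dependents(target, uses):
    """Elements that use target directly."""
    return sorted(e for e, used in uses.items() if target in used)


# Design 1: each interface uses each data model element directly.
direct_elements = ["ui1", "ui2", "model1", "model2"]
direct = {"ui1": {"model1", "model2"}, "ui2": {"model1", "model2"}}

# Design 2: all communication goes through a mediator, in both directions.
mediated_elements = direct_elements + ["mediator"]
mediated = {
    "ui1": {"mediator"},
    "ui2": {"mediator"},
    "model1": {"mediator"},
    "model2": {"mediator"},
    "mediator": {"ui1", "ui2", "model1", "model2"},
}

print(propagation_cost(direct_elements, direct))      # prints 0.5
print(propagation_cost(mediated_elements, mediated))  # prints 1.0
print(direct_dependents("model1", direct))            # prints ['ui1', 'ui2']
print(direct_dependents("model1", mediated))          # prints ['mediator']
```

Under these assumptions the raw metric doubles once the mediator is introduced, yet a change to a data model element now touches only the mediator directly. The number alone does not capture that design intent, which is why the measurement benefits from architectural information.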
Studying rework using propagation cost helps improve the integration of architecture practices within Agile software development methods. For example, when teams are developing software in an Agile context, they typically embrace Scrum as a project-management technique. It is often hard for teams to determine how to subdivide large architectural tasks and allocate them to small two- to four-week sprints. Our research demonstrated that by focusing on iteration-to-iteration analysis--rather than trying to time box distribution of functionality to sprints where each time box/sprint has the same duration--it is possible to show customers how the quality of the software changes with each release, such as how increasing propagation cost could impact rework.
Our next steps are to examine the scalability of assessing rework by focusing on architecture metrics. We developed two real-life case studies using dependency analysis to operationalize the measurement of propagation cost. While our approach works well with 100 to 200 software architecture elements, we are now evaluating how well it scales to a higher number of elements. When our analysis focused on software architecture, as opposed to code, to quantify technical debt, we observed roughly an order-of-magnitude reduction in the dependencies analyzed, from about 200 to two dozen, which is a significant reduction of complexity, and we were still able to pinpoint the potential emerging rework. Architecture-level analysis of technical debt enables a team to gauge the status of the system quickly and decide whether to rework the system. Code-level analysis then enables the team to define specific tasks for developers.
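The reduction that comes from analyzing architectural elements instead of classes can be sketched as follows. Class-level dependencies are lifted to the elements that contain the classes, duplicates collapse, and dependencies inside a single element drop out. The class names, the grouping, and the resulting counts are hypothetical and far smaller than those in the case studies.

```python
def lift_dependencies(class_deps, module_of):
    """Collapse class-level dependencies into element-level dependencies."""
    lifted = set()
    for src, dst in class_deps:
        m_src, m_dst = module_of[src], module_of[dst]
        if m_src != m_dst:  # ignore dependencies internal to one element
            lifted.add((m_src, m_dst))
    return lifted


# Hypothetical classes grouped into three architectural elements.
module_of = {
    "LoginView": "ui",
    "ReportView": "ui",
    "UserService": "logic",
    "AuditService": "logic",
    "UserTable": "data",
    "AuditTable": "data",
}
class_deps = [
    ("LoginView", "UserService"),
    ("ReportView", "UserService"),
    ("ReportView", "AuditService"),
    ("UserService", "AuditService"),   # internal to the logic element
    ("UserService", "UserTable"),
    ("AuditService", "AuditTable"),
    ("AuditService", "UserTable"),
]

lifted = lift_dependencies(class_deps, module_of)
print(len(class_deps), "class-level ->", len(lifted), "element-level")
```

The element-level set is the input for a quick architecture-level assessment, while the original class-level pairs remain available when specific developer tasks need to be defined.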
Our research on an architecture-focused measurement framework for managing technical debt is informed by real-world examples gathered from Technical Debt Workshops. These workshops engage practitioners and researchers in an ongoing dialogue to improve the state of techniques for managing technical debt. The 2011 Managing Technical Debt Workshop, co-located with the International Conference on Software Engineering (ICSE), revealed an increasing interest in managing technical debt proactively. As a result, we will conduct a third workshop, again co-located with ICSE, on June 4, 2012. Our research team will also guest-edit the November/December 2012 issue of IEEE Software on the same theme and is accepting papers until April 1, 2012. We welcome submissions from anyone with experience in this area, either to IEEE Software or to the 3rd International Workshop on Managing Technical Debt.
This research was conducted in collaboration with Dr. Philippe Kruchten, professor of software engineering at the University of British Columbia, and Raghu Sangwan, associate professor of software engineering at Penn State University, and with support from Lattix, a leading provider of software architecture management solutions.
N. Brown, P. Kruchten, R. Nord, and I. Ozkaya. Managing technical debt in software development: report on the 2nd international workshop on managing technical debt, held at ICSE 2011. ACM SIGSOFT Software Engineering Notes 36 (5): 33-35 (2011).
N. Brown, P. Kruchten, R. Nord, and I. Ozkaya. Quantifying the Value of Architecting Within Agile Software Development via Technical Debt Analysis. 2011.
N. Brown, R. Nord, I. Ozkaya, and M. Pais. Analysis and Management of Architectural Dependencies in Iterative Release Planning. In Proceedings of the 2011 Ninth Working IEEE/IFIP Conference on Software Architecture (WICSA '11). IEEE Computer Society, 103-112.