
Using Scenario-Based Architecture Analysis to Inform Code Quality Measures

Robert Nord and Ipek Ozkaya

As the pace of software delivery increases, organizations need guidance on how to deliver high-quality software rapidly, while simultaneously meeting demands related to time-to-market, cost, productivity, and quality. In practice, demands for adding new features or fixing defects often take priority. However, when software developers are guided solely by project management measures, such as progress on requirements and defect counts, they ignore the impact of architectural dependencies, which can impede the progress of a project if not properly managed. In previous posts on this blog, my colleague Ipek Ozkaya and I have focused on architectural technical debt, which refers to the rework and degraded quality resulting from overly hasty delivery of software capabilities to users. This blog post describes a first step towards an approach we developed that aims to use qualitative architectural measures to better inform quantitative code quality metrics.

Technical debt is an increasingly critical aspect of producing cost-effective, timely, and high-quality software products. Recently, our research has focused on going beyond debt as a metaphor to investigate which measures a software development team can apply to monitor the changing qualities of software effectively. These measures can draw on code quality metrics when the goal is to optimize development-time qualities.

Existing code measures alone, however, do not provide insight into overall architectural improvements due to the increasing complexity and context dependencies of software systems. We must investigate a range of measures to provide a multi-view architectural perspective of design time, run-time, and deployment time qualities. The goal of our research is to provide an architectural measurement framework that can assist in monitoring and improving high architectural risk areas of a system.

Informing Quantitative Metrics with Qualitative Measures

Developers can apply off-the-shelf tools (such as Lattix, SonarGraph, SonarQube, and Structure101) that rely on code metrics (such as stability, coupling, cohesion, cyclicity, and complexity) to understand architectural dependencies for change-impact or rework analysis. These metrics are often helpful in improving code quality and can provide structural information about architectural dependencies and modifiability. Recent research has demonstrated, however, that such metrics fall short of providing overall architectural guidance when used as they are. The question we ask is whether the relevance and use of such metrics can be improved so that they provide architectural guidance as well.
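To make the kind of metric discussed here concrete, the following is a minimal sketch of three textbook-style measures computed over a dependency graph: afferent and efferent coupling, the instability ratio (efferent coupling divided by total coupling), and a simple cyclicity figure (the share of modules that sit on a dependency cycle). These formulations are assumptions chosen for illustration. The tools named above may define their metrics differently, and the module names are hypothetical.

```python
from collections import defaultdict

# Hypothetical dependency edges: (source, target) means "source depends on target".
DEPS = [
    ("gateway", "adapter"),
    ("adapter", "gateway"),
    ("gateway", "common"),
    ("adapter", "common"),
    ("ui", "gateway"),
]


def coupling(deps):
    """Return {module: (afferent, efferent)} coupling counts."""
    incoming = defaultdict(set)  # modules that depend on the key
    outgoing = defaultdict(set)  # modules the key depends on
    for src, dst in deps:
        outgoing[src].add(dst)
        incoming[dst].add(src)
    modules = set(incoming) | set(outgoing)
    return {m: (len(incoming[m]), len(outgoing[m])) for m in modules}


def instability(afferent, efferent):
    """Instability ratio: 0.0 is maximally stable, 1.0 is maximally unstable."""
    total = afferent + efferent
    return efferent / total if total else 0.0


def reachable(deps, start):
    """Return every module reachable from `start` by following dependencies."""
    graph = defaultdict(set)
    for src, dst in deps:
        graph[src].add(dst)
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def cyclicity(deps):
    """Fraction of modules that can reach themselves, i.e. lie on a cycle."""
    modules = {m for edge in deps for m in edge}
    in_cycle = {m for m in modules if m in reachable(deps, m)}
    return len(in_cycle) / len(modules)


if __name__ == "__main__":
    for module, (ca, ce) in sorted(coupling(DEPS).items()):
        print(module, ca, ce, round(instability(ca, ce), 2))
    print("cyclicity:", cyclicity(DEPS))
```

In this toy graph the gateway and adapter modules depend on each other, so half of the four modules lie on a cycle. Numbers like these describe structure, but on their own they do not say which dependencies matter architecturally.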

To address this question, we developed an approach to contextualize and focus the application of dependency analysis and architecture-relevant code quality and system modifiability metrics using architecture evaluations. Scenario-based architecture analysis offers a broad understanding of how a software-reliant system evolves over time and can form a basis for assessing the amount of rework that may be necessary in the foreseeable future. Using the architectural risks identified during scenario-based architecture analysis, we clarified the level of system decomposition where code quality metrics reveal relevant information.

As outlined in our January 2012 blog post, Developing An Architecture Focused Measurement Framework for Managing Technical Debt, our research on this topic is informed by real-world examples gathered from technical debt workshops. That blog post, authored by my colleague and co-author, Ipek Ozkaya, noted that an architecture-focused analysis approach helps manage technical debt by enabling software engineers to decide the best time to re-architect, thereby reducing the technical debt.


Our earlier work is part of an ongoing SEI research agenda to improve the integration of architecture practices within agile software development methods. To test our approach, we evaluated CONNECT, which is an open-source software system to exchange patient information among different healthcare providers at the local and national level.

The developers of CONNECT use Scrum as their agile project management approach. They hold biweekly sprints (117 as of late 2013) and periodically release updates to the system (typically every quarter). Code development is outsourced and takes place in different offices. Periodic code sprints bring all the developers together to synchronize their work.

We chose to focus a portion of our research in the context of CONNECT since a team of SEI researchers was asked in November 2011 to complete an evaluation of the system that focused on quality attribute goals using the Architecture Tradeoff Analysis Method (ATAM). This analysis yielded a list of potential risks that the project needed to address. Among the risks cited, researchers referenced Adapter/Gateway separation:

CONNECT initially separates the handling of messages from the integration with other systems. However, these roles have become confused over time, and it is not clear how the roles should be separated.

The remainder of this post focuses on this risk theme that the ATAM identified as an area of major concern.

While the ATAM provided a list of risks the project needed to address, we considered it to be a point-in-time representation of the project. A common response is to focus on short-term fixes at the expense of underlying causes for those risks, which are typically architectural in nature. In our examination of CONNECT, we tried to understand how architectural scenarios, developed during the architectural evaluation, impacted project narrative in the next major release, which included changes implemented as a result of the ATAM.

In the case of CONNECT, we examined the JIRA issue tracker, which contained the sprint and product backlogs. We looked at the feature requests and improvements in the backlog both before and after the ATAM and saw that the evaluation influenced what the development team worked on: there was a 22 percent increase in risk-related issues created after the ATAM was conducted. The CONNECT system underwent a significant re-architecting effort to reduce the dependency between the Adapter and the Gateway.

Next, we sought to reconcile the risk themes identified in the ATAM with the dependencies extracted from the code and the automatically generated code quality measures. To analyze whether common modularity code metrics reflect the impact of changes related to the architectural risks and recommendations, we compared the baseline version of CONNECT with the next version released after the developers re-architected the Adapter/Gateway dependency to mitigate the risk. We analyzed the code at three levels of decomposition in the code hierarchy:

  • system-level decomposition. We analyzed the code of the entire CONNECT system, including middleware and all of its third-party dependencies. Overall, the modifiability metrics show improvements in atom count, internal dependencies, average impact, system stability, connectedness, connected strength, coupling, coupling strength, and system cyclicity.
  • software-level decomposition. We analyzed the code from gov.hhs, including packages, interfaces, and classes for CONNECT middleware, with the exception of third-party libraries. The metrics show that the number of atoms and internal dependencies decreased, indicating that the package contained fewer code artifacts. System cyclicity and connectedness increased, indicating a higher likelihood of change. Given that system stability measured 98 percent, the system appears to be highly stable regardless of the architecture change at the package level of decomposition. For this case, where the stability metric did not change, a lower connected strength in the next release suggests the design would be architecturally less sensitive to change.
  • package-level decomposition. At this lowest level of decomposition, we focused on the document query package gov.hhs.fha.nhinc.docquery, which reflected the architectural changes based on the Adapter/Gateway integration risk. At this level of decomposition, the system size is significantly reduced, as is the average impact. The lower average impact indicates that changes to the document query system will be less likely to ripple through the rest of the system. While system stability improved at this level, we noted a decline in connectedness and connected strength.
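The three levels of decomposition above can be sketched as a filter on fully qualified names followed by the same metric computation at each level. The sketch below assumes one simple formulation: the impact of an element is the number of other elements that transitively depend on it, and stability is one minus the average impact divided by the number of other elements. This is an illustrative assumption, not the definition used by any particular tool. Only the gov.hhs and gov.hhs.fha.nhinc.docquery prefixes come from the analysis; the class names, the common package, and the org.example.lib library are hypothetical.

```python
from collections import defaultdict

DQ = "gov.hhs.fha.nhinc.docquery."
COMMON = "gov.hhs.fha.nhinc.common.Message"  # hypothetical shared class

# Hypothetical code elements (class names are invented for illustration).
ELEMENTS = [
    "org.example.lib.Logger",
    "org.example.lib.XmlParser",
    COMMON,
    DQ + "Gateway",
    DQ + "Adapter",
    DQ + "Util",
]

# (source, target) means "source depends on target".
DEPS = [
    (DQ + "Gateway", DQ + "Adapter"),
    (DQ + "Adapter", DQ + "Gateway"),
    (DQ + "Gateway", DQ + "Util"),
    (DQ + "Gateway", COMMON),
    (DQ + "Adapter", COMMON),
    (COMMON, "org.example.lib.Logger"),
]


def scope(elements, deps, prefix):
    """Restrict the graph to elements whose qualified name starts with prefix."""
    kept = [e for e in elements if e.startswith(prefix)]
    keep = set(kept)
    return kept, [(s, d) for s, d in deps if s in keep and d in keep]


def average_impact(elements, deps):
    """Mean number of other elements that transitively depend on an element."""
    dependents = defaultdict(set)
    for src, dst in deps:
        dependents[dst].add(src)
    total = 0
    for element in elements:
        seen, stack = {element}, [element]
        while stack:
            for parent in dependents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        total += len(seen) - 1  # do not count the element itself
    return total / len(elements)


def stability(elements, deps):
    """Assumed formulation: 1 - average impact / (number of other elements)."""
    n = len(elements)
    if n < 2:
        return 1.0
    return 1.0 - average_impact(elements, deps) / (n - 1)


LEVELS = {"system": "", "software": "gov.hhs.", "package": DQ}
RESULTS = {
    name: stability(*scope(ELEMENTS, DEPS, prefix))
    for name, prefix in LEVELS.items()
}

if __name__ == "__main__":
    for name, value in RESULTS.items():
        print(f"{name:8s} stability = {value:.2f}")
```

With this toy data, the same Gateway/Adapter cycle produces a different stability value at each level, which is why the choice of scope has to be made deliberately rather than left to a tool default.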

The stability metric, which summarizes a system's overall sensitivity to change, reliably reflected the state of the architecture when we focused it on the selected decomposition level of the CONNECT system and excluded the rest of the system. Navigating down the hierarchy in this manner allowed us to see where the code metrics showed significant change, thereby indicating whether the architecture had improved or deteriorated during its evolution. Spotting areas of rework requires understanding the context provided by the architecture and the quality concerns that influence its evolution. When focused on the problematic elements of the system, at a suitable level of decomposition in the package structure, selected code-based metrics for assessing ripple effects reflect the architectural improvements where rework has been done.

When we included the entire system in our stability assessment, however, the measured system stability was close to 100 percent, because the large number of elements gave a false indication of a stable system. We found that applying existing metrics consistently requires the ability to choose appropriate elements of the system at a suitable level of decomposition, because dependency analysis is quite sensitive to the size of the graph and its context, as our results demonstrated.
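The sensitivity to graph size can be illustrated with a small calculation. The sketch assumes the same simplified formulation of stability as one minus the average impact divided by the number of other elements, which for n elements with a total impact of T works out to 1 - T / (n * (n - 1)). The core size and total impact below are hypothetical numbers, not measurements from CONNECT.

```python
def stability(total_impact, n):
    """Assumed formulation: 1 - (total_impact / n) / (n - 1)."""
    return 1.0 - total_impact / (n * (n - 1))


CORE_SIZE = 3          # hypothetical tightly coupled core
CORE_TOTAL_IMPACT = 4  # hypothetical sum of per-element impacts in the core


def diluted(extra):
    """Stability after adding `extra` isolated elements to the analyzed graph.

    Isolated elements have no dependencies in either direction, so they add
    nothing to the total impact. Only the element count grows.
    """
    return stability(CORE_TOTAL_IMPACT, CORE_SIZE + extra)


if __name__ == "__main__":
    for extra in (0, 7, 97):
        print(f"{CORE_SIZE + extra:3d} elements -> stability {diluted(extra):.4f}")
```

Under these assumptions the core scores about 0.33 on its own, but the score climbs above 0.999 once 97 unrelated elements are analyzed alongside it, even though the core itself has not changed.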

Looking Ahead

Our research aims to bring the architecture analysis and developer environments closer together. Our goal is to create repeatable analysis and validation on metrics that provide architectural information and scale to systems of realistic size.

On another front, we have joined forces with members of the architecture and metrics community to host a workshop on software architecture metrics. Our aim is to improve measurement techniques for architecture that yield reliable, consistent, and repeatable results by

  • discussing progress on architecture metrics, measurement, and analysis
  • gathering empirical evidence on the use and effectiveness of metrics
  • identifying priorities for future research

The workshop, which will be held in April 2014, will bring together a cross-section of experts in academia and industry in the areas of dependency analysis, architecture metrics, analysis and evaluation, software analytics, empirical software engineering and measurement.

Additional Resources

For more information about the First International Workshop on Software Architecture Metrics, which will be held April 7, 2014 in conjunction with the Working IEEE/IFIP Conference on Software Architecture (WICSA), or to submit a paper, please visit
