Measuring the Impact of Explicit Architecture Documentation

The SEI has long advocated software architecture documentation as a software engineering best practice. This type of documentation is not particularly revolutionary or different from standard practices in other engineering disciplines. For example, who would build a skyscraper without having an architect draw up plans first? The specific value of software architecture documentation, however, has never been established empirically. This blog post describes a research project we are conducting to measure and understand the value of software architecture documentation for complex software-reliant systems.

Our research is creating architectural documentation for a major subsystem of Apache Hadoop, the Hadoop Distributed File System (HDFS). Hadoop is a software framework used by Amazon, Adobe, Yahoo!, Google, Hulu, Twitter, Facebook, and many other large e-commerce corporations. It supports data-intensive (e.g., petabytes of data) distributed applications with thousands of nodes. The HDFS is a key piece of infrastructure that supports Hadoop by providing a distributed, high-performance, high-reliability file system. Although there are two other major components in Hadoop--MapReduce and Hadoop Common--we are initially focusing our efforts on HDFS because it is of manageable size and we have access to two of its lead architects.

The HDFS software has virtually no architectural documentation, that is, documentation that expresses the strategies and structures for predictably achieving system-wide quality attributes, such as modifiability, performance, availability, and portability. This project has thus become our "living laboratory" where we can change one variable (the existence of architectural documentation) and examine the effects of this change. We have enumerated a number of research hypotheses to test, including:

product quality will improve because the fundamental design rules will be made explicit,
more users and developers will become contributors and committers to HDFS because it will enable them to more easily learn the framework and thus make useful contributions, and
process effectiveness will improve because more developers will be able to understand the system and work independently.

We will measure a number of project characteristics before and after the introduction of the documentation, where the "before" state serves as the control for our experiment.
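As a rough illustration of a before/after comparison with the "before" state as control, the following minimal Python sketch splits periodic snapshots of a single measure at the documentation release date and averages each side. The `Snapshot` structure, the function name, the dates, and the numbers are all hypothetical and made up for illustration; this is not the project's actual analysis, which may well use more careful statistics.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class Snapshot:
    """One periodic observation of a project measure (hypothetical structure)."""
    taken_on: date
    value: float


def before_after_means(snapshots, release_date):
    """Split snapshots at the documentation release date and average each side.

    Snapshots taken before the release date form the control group.
    """
    before = [s.value for s in snapshots if s.taken_on < release_date]
    after = [s.value for s in snapshots if s.taken_on >= release_date]
    if not before or not after:
        raise ValueError("need at least one snapshot on each side of the release date")
    return mean(before), mean(after)


# Made-up illustrative numbers, not project data.
snapshots = [
    Snapshot(date(2011, 1, 1), 10.0),
    Snapshot(date(2011, 2, 1), 12.0),
    Snapshot(date(2011, 7, 1), 14.0),
    Snapshot(date(2011, 8, 1), 16.0),
]
control_mean, treated_mean = before_after_means(snapshots, date(2011, 6, 1))
print(control_mean, treated_mean)
```

A difference between the two means is only suggestive on its own; in a real study, trends that were already under way before the release would need to be accounted for.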

We believe the insights gained from this project will be valuable and generalizable because Hadoop exemplifies the types of systems in broad use within the commercial and defense domains. For example, Facebook depends on Hadoop to manage the huge amount of data shared amongst its users. Likewise, the DoD and Intelligence Community use Hadoop to leverage large-scale "core farms" for various "processing, exploitation, and dissemination" (PED) missions. Whether or not the existence of architectural documentation yields benefits, the results will help us better influence acquisition policies and development practices for related software-reliant systems.

I, along with my research team--Len Bass, Ipek Ozkaya, Bill Nichols, Bob Stoddard, and Peppo Valetto--have been assisting two of the HDFS's architects in reconstructing, documenting, and distributing architectural documentation for the system. To do this, we initially employed reverse engineering tools, including SonarJ and Lattix, to recover the architecture. This reverse engineering was only partially successful due to limitations of these tools. They are designed to help document the modular structure of a system, which crucially influences modifiability. In HDFS, however, performance and availability are the primary concerns, and the tools offer no insight into the structures needed to achieve those attributes. We have therefore undertaken considerable manual architectural reconstruction by interviewing the architects and carefully reading the code.

After we finish developing and distributing the Hadoop HDFS documentation, we will measure the quality of the code base and the nature of the project, including:

number of defects
defect resolution time
number of new features
number of product downloads
size (lines of code, number of code modules)
number of contributors and committers

These measurements will provide a time series of snapshots that serves as a baseline. We will continue to track them after the introduction of the (shared, publicly available, widely disseminated) architecture documentation to determine how the metrics change over time. We will also conduct qualitative analysis (via questionnaires) to understand how the documentation is being embraced and employed by architects and developers. We will examine the impact of the documentation on the developers' interactions, specifically how it affects their social network as represented by their email contributions to project mailing lists and the comments they make in the project's issue tracking system (Jira). Finally, we will interview key HDFS developers--both contributors and committers--after the introduction of the architecture documentation to gather insights into their perspective on the usability and understandability of the HDFS code base.
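One way to picture a social network derived from mailing-list traffic is as a graph of who replies to whom. The Python sketch below is a minimal illustration under stated assumptions: each message has already been reduced to an author and the id of the message it replies to, and the names, message ids, and functions are hypothetical. It is not the method used in this study.

```python
from collections import Counter


def reply_network(messages):
    """Count directed reply edges (replier -> author of the message replied to).

    `messages` maps a message id to (author, id of the parent message or None).
    """
    edges = Counter()
    for author, parent_id in messages.values():
        parent = messages.get(parent_id)
        if parent is None:
            continue  # thread starter, or parent missing from the archive
        parent_author = parent[0]
        if parent_author != author:  # ignore self-replies
            edges[(author, parent_author)] += 1
    return edges


def degree(edges):
    """Number of distinct people each participant has exchanged replies with."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {person: len(others) for person, others in neighbours.items()}


# Made-up thread for illustration only.
messages = {
    "m1": ("alice", None),
    "m2": ("bob", "m1"),
    "m3": ("carol", "m1"),
    "m4": ("alice", "m2"),
    "m5": ("bob", "m4"),
}
edges = reply_network(messages)
degrees = degree(edges)
print(dict(edges))
print(degrees)
```

Computing such a graph for the periods before and after the documentation is released would give two networks to compare, for example by the number of participants or by how many distinct people each developer interacts with.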

This project is a longitudinal study, which involves repeated observations of the same items over a period of time. It will take time for the architectural documentation to become known and used, so its effects may not show up right away in the metrics we are collecting. Likewise, after the documentation is distributed, it may take a while for it to be assimilated into the Hadoop developer culture; only after that point will we be able to measure whether it has made an impact. Within a year, however, we expect to report on the metrics we gathered, as well as qualitative results from surveys and interviews of HDFS developers. Based on this information, we will produce a paper describing our methodology and the results of creating the documentation.

Many of the systems that rely on Hadoop are highly complex, with millions of users and emergent behavior. Such systems have been previously characterized as ULS (Ultra Large Scale) systems. We hope our experiment in understanding the consequences of architectural documentation will advance the SEI's research agenda into ULS systems. We look forward to hearing about your experiences applying architectural documentation to software-reliant systems.

Additional Resources:

For more information about the SEI's architecture documentation methods, please visit
www.sei.cmu.edu/architecture/start/documentation.cfm

For more information about the SEI's work in Ultra Large Scale Systems, please visit
www.sei.cmu.edu/uls/index.cfm

Download the SEI technical report, Creating and Using Software Architecture Documentation Using Web-Based Tool Support
www.sei.cmu.edu/library/abstracts/reports/04tn037.cfm?DCSext.abstractsource=SearchResults
Download the SEI technical report, Architecture Reconstruction Guidelines, Third Edition
www.sei.cmu.edu/library/abstracts/reports/02tr034.cfm
Download the SEI technical report, Architecture Reconstruction Case Study
www.sei.cmu.edu/library/abstracts/reports/03tn008.cfm

Download our research study report, Ultra-Large-Scale Systems: The Software Challenge of the Future
www.sei.cmu.edu/library/abstracts/books/0978695607.cfm

Software Engineering Institute

SEI Blog

Rick Kazman

May 9, 2011
