Shifting from Software Sustainment to Software Engineering in the DoD

In our work with government acquisition programs, we have observed a trend: organic software sustainment organizations are expanding beyond their traditional purview of software maintenance into software engineering and development. As a result, these organizations now also focus on designing and implementing new software architectures and code, rather than just repairing and maintaining legacy software.

Software sustainment and maintenance organizations have a critical role and are in a unique position to provide software updates with ongoing mission impact. The posts in this multi-part series will explore issues that the federal government should address to help organic sustainment teams expand into organic software engineering teams. This post describes key issues that should be addressed by sustainment organizations, sustainment funding activities, and other stakeholders to make this transition successful. These lessons apply to any acquisition program with an interest in building software, whether that program originates from a sustainment background or otherwise.

Motivations for the Shift from Sustainment to Software Engineering

The need for rapid deployment of software-enabled capability, the cost of externally contracted software acquisition, and the costs associated with the proprietary nature of end-state software without clear government-use rights are driving the Department of Defense (DoD) to develop organic software engineering capabilities. These capabilities augment the DoD's traditional software maintenance and sustainment organizations and increasingly rely on government-managed labs and centers to complement defense industrial base (DIB) contractors.

The issues identified in this blog post stem from the SEI's work as a federally funded research and development center (FFRDC) charged with transitioning software engineering knowledge to our stakeholders so that they are better equipped to acquire, develop, operate, and sustain software for strategic national advantage.

The "Think Globally" Mindset of Software Engineering

Work performed by software engineers typically requires a "think-globally-and-act-locally" approach to developing new software systems or capabilities, which are often created from scratch or extend commercial off-the-shelf (COTS) platforms. Software engineers therefore focus on architecture-and-design concerns throughout the entire software system and its lifecycle. They also focus on how these concerns interact, which is a mindset often tied to acquisition requirements (e.g., the shall statements) and mission drivers (e.g., cost containment, rapid fielding of new features, and government-use rights). Working in teams, software engineers ideally weigh and document the nuanced tradeoffs in architecture and quality attributes and make choices that globally impact the system.

Conversely, software sustainment teams typically adopt more of a "think-locally-and-act-locally" approach that has traditionally focused on debugging and software upgrades, but less commonly requires the creation of entire new subsystems. These teams often inherit little or no documentation from the original system developers and may lack the tools and training that help them understand the design-and-implementation rationale of code they maintain. As a result, when software sustainment teams begin expanding into software engineering efforts, they may not have the resources needed to understand the rationale for new features. Likewise, these teams may not be incentivized to focus on architecture and design concerns.

In contrast, these types of concerns are critical areas of focus for software engineers, particularly for long-lived, mission-critical weapons systems. Understanding the rationale for architecture-and-design decisions equips software engineers with a holistic understanding of the entire system's structure and behavior. Unfortunately, this understanding often gets lost after a system moves into the sustainment phase where fewer incentives exist to take this global view.

The expectations and incentives of any software endeavor may create an overly process-centric environment that places undue emphasis on creating checklists and documents and insufficient emphasis on developing quality software products. Within the sustainment phase of a complex software-reliant system, these expectations and incentives may translate into an environment that fails to properly consider product-centric concerns, such as maintainability, reusability, and performance. As a result, integrations are often motivated solely by non-technical considerations (e.g., only schedule or cost) rather than by subtle tradeoffs that also consider software engineering principles.

For example, integrations often remain localized in one or two places within the system, sometimes with opportunistic reuse, and the knowledge behind them resides only inside the heads of the senior architects and developers or is buried deep within the source code (a.k.a. innate knowledge). This approach is risky when key developers and technical leadership change (e.g., via rotations, retirement, or job hopping, which are all common within the public sector) and crucial knowledge is lost. Similarly, reconstructing the canonical architecture-and-design rationale from source code is tedious and error-prone. In particular, production source code often lacks guidance to help downstream maintainers understand key decisions made about the system and comprehend the system as a whole.

This limited system view becomes even more problematic when a software sustainment team is tasked with engineering a new system because architecture design is one of the first tasks that they must undertake in this new role. Fortunately, there are methods, tools, and technologies that can help software sustainment teams gain experience in architecture and design. For example, one benefit of model-based engineering is that it provides sufficiently rich semantics and tools that enable software sustainment teams to analyze a domain model and understand the impact that a proposed change will have on a system's performance and functionality.

Beyond the Binary

By the nature of a software engineering team's work, the criteria that they must account for often differ from those of software sustainment teams. In particular, the sheer volume of criteria is larger and involves working with multiple stakeholders while managing more inputs with respect to quality. For example, software engineers must negotiate tradeoffs, periodic formal reviews, and functional and contractual obligations while accounting for various factors that include (but are not limited to) the following:

system overall performance
scalability
end-to-end quality assurance and acceptance testing

A software sustainment team, in contrast, often faces a large (and ever-growing) backlog of problem reports with a legacy system. Success in this environment is thus commonly measured by the number of bugs that were fixed within a system and whether that effort addressed the problem reports. Resolving issues in a problem report is often a solitary activity that doesn't require team collaboration.

For example, there is often a binary yes/no approach to software sustainment that focuses heavily on whether or not a bug has been fixed. In this context, success is typically measured by the number of issues resolved. In turn, this success criterion may yield less communicative teams that focus largely on individual achievement and the number of bugs each team member fixed in a given work unit (e.g., a day or week).

Conversely, software engineering and design require more than a binary yes/no mentality. Multiple concerns must be considered. For example, does the application of a particular set of software patterns satisfy certain desired quality attributes, such as portability and extensibility, without unduly degrading quantitative quality attributes, such as latency and jitter?
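As a rough illustration of the quantitative side of such a tradeoff, the following Python sketch compares two candidate designs against latency and jitter budgets. Everything in it is an assumption made for illustration: the function names, the sample values, the budget thresholds, and the choice to define jitter as the population standard deviation of latency samples (other definitions are also in use).

```python
import statistics


def latency_stats(samples_ms):
    """Return (mean latency, jitter) for latency samples in milliseconds.

    Jitter is taken here to be the population standard deviation of the
    samples. This is one common convention, assumed for illustration.
    """
    mean = statistics.fmean(samples_ms)
    jitter = statistics.pstdev(samples_ms)
    return mean, jitter


def within_budget(samples_ms, max_mean_ms, max_jitter_ms):
    """Check a candidate design's samples against hypothetical budgets."""
    mean, jitter = latency_stats(samples_ms)
    return mean <= max_mean_ms and jitter <= max_jitter_ms


# Hypothetical measurements for two candidate designs.
direct_calls = [2.0, 2.0, 2.0, 2.0]
layered_calls = [2.0, 4.0, 2.0, 4.0]  # extra indirection added for portability

print(within_budget(direct_calls, max_mean_ms=3.5, max_jitter_ms=0.5))
print(within_budget(layered_calls, max_mean_ms=3.5, max_jitter_ms=0.5))
```

In this made-up data, the layered design meets the mean-latency budget but not the jitter budget, which is the kind of non-binary result an engineering team would then weigh against the portability and extensibility it gains.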

Addressing the Skills-and-Pay Gap

Another challenge faced by government acquisition programs when expanding software sustainment organizations to include software engineering responsibilities is assembling a team that possesses the required set of skills and experience. Given the ever-increasing demand for a science, technology, engineering, and mathematics (STEM)-related workforce, government agencies must compete with industry salaries and benefits to attract engineers with appropriate training and experience.

In Current Status of the U.S. Engineering and Computing Workforce 2019, the American Society for Engineering Education (ASEE) reported that the median annual salary was $94,600 for all engineers, $55,920 for all engineering technicians and technologists, and $90,270 for all computational professionals. The ASEE also reported that over the next decade, the number of jobs is predicted to increase by 5 percent within engineering occupations, by 2 percent within engineering-technology occupations, and by 13 percent within computing occupations. For comparison, the total number of jobs in the U.S. workforce is predicted to increase by 5.2 percent. Salaries for software professionals are under constant pressure from demand, and no change to this dynamic is expected.

These skills-and-pay-gap issues are further exacerbated within the federal government, which has a hard time competing with the generous paychecks and benefits that many tech companies offer to students, whom they often recruit even before they graduate. Another persistent and vexing challenge is the significant pay gap between private- and public-sector personnel. This trend has been well documented.

The market pressures exerted from industry create situations in which government teams cannot develop and support extremely long-lived, critical software capabilities without relying heavily on heroics from their workforce. Although heroics sometimes work for limited engagements (e.g., responding to natural disasters or wars), they do not scale over time, and burnout and exodus are often the result.

Government has traditionally paid less for talent but offered other employment perks that have attracted people (e.g., defined benefit pension plans, premium health coverage, and a perceived better work-life balance). These benefits are generally being scaled back, however. Moreover, the private sector has generally been quicker to adapt to the priorities of changing workforce demographics, including flexible work schedules and the opportunity for remote employment. COVID-19 has forced some of these changes into government spaces, and these policies, if made permanent, could enhance the attractiveness of civil-service careers for skilled software professionals.

The government offers development tracks to earn advanced degrees, but access to those programs is limited. Moreover, not all software professionals desire these tracks. In particular, software engineers are often more motivated to learn from real-world experiences and thus may gravitate to projects that offer new challenges, such as the opportunity to learn a new programming language or similar hot new technologies. Creating targeted learning opportunities (such as software architecture seminars, software-pattern study groups, or team-/department-wide "hack-a-thons") allows software engineers to hone their skills without committing to a formal degree program. These types of opportunities also enhance team building and enable collaboration outside the conventional workday.

The Incentive for New Tests and Code Coverage

Software sustainment teams and software engineering teams often face different guidelines related to testing and code coverage. Software engineers are held responsible for a percentage of code coverage that is often tied to acquisition requirements, but that same responsibility does not always apply to the software sustainment team. As a result, the acceptance of a new feature under a software engineering policy may differ from the acceptance by a software sustainment team, where new testing may not be incentivized. For example, when a software sustainment team addresses a problem report, they may create a regression test for the exact failing case with a very specific input edge-case value, such as zero, but may not add additional potential edge-case values (e.g., -1) at the same time.
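To make the edge-case example concrete, the following minimal Python sketch contrasts a regression test pinned to the single reported input with one that also covers neighboring edge-case values. The function `safe_ratio`, the problem report it stands in for, and the chosen fix are all hypothetical and are not drawn from any real program.

```python
import unittest


def safe_ratio(numerator, denominator):
    """Hypothetical function under sustainment.

    The imagined problem report said it crashed when denominator was zero.
    The assumed fix returns 0.0 for any non-positive denominator.
    """
    if denominator <= 0:
        return 0.0
    return numerator / denominator


class NarrowRegressionTest(unittest.TestCase):
    """Covers only the exact failing input from the problem report."""

    def test_zero_denominator(self):
        self.assertEqual(safe_ratio(10, 0), 0.0)


class BroaderRegressionTest(unittest.TestCase):
    """Also covers neighboring edge-case values such as -1."""

    def test_edge_case_denominators(self):
        for denominator in (0, -1, -0.5):
            with self.subTest(denominator=denominator):
                self.assertEqual(safe_ratio(10, denominator), 0.0)

    def test_normal_denominator_unchanged(self):
        self.assertEqual(safe_ratio(10, 4), 2.5)
```

Saved to a file, both test classes can be run with `python -m unittest`. The narrow test would keep passing even if a later change broke the handling of negative denominators, whereas the broader test would catch that regression.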

In our experience, software sustainment centers are often not incentivized to add new tests even though the sustainment teams continue to make changes to the code base over time. This misaligned incentive may result in various challenges, some of which are the following:

When a problem has been fixed or a new feature has been added, the software sustainment team may not develop a test for it. Software sustainment organizations are often not able to conduct tests that are representative of the actual deployment environment. As a result, the testing that is conducted may not represent a full regression series.
In a full system integration and capability test, commits that may have broken other parts of the software are often not discovered until much later. The sustainment team therefore may not have a clear picture of what to address if something is broken.
More fundamentally for software sustainment teams, there is no test coverage for, or there is no access to, decisions made by the software engineering team that originally designed and constructed the system.
Software sustainment teams often do not add new tests even if changes are made to the system. In particular, testing or the creation of new tests may not be viewed as their responsibility. Over time, therefore, testing the result of changes to a system is no longer a priority for software sustainment teams.

Insufficiently tested software creates integration and operational problems long after the software is written and deployed, affecting cost, schedule, and mission readiness. Adjusting misaligned incentives is a complex topic involving economic, sociological, and psychological aspects of human and group behaviors. Acquisition-program leadership clearly has the ability to affect the cultural values of any organization. Developing and promoting leadership that incentivizes the balancing of quality-attribute validation and schedules is essential.

For example, leaders who organize Scrum teams dedicated to test generation demonstrate a commitment to software test coverage. Placing that team and its work items on equal footing with development--as well as giving that test team an equal voice in the release process--advances the importance of the cause with all stakeholders in a software-development or sustainment organization.

Software Is Never Done

The following excerpt from a 2020 report from the National Academies of Sciences, Engineering, and Medicine, Air Force Software Sustainment and Maintenance of Weapons Systems, highlights the need for a software engineering mindset within the DoD, whether a team is transitioning to software engineering or continuing to sustain software.

DoD maintenance policies and maintenance related Congressional statutes have traditionally been optimized for hardware and are difficult to change due to long-standing policies, practices, inertia, and incentives. The goal of hardware maintenance is to repair and restore form, fit, and function. This mindset does not align well with the ever-evolving nature of software. The scope of software engineering for sustainment mitigates defects and vulnerabilities, fact-of-life interface changes, and add new enhancements. Software is never done and any time it is "touched," it triggers the software engineering development life cycle, which produces a new configuration.

Our upcoming posts on software sustainment will examine each of these issues in depth, including transitioning from software sustainment to software engineering, problems observed, and recommendations for transitioning teams.

SEI Blog