Infrastructure as Code: Moving Beyond DevOps and Agile
Citing the need to provide a technical advantage to the warfighter, the Department of Defense (DoD) has recently made the adoption of cloud computing technologies a priority. Infrastructure as code (IaC), the process and technology of managing and provisioning computers and networks (physical and/or virtual) through scripts, is a key enabler for efficient migration of legacy systems to the cloud. This blog post details research aimed at helping software sustainment organizations automatically recover a deployment model from a running system and create the needed IaC artifacts from that model, with minimal manual intervention and no specialized knowledge about the design of the deployed system.
Infrastructure as Code
In a virtualized environment such as a cloud or modern data center, the hardware is simulated in software. Infrastructure as code (IaC), which is usually associated with Agile and DevOps, allows us to duplicate an exact environment that
- provides automated deployment to the integration environment and repeatability through immutable infrastructure
- enables exploration and experimentation for design evolution or defect fixing by providing environment versioning and rollback
- ensures parity of test and integration environments across locations and organizations
Adopting IaC is not easy, however, and the goal of this project is to reduce the barriers to adoption. Leading commercial software development organizations apply DevOps practices from the start and do not consider retrofitting these practices into a legacy context where there is little knowledge about the system design.
In contrast, DoD sustainment organizations, including software engineering centers (SECs) and software maintenance groups (SMGs), are focused on maintaining legacy systems whose design they often know little about. SECs and SMGs want to realize the benefits of IaC, but to do so they must first recover the technical baseline for the system deployment.
Recovering the technical baseline for system deployment is no easy task. For legacy systems, there will be no IaC artifacts. In other cases, the government has no data rights to the contractor's IaC artifacts, and the only authoritative artifact is an instance of the deployed system.
Another challenge to widespread DoD adoption of IaC practices is that deployment scripts are manually coded and require specialized skills and knowledge of infrastructure design, deployment tools, and internal design of the system.
Finally, IaC tools are open source and rapidly evolving; infrastructure code needs to evolve to stay current even if there is no ongoing active sustainment.
Breaking down these barriers represents a specific research challenge:
Can the deployment structure be automatically recovered from an instance of the running system?
Our work on this project focuses on developing technologies to automatically recover the deployment baseline (not the entire technical baseline) and create the needed IaC artifacts with minimal manual intervention and no specialized knowledge about the design of the deployed system.
As the figure above demonstrates, our approach focuses on automatically recovering a deployment model from an instance of a system that is deployed and running, and generating IaC scripts from that model. Specifically, our work is in three areas:
- Analyzer. Analysis is a multi-step process that includes scraping information about the files on the system, compiling a list of the files, and then analyzing and categorizing each file. This categorization includes the following:
  - Is the software part of the operating system?
  - Which other packages are installed?
  - Which software is the application?
  - Where are the configuration files?
For example, as software packages are installed on a system, an inventory or list of packages is created. During our analysis, we consult the package repository to identify all of the files associated with each package installation. We are developing heuristics and ad hoc rules to find other associated files.
- Deployment model. We are developing an approach to modeling the deployment of the system and representing information about nodes and their contents. We plan to make our model compatible with an existing open standard for infrastructure modeling, OASIS TOSCA (Topology and Orchestration Specification for Cloud Applications).
- Generator. Finally, the system model is used to generate the IaC deployment scripts. There are tradeoffs between what information goes into the model and what information is used to configure the generator.
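To make the analyzer and deployment model steps above more concrete, here is a minimal Python sketch of categorizing scraped files and recording them per node. It is an illustration under stated assumptions, not the project's implementation: the `Node` class, the category names, the `package_files` and `os_packages` inputs, and the configuration-file rule are all hypothetical stand-ins for the package repository lookups and heuristics described in the text.

```python
from dataclasses import dataclass, field

# Hypothetical category names mirroring the questions the analyzer asks.
OS, PACKAGE, APPLICATION, CONFIG = "os", "package", "application", "config"


@dataclass
class Node:
    """One machine in the deployment model and the files found on it."""
    name: str
    contents: dict = field(default_factory=dict)  # category -> list of paths


def categorize(path, package_files, os_packages):
    """Assign one scraped file path to a category.

    package_files maps a package name to the set of paths that package
    installs (assumed to come from the package repository); os_packages
    names the packages that make up the base operating system.
    """
    for package, files in package_files.items():
        if path in files:
            return OS if package in os_packages else PACKAGE
    # Ad hoc rule standing in for the heuristics mentioned in the text.
    if path.startswith("/etc/") or path.endswith((".conf", ".ini", ".yaml")):
        return CONFIG
    # Anything left over is assumed to belong to the application.
    return APPLICATION


def build_node(name, scraped_paths, package_files, os_packages):
    """Build the model entry for one node from its scraped file list."""
    node = Node(name)
    for path in sorted(scraped_paths):
        category = categorize(path, package_files, os_packages)
        node.contents.setdefault(category, []).append(path)
    return node


# Usage with made-up data for a single web server.
package_files = {
    "coreutils": {"/bin/ls", "/bin/cp"},
    "nginx": {"/usr/sbin/nginx", "/etc/nginx/nginx.conf"},
}
scraped = ["/bin/ls", "/usr/sbin/nginx", "/etc/nginx/nginx.conf",
           "/opt/app/server.py", "/opt/app/settings.ini"]
node = build_node("web01", scraped, package_files, {"coreutils"})
print(node.contents)
```

In this sketch, package ownership is checked before the configuration-file rule, so a file such as `/etc/nginx/nginx.conf` is attributed to its package. A real analyzer would likely record both facts, since a packaged configuration file may have been modified after installation.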
Having a model of the system deployment also enables other use cases. For example, the deployment model can be created during development for a greenfield project, and we could use the model to support architecture analysis and system assurance.
Automating Our Approach with DevOps
Working with SEI colleagues Doug Reynolds and Hasan Yasar, we focused on automating this approach with DevOps tools, such as Chef, Puppet, and Ansible, to make it more adoptable by software engineers not well versed in the IaC domain.
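As a rough illustration of what generating a deployment script from a model can look like, the Python sketch below renders a small model as Ansible-style playbook text. The source does not say which tool's format the prototype emits, so Ansible is chosen here only as an example. The model layout (`nodes`, `packages`, `config_files`) and the `generate_playbook` function are hypothetical; `ansible.builtin.package` and `ansible.builtin.copy` are existing Ansible modules.

```python
def generate_playbook(model, become=True):
    """Render a deployment model (a plain dict) as Ansible-style playbook text.

    The model layout used here is a hypothetical example. `become` is
    generator configuration rather than model content: it describes how to
    deploy, not what is deployed.
    """
    lines = ["---"]
    for node in model["nodes"]:
        lines.append(f"- hosts: {node['name']}")
        lines.append(f"  become: {'true' if become else 'false'}")
        tasks = []
        for package in node.get("packages", []):
            tasks += [
                f"    - name: Install {package}",
                "      ansible.builtin.package:",
                f"        name: {package}",
                "        state: present",
            ]
        for item in node.get("config_files", []):
            tasks += [
                f"    - name: Copy {item['dest']}",
                "      ansible.builtin.copy:",
                f"        src: {item['src']}",
                f"        dest: {item['dest']}",
            ]
        # A node with nothing to install still needs a valid, empty task list.
        lines.append("  tasks:" if tasks else "  tasks: []")
        lines.extend(tasks)
    return "\n".join(lines) + "\n"


# Usage with a made-up, single-node model.
model = {
    "nodes": [
        {
            "name": "web01",
            "packages": ["nginx"],
            "config_files": [
                {"src": "files/nginx.conf", "dest": "/etc/nginx/nginx.conf"}
            ],
        }
    ]
}
print(generate_playbook(model))
```

The sketch writes values into the YAML unquoted, which is adequate for simple names and paths but would need proper escaping in practice. Keeping `become` as a generator parameter is one small instance of the tradeoff between what goes into the model and what is used to configure the generator.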
Validating Our Approach
Having completed the implementation of the prototype tools, we are now focused on evaluating our approach to assess its feasibility. To assess how well the system works in practice, we are asking test subjects to use the tool against an open-source system. As proxies for DoD maintenance organization staff, these test subjects will be software developers with no system-specific or IaC experience. We will observe and measure our test subjects as they beta test our approach on systems of increasing complexity.
Wrapping Up and Looking Ahead
Since our work on this project began, the DoD has placed even greater emphasis on moving to cloud infrastructures, a transition for which IaC is essential. IaC can provide benefits outside of Agile and DevOps, including portability across infrastructure-as-a-service (IaaS) providers, automatic generation of assurance evidence, and even cyber resiliency support through moving target defense by transforming and re-deploying the system.
In the future, using this model-based approach, we plan to examine whether we can refactor the deployment model and then generate new artifacts for a different system. For example, is it possible to use this approach to generate an architecturally equivalent system with a different attack surface, which might enhance cybersecurity? Considering approaches such as moving target defense, can we generate a family of equivalent systems and modify the attack surface seen by an adversary?
These are new research areas that are made possible by a model-based approach to infrastructure as code.
Read the SEI Blog Post, Why You Should Apply Agile-DevOps Earlier in the Lifecycle.