SEI Insights

DevOps Blog

Technical Guidelines and Practical Advice for DevOps

Change is hard. When we help teams adopt DevOps processes or broader Agile methodologies, we often encounter initial resistance. When people learn a new tool or process, productivity and enthusiasm consistently dip, a phenomenon known as the "implementation dip." Rather than being feared, however, this dip should be embraced. In his book Leading in a Culture of Change, Michael Fullan defines the implementation dip as "a dip in performance and confidence as one encounters an innovation that requires new skills and new understandings."

A shift to DevOps is a shift to constantly changing and improving tools and processes. Without deliberate steps, we risk thrusting our teams into a constant cycle of implementation dips. In this blog post, I present three strategies for limiting the depth and duration of the implementation dip in software development organizations adopting DevOps.

In the ever-changing world of DevOps, where microservices and distributed architectures are becoming the norm, the need to understand an application's internal state is growing rapidly. Whitebox monitoring gives you details about the internal state of your application, such as the total number of HTTP requests on your web server or the number of errors logged. In contrast, blackbox monitoring (e.g., Nagios) lets you check a system from the outside (e.g., checking disk space or pinging a host) to see whether a host or service is alive, but it does not help you understand how the system arrived at its current state. Prometheus is an open source whitebox monitoring solution that uses a time-series database to provide scraping, querying, graphing, and alerting based on time-series data. This blog post briefly explores the benefits of using Prometheus as a whitebox monitoring tool.
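To make the distinction concrete, here is a minimal whitebox-instrumentation sketch using the prometheus_client Python library; the metric name, label, and port are illustrative choices on my part, not anything prescribed by Prometheus.

```python
# Minimal whitebox-instrumentation sketch using the prometheus_client
# library (pip install prometheus-client); the metric name, label, and
# port are illustrative.
import random
import time

from prometheus_client import Counter, start_http_server

# Counter tracking total HTTP requests, labeled by response status.
HTTP_REQUESTS = Counter(
    "http_requests_total",
    "Total HTTP requests handled by this service",
    ["status"],
)

def handle_request() -> None:
    # Stand-in for real request handling; record each outcome.
    status = "500" if random.random() < 0.05 else "200"
    HTTP_REQUESTS.labels(status=status).inc()

if __name__ == "__main__":
    # Expose metrics at http://localhost:8000/metrics for scraping.
    start_http_server(8000)
    while True:
        handle_request()
        time.sleep(0.1)
```

A Prometheus server could then scrape localhost:8000/metrics and answer queries such as rate(http_requests_total[5m]), or alert when the share of 500 responses climbs.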

It has been nearly a year since the DevOps blog launched as its own platform. Since then, we have offered guidelines, practical advice, and tutorials to the ever-increasing number of organizations adopting DevOps (up 26 percent since 2011). In the first six months of 2016, an increasing number of blog visitors were drawn to posts highlighting successful DevOps implementations at Amazon and Netflix, as well as tutorials on new technologies such as Otto, Fabric, Ansible, and Docker. This post presents, in descending order (with #1 being the most popular), the 10 most popular DevOps posts published in the first six months of 2016.

In this DevOps revolution, we are trying to make everything continuous: continuous integration, continuous deployment, continuous monitoring, and so on. One term you rarely hear, however, is continuous security, because security is often treated as an afterthought when building and implementing a delivery pipeline. The pipeline I will be discussing has six components: plan, code, build, test, release, and operate. There is also a seventh, less formal component: the iterative nature of the delivery pipeline in a DevOps environment. Security can, and should, be implemented throughout the pipeline. In this blog post, I discuss how security can be implemented in each of these components and the benefits that come from doing so.
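As one illustration of building security into the test component, the following sketch (my example, not taken from the post) runs the Bandit static analysis security scanner over the source tree and fails the CI job when findings appear; the source path and severity threshold are assumptions.

```python
# Illustrative CI security gate: run the Bandit static analysis scanner
# (pip install bandit) and fail the build if it reports findings.
import subprocess
import sys

def security_gate(source_dir: str = "src") -> int:
    # -r recurses into the source tree; -ll limits output to findings
    # of medium severity or higher. Bandit exits nonzero when such
    # findings exist, so returning its exit code fails the CI job.
    result = subprocess.run(["bandit", "-r", source_dir, "-ll"])
    return result.returncode

if __name__ == "__main__":
    sys.exit(security_gate())
```

The same pattern extends to the other components: a dependency audit at build time, dynamic scanning at release, and log-based alerting in operations.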

DevOps practices can increase the validity of software tests and decrease the risk of deploying software changes to production environments. Any time a software change is deployed to production, there is a risk that the change will break and lead to a service outage. This risk is minimized through rigorous testing of the software in a separate test environment, where the change can be safely vetted without affecting normal business operations. Problems arise, however, when these isolated test environments do not properly mimic the production environment. A test environment may have different operating system patches, different software dependencies, different firewall rules, or even different data in its database. These differences open the door to risk: even if the software change passes testing in the test environment, it can still fail in production because it was never tested in that exact context. In this blog post, I explore how the DevOps practices of infrastructure as code and automated test execution through continuous integration increase the effectiveness of software testing, allowing us to create test environments that more closely match production.
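One lightweight way to surface such drift, sketched here purely as an illustration rather than the post's actual tooling, is to keep the expected environment definition in version control and assert it as a CI step; the manifest file name and checked fields are assumptions.

```python
# Illustrative parity check: compare the running environment against a
# version-controlled manifest so that drift between test and production
# shows up as a failing CI step.
import json
import platform
import sys

def check_parity(manifest_path: str = "environment.json") -> list:
    """Return a list of mismatches between the manifest and this host."""
    with open(manifest_path) as f:
        expected = json.load(f)
    actual = {
        "os": platform.system(),           # e.g., "Linux"
        "os_release": platform.release(),  # e.g., kernel version
        "python": platform.python_version(),
    }
    return [
        f"{key}: expected {expected[key]!r}, found {actual.get(key)!r}"
        for key in expected
        if expected[key] != actual.get(key)
    ]

if __name__ == "__main__":
    mismatches = check_parity()
    for mismatch in mismatches:
        print(mismatch)
    sys.exit(1 if mismatches else 0)
```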

A few years ago, my team took on the task of designing and writing a new (and fairly large) web application project that required us to work collaboratively on features, deploy to unfamiliar environments, and work with other teams to complete those deployments. Does this sound like DevOps yet? Our task was to make these deployments happen with limited resources; however, we didn't want to sacrifice environment parity or automation, knowing that these would help our team lower risk for the project and maintain a better handle on the security of our process. The idea of using Chef, a leading suite of platform-independent infrastructure management and provisioning tools, came to mind; however, the work required to implement the full Chef ecosystem was out of scope for the project. We realized, though, that we were already using Vagrant to provision our local environments in the same way we wanted to provision our remote environments. That is, our Vagrant-based workflow was applying Infrastructure as Code concepts and provisioning with just a single component of Chef, rather than depending on the larger suite of Chef tools. To solve our remote deployment problem, we devised a solution that maintained environment parity by reusing all of the existing Chef configuration, sharing it with Vagrant, and keeping the deployment for a system of any size to a single, automatable line. In this blog post, I describe the transformation of a vanilla Vagrant setup into a stable, testable Infrastructure as Code solution.
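As a rough sketch of what "a single, automatable line" can look like (my illustration, not the post's actual tooling), a thin wrapper can drive a Vagrant-plus-Chef provisioning run for a named environment; the DEPLOY_ENV variable and environment names are hypothetical.

```python
#!/usr/bin/env python3
# Illustrative one-line deployment wrapper around a Vagrant workflow.
import os
import subprocess
import sys

def deploy(environment: str) -> int:
    # "vagrant up --provision" boots the machine if needed and re-runs
    # the provisioner (assumed here to be chef_solo, configured in the
    # project's Vagrantfile, which reads DEPLOY_ENV to pick cookbooks).
    env = dict(os.environ, DEPLOY_ENV=environment)
    return subprocess.call(["vagrant", "up", "--provision"], env=env)

if __name__ == "__main__":
    sys.exit(deploy(sys.argv[1] if len(sys.argv) > 1 else "staging"))
```

Invoked as `python deploy.py production`, the same command provisions any environment the Vagrantfile knows about.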

I am often asked how to help DevOps organizations improve their software and system security by integrating security testing into their new and expanding continuous integration (CI) environments. The first thing I say is, "It is great that you are treating security testing as being as important as your other software tests." Security testing is often overlooked or done manually at the end of a software release cycle, if it is done at all. When I ask, "What type of security testing do you currently do?" I often hear excuses about a lack of time, funding, or planning as the reason no security testing is performed at all. DevOps organizations typically do not have a security testing plan in place, have not given it much thought, and hope that CI alone can make their software more secure. As this blog post details, CI certainly can help improve your application security if you are automating security testing, but you must first have a security plan in place.
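For instance, once a security plan names dependency auditing as a required test, automating it in CI is straightforward; the sketch below uses the pip-audit scanner as one example, with the requirements path as an assumption.

```python
# Illustrative automated security test for a CI pipeline: audit pinned
# Python dependencies for known vulnerabilities with pip-audit
# (pip install pip-audit).
import subprocess
import sys

def audit_dependencies(requirements: str = "requirements.txt") -> int:
    # pip-audit exits nonzero when a known-vulnerable dependency is
    # found, which fails the CI job and blocks the release.
    return subprocess.run(["pip-audit", "-r", requirements]).returncode

if __name__ == "__main__":
    sys.exit(audit_dependencies())
```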

Traditionally, DevOps practitioners think of business value as simply the difference between money earned and money spent. In that line of thinking, security is often relegated to a secondary goal because it does not directly drive revenue. The misguided goal becomes delivering functionality at all costs, even if it compromises the integrity of the system or its data. As Rob Joyce, head of the National Security Agency's Tailored Access Operations group, noted in his talk at the USENIX Enigma conference: "Don't assume a crack is too small to be noticed, or too small to be exploited... We need that first crack, that first seam. And we're going to look and look and look for that esoteric kind of edge case to break open and crack in." In this blog post, I present two concepts, malicious user stories and rejection criteria, that can be used in DevOps to secure systems.
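To illustrate the idea with a sketch of my own (the story, criterion, and class names are hypothetical), a malicious user story such as "As an attacker, I want to brute-force a login" might carry the rejection criterion "the account locks after five failed attempts," which can then be encoded as an automated test in the pipeline:

```python
# Illustrative rejection criterion encoded as a test: after five failed
# logins, the account must lock. AccountService is a toy stand-in for
# the system under test.
import unittest

class AccountService:
    """Toy authentication service used only for this sketch."""
    MAX_FAILURES = 5
    PASSWORD = "correct-horse-battery-staple"

    def __init__(self):
        self.failures = 0
        self.locked = False

    def login(self, password: str) -> bool:
        if self.locked:
            return False
        if password != self.PASSWORD:
            self.failures += 1
            self.locked = self.failures >= self.MAX_FAILURES
            return False
        return True

class RejectionCriteriaTest(unittest.TestCase):
    def test_account_locks_after_five_failed_logins(self):
        service = AccountService()
        for _ in range(AccountService.MAX_FAILURES):
            self.assertFalse(service.login("wrong-password"))
        # Once locked, even the correct password must be rejected.
        self.assertFalse(service.login(AccountService.PASSWORD))
        self.assertTrue(service.locked)

if __name__ == "__main__":
    unittest.main()
```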