
Fabric, Ansible, Amazon AWS, and Netflix: The Top 10 DevOps Posts of 2017 (So Far)

Headshot of Hasan Yasar.

In the first six months of 2017, an increasing number of blog visitors were drawn to posts highlighting topics such as secure DevOps and successful DevOps implementations at Amazon and Netflix, as well as tutorials on using DevOps technologies such as Fabric or Ansible. This post presents the 10 most popular DevOps posts in the first six months of 2017.

10. An Introduction to Secure DevOps: Including Security in the Software Lifecycle

The term "software security" often evokes negative feelings among software developers because it is associated with additional programming effort, uncertainty, and roadblocks to fast development and release cycles. To secure software, developers must follow numerous guidelines that, while intended to satisfy some regulation or other, can be very restrictive and hard to understand. As a result, a lot of fear, uncertainty, and doubt can surround software security. This blog post, the first in a series, is based on a keynote that Hasan Yasar recently delivered at the International Conference on Availability, Reliability, and Security (ARES) 2016. Hasan describes how to incorporate security into a DevOps platform while maintaining the quality, reliability, and pace of application deployment.

Here is an excerpt:

In the past, software security focused on the nature and origin of attacks, as well as measures for preventing attacks. However, most attacks, especially sophisticated attacks, can't be anticipated, which means that fixes are bolted on as new attacks are discovered.

Read the complete post.

9. Addressing the Detrimental Effects of Context Switching with DevOps

In a computing system, a context switch occurs when an operating system stores the state of an application thread before stopping the thread and restoring the state of a different (previously stopped) thread so its execution can resume. The overhead of storing and restoring state during a context switch negatively impacts operating system and application performance. In this blog post, Todd Waits describes how DevOps ameliorates the negative impacts that "context switching" between projects can have on a software engineering team's performance.

Here is an excerpt:

In the book Quality Software Management: Systems Thinking, Gerald Weinberg discusses how the concept of context switching applies to an engineering team. From a human workforce perspective, context switching is the process of stopping work in one project and picking it back up after performing a different task on a different project. Just like computing systems, human team members often incur overhead when context switching between multiple projects.

Context switching most commonly occurs when team members are assigned to multiple projects. The rationale behind the practice of context switching is that it is logistically simpler to allocate team members across projects than trying to have dedicated resources on each project. It seems reasonable to assume that splitting a person's effort between two projects yields 50 percent effort on each project. Moreover, if a team member is dedicated to a single project, that team member will be idle if that project is waiting for something to occur, such as completing paperwork, reviews, etc.
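The 50 percent assumption above can be checked with a little arithmetic. The following is only a rough sketch: the function name and the switch_overhead parameter are hypothetical, and the overhead value used in the usage lines is an illustrative assumption, not a figure taken from the post.

```python
def effective_share(projects: int, switch_overhead: float) -> float:
    """Fraction of one person's time that actually reaches each project.

    switch_overhead is a hypothetical fraction of total time lost for
    every project beyond the first (an assumption for illustration).
    """
    if projects < 1:
        raise ValueError("projects must be at least 1")
    if not 0.0 <= switch_overhead <= 1.0:
        raise ValueError("switch_overhead must be between 0 and 1")
    # Time lost to switching grows with each additional project.
    lost = min(1.0, switch_overhead * (projects - 1))
    # Whatever time remains is split evenly across the projects.
    return (1.0 - lost) / projects


# With no overhead, two projects each receive the naive 50 percent.
print(effective_share(2, 0.0))
# With an assumed 20 percent overhead, each project receives less.
print(effective_share(2, 0.2))
```

Under these assumptions, any nonzero overhead means each project receives less than the even split that the allocation plan expects.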

Read the complete post.

8. Adding Security to Your DevOps Pipeline

Gauntlt is a security testing tool that incorporates existing tools that organizations already use for security testing, e.g., nmap. This tool is built on the Cucumber framework and provides an easy-to-read language that gives users the ability to write their own attacks. Gauntlt also provides example attacks for the most common attacks in its GitHub repo. A Gauntlt attack file consists primarily of two sections: the background and the scenarios. The background section ensures that the packages needed for the attacks are installed and that developers can connect to the host that is being tested for security. The scenario section is where the attacks, or tests, are actually performed. Operational staff can have an unlimited number of scenarios, but it is best practice to keep the number small so that each attack file does not get too broad.

I have prepared a short walkthrough that shows how to set up to run Gauntlt tests for your web application. I will be using the Gauntlt starter kit to set up the environment quickly. You will need Vagrant, VirtualBox, and Git installed on your local machine for this demo.

Read the complete post.

7. Whitebox Monitoring with Prometheus

Here is an excerpt:

Many of the organizations we work with use Nagios as their only monitoring tool for systems and applications. While it works well for its intended use, it does not provide complete details. Recently, in our test environment, Nagios alerted me that the disk space was 90 percent full. The disk space check worked as expected, but did not indicate why the disk was filling up. More importantly, it didn't indicate how much time I had until the system was no longer usable. Whitebox monitoring would allow me to see the rate at which the disk was filling, and this critical information could result in a good night's sleep for the system administrator if the trends imply the disk would be usable for the next 12 hours. The problem causing my application's disk to fill up was excessive logging caused by a lengthy database query. This discovery led to more discoveries, and I eventually narrowed the problem to two services we had running on separate machines.
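The idea of estimating how much time remains before a disk fills can be sketched in a few lines. This is a minimal illustration, assuming disk usage is sampled as (hours, percent used) pairs and grows roughly linearly; the function name and the sample readings are hypothetical, and this is not how Prometheus or Nagios implement anything.

```python
from typing import Optional, Sequence, Tuple


def hours_until_full(
    samples: Sequence[Tuple[float, float]], capacity: float = 100.0
) -> Optional[float]:
    """Estimate hours from the last sample until usage reaches capacity.

    samples holds (hours, percent_used) readings. A least-squares line is
    fitted to them; None is returned when usage is not growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    # Slope is the fill rate in percent per hour.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    # Time at which the fitted line crosses the capacity threshold.
    t_full = (capacity - intercept) / slope
    return max(0.0, t_full - samples[-1][0])


# Hypothetical readings: usage climbs 2 percent per hour and is now at 90.
readings = [(0, 84.0), (1, 86.0), (2, 88.0), (3, 90.0)]
print(hours_until_full(readings))  # 5.0
```

A static threshold check would report only that the disk is 90 percent full; the fitted rate is what indicates whether the administrator has hours or days to respond.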

Read the complete post.

6. DevOps Case Study: Amazon AWS

Here is an excerpt:

Amazon is one of the most prolific tech companies today. Amazon transformed itself in 2006 from an online retailer to a tech giant and pioneer in the cloud space with the release of Amazon Web Services (AWS), a widely used on-demand Infrastructure as a Service (IaaS) offering. Amazon accepted a lot of risk with AWS. By developing one of the first massive public cloud services, they accepted that many of the challenges would be unknown, and many of the solutions unproven. To learn from Amazon's success we need to ask the right questions. What steps did Amazon take to minimize this inherently risky venture? How did Amazon engineers define their process to ensure quality?

Luckily, some insight into these questions was made available when Google engineer Steve Yegge (a former Amazon engineer) accidentally made public an internal memo outlining his impression of Google's failings (and Amazon's successes) at platform engineering. This memo (which Yegge has specifically allowed to remain online) outlines a specific decision that illustrates CEO Jeff Bezos's understanding of the underlying tenets of what we now call DevOps, as well as his dedication to what I will claim are the primary quality attributes of the AWS platform: interoperability, availability, reliability, and security.

Read the complete post.

5. A DevOps a Day Keeps the Auditors Away (and Helps Organizations Stay in Compliance with Federal Regulations such as Sarbanes-Oxley)

Here is an excerpt:

Read the complete post.

4. Monitoring in the DevOps Pipeline

Here is an excerpt:

Before getting into the state of monitoring in DevOps, I want to take a minute to discuss tooling. Because there are so many products trying to promote monitoring, choosing among them can be distracting. It is best to first understand what holds business value to you and your customers. You should also recognize that not everything that can be monitored should be monitored. Discussing any particular tool isn't the aim of this post, but it is worth noting that most vendors offer free trials and many other products are simply free, so it might be worth your time to sample a few after you have determined your monitoring strategy.

Infrastructure and service monitoring have been around long before DevOps, so how does DevOps really affect monitoring strategy, and is DevOps even needed for monitoring? Strangely, yes, in a way.

Read the complete post.

3. DevOps Case Study: Netflix and the Chaos Monkey

Here is an excerpt:

Netflix is a fantastic case study for DevOps because their software-engineering process shows a fundamental understanding of DevOps thinking and a focus on quality attributes through automation-assisted process. Recall, DevOps practitioners espouse a driven focus on quality attributes to meet business needs, leveraging automated processes to achieve consistency and efficiency.

Netflix's streaming service is a large distributed system hosted on Amazon Web Services (AWS). Since there are so many components that have to work together to provide reliable video streams to customers across a wide range of devices, Netflix engineers needed to focus heavily on the quality attributes of reliability and robustness for both server- and client-side components. In short, they concluded that the only way to be comfortable handling failure is to constantly practice failing. To achieve the desired level of confidence and quality, in true DevOps style, Netflix engineers set about automating failure.
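The notion of automating failure can be illustrated with a toy simulation. The sketch below is not Netflix's Chaos Monkey; the function name, the instance names, and the health check are all hypothetical, and nothing here touches real infrastructure.

```python
import random
from typing import Callable, List


def chaos_round(
    instances: List[str],
    is_healthy: Callable[[List[str]], bool],
    rng: random.Random,
) -> str:
    """Remove one randomly chosen instance, then verify the service.

    The instances list is modified in place. Returns the name of the
    removed instance, or raises RuntimeError if the health check fails.
    """
    if not instances:
        raise ValueError("no instances left to terminate")
    victim = rng.choice(instances)
    instances.remove(victim)
    if not is_healthy(instances):
        raise RuntimeError(f"service unhealthy after losing {victim}")
    return victim


# Hypothetical pool: the service is assumed to need at least two instances.
pool = ["web-1", "web-2", "web-3"]
killed = chaos_round(pool, lambda live: len(live) >= 2, random.Random())
print(f"terminated {killed}; remaining instances: {pool}")
```

Running such a round routinely, rather than waiting for a real outage, is what turns failure handling into something the team practices continuously.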

Read the complete post.

2. DevOps Technologies: Fabric or Ansible

In the blog post DevOps Technologies: Fabric or Ansible, CERT researcher Tim Palko highlights use cases associated with the DevOps deployment process, including evaluating resource requirements, designing a production system, provisioning and configuring production servers, and pushing code, to name a few.

Here is an excerpt:

One difference between Fabric and Ansible is that while Fabric will get you results in minutes, Ansible requires a bit more effort to understand. Ansible is generally much more powerful since it provides much deeper and more complex semantics for modeling multi-tier infrastructure, such as those with arrays of web and database hosts. From an operator's perspective, Fabric has a more literal and basic API and uses Python for authoring, while Ansible consumes YAML and provides a richness in its behavior (which I discuss later in this post). We'll walk through examples of both in this posting.

Here is an excerpt:

A cornerstone of DevOps is continuous integration (CI), a technique designed and named by Grady Booch that continually merges source code updates from all developers on a team into a shared mainline. This continual merging prevents a developer's local copy of a software project from drifting too far afield as new code is added by others, avoiding catastrophic merge conflicts. In practice, CI involves a centralized server that continually pulls in all new source code changes as developers commit them and builds the software application from scratch, notifying the team of any failures in the process. If a failure is seen, the development team is expected to refocus and fix the build before making any additional code changes. While this may seem disruptive, in practice it focuses the development team on a singular stability metric: a working automated build of the software.
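The build-and-notify cycle described above can be sketched as a small function. This is a simplified illustration under stated assumptions: the names run_build and notify are hypothetical, the build steps are placeholder commands, and a real CI server would also watch the repository for new commits.

```python
import subprocess
import sys
from typing import Callable, List, Sequence


def run_build(
    steps: Sequence[List[str]], notify: Callable[[str], None]
) -> bool:
    """Run each build step in order, stopping at the first failure.

    Each step is a command given as a list of arguments. On failure the
    notify callback receives a message and the function returns False.
    """
    for step in steps:
        result = subprocess.run(step, capture_output=True, text=True)
        if result.returncode != 0:
            # A broken build halts the pipeline and alerts the team.
            notify(f"build failed at step: {' '.join(step)}")
            return False
    return True


# Placeholder steps standing in for compile and test commands.
steps = [
    [sys.executable, "-c", "print('compiling')"],
    [sys.executable, "-c", "print('running tests')"],
]
if run_build(steps, notify=print):
    print("build succeeded")
```

Stopping at the first failed step mirrors the expectation that the team fixes the build before making any further changes.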

Recall that a fundamental component of a DevOps approach is that to remove disconnects in understanding and influence, organizations must embed and fully engage one or more appropriate experts within the development team to enforce a domain-centric perspective. To remove the disconnect between development and sustainment, DevOps practitioners include IT operations professionals in the development team from the beginning as full team members. Likewise, to ensure software quality, QA professionals must be team members throughout the project lifecycle. In other words, DevOps takes the principles of Agile and expands their scope, recognizing that ensuring high quality development requires continual engagement and feedback from a variety of technical experts, including QA and operations specialists.

Wrapping Up and Looking Ahead

In the remaining months of 2017, the DevOps blog will continue to publish guidelines and practical advice to organizations seeking to adopt DevOps in practice with upcoming posts on the following topics:

  • Security analysis of deployment scripts through the DevOps pipeline
  • How to choose the right container to deploy your application
  • Secure DevOps assessment in Highly Regulated Environments (HRE)

We welcome your feedback on the DevOps blog as well as suggestions for future content. Please leave feedback in the comments section below.

Additional Resources

To view the webinar DevOps Panel Discussion featuring Kevin Fall, Hasan Yasar, and Joseph D. Yankel, please click here.

To view the webinar Culture Shock: Unlocking DevOps with Collaboration and Communication with Aaron Volkmann and Todd Waits, please click here.

To view the webinar What DevOps is Not! with Hasan Yasar and C. Aaron Cois, please click here.

To listen to the podcast DevOps--Transform Development and Operations for Fast, Secure Deployments featuring Gene Kim and Julia Allen, please click here.
