Applying DevOps Principles in Incident Response
By Todd Waits
DevOps principles focus on helping teams and organizations deliver business value as quickly and consistently as possible. While the principles advocate for improving the coordination between development and operational teams, they can be adapted for any number of domains. The key components of DevOps we want to emulate across other domains are:
- collaboration between project team roles
- infrastructure as code
- automation of tasks, processes, and workflows
- monitoring of applications and infrastructure
In this blog post, I explore how to apply DevOps to the incident response domain. In the same way that advances in methodologies surrounding software development were gleaned from Toyota's manufacturing processes, we can apply lessons learned from DevOps across domains.
Incident response is the method by which organizations identify and mitigate risks created as the result of a security breach or attack. It involves various teams: analysts, security, operations, legal, business, IT, and even developers. To protect the organization and sensitive data, the primary tasks during an incident are analysis, communication, and response. At the onset of responding to the incident, analysts are trying to collect data in a safe, non-destructive way that maintains the data's authenticity and integrity. Once the data is secured, analysts are able to assess and interpret it. Any findings or conclusions must be communicated in a timely manner to the appropriate parties. The team works to identify the best response or mitigation. In most cases, the transitions between each of these tasks require direct, manual action.
Responding to security incidents, especially in a government or enterprise environment, is a process rich with opportunity for improvement. Response teams are frequently bogged down with writing narrative reports about the analysis performed. Individuals on incident response teams can carry significant caseloads that divide their efforts across multiple projects. Context switching can eat up significant cycles as analysts alternate between cases or task domains (from using tools to writing documents). Relying on human-driven notifications rather than ChatOps principles introduces room for error and missed connections. The very problem the teams are trying to address is constantly changing and rarely well-defined.
The DevOps components mentioned above can greatly increase the ability of incident responders to deliver "business value" to organizations. Defining business value in relation to a security team seems like an obvious step, yet many organizations continue to treat security as an afterthought or a black-box unit. To implement DevOps across the organization, business value must be defined in a measurable sense for each team. The manual processes at play in much of the work can be identified and automated. If automation is not an immediate possibility, restructuring the workflow or process can help get to the point where it can be automated. No team ever works in isolation, so incident response teams should be encouraged to work with their counterparts. Providing tools and using automation to increase collaboration will lower the time to fix and "shift left" the knowledge needed to respond to an incident.
Business Value in Incident Response
Keeping team members focused on delivering business value is key. Organizations therefore need to define what business value means for them. For law enforcement, business value could be defined as identifying and prosecuting perpetrators. For industry incident response, business value could be delivering a secure, functional, and usable product to customers.
In some circumstances, incident response may not involve criminal prosecution; it may be as simple as fixing bugs or patching security vulnerabilities to protect the organization's data and customers. While different teams may address different aspects of an organization's mission, it is important to remember that having a unified goal is what prevents a lot of "not my job" conversations.
A key objective of DevOps is breaking down organizational silos and increasing collaboration between teams. Once business value is defined in the context of an organization's mission, it clarifies the types of deliverables that are expected and, most importantly, the types of deliverables that should not be produced. Security and incident response teams can then focus on mitigating risk to the customer to an acceptable level, or on quantifying that risk, as a means of delivering business value in line with the organization's mission.
Identifying Areas for Automation
Automation increases team productivity by taking over the repetitive tasks where human error is most often introduced. It also ensures that a process is repeated precisely and enforces requirements on that process: if requisite information is missing, the automated process can fail and ask for human input. Automation provides resource-strapped teams with tools to address thousands of cases or incidents in a given year. Because the tools analysts use in their day-to-day work produce consistent outputs, scripts can be written to dynamically generate narrative reports from that work.
An automated process ensures consistent information and presentation and reduces the need for case-by-case interpretation. If errors are discovered in the results, the automated process can be fixed once and every report generated afterward benefits from the fix. By leveraging the tools and environments in which analysts already work, we can keep people working where they are most useful while offloading the more time-consuming and productivity-draining work to automated processes.
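The report-generation idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular tool: the field names in REQUIRED_FIELDS, the MissingInformationError class, and the render_report function are all hypothetical, and a real analysis tool would have its own output format.

```python
# Hypothetical sketch: build a narrative report from structured tool output
# and fail loudly when requisite information is missing.

# Assumed field names; a real tool's output would define its own.
REQUIRED_FIELDS = ("case_id", "analyst", "tool", "summary")


class MissingInformationError(ValueError):
    """Raised when a finding lacks information the report requires."""


def render_report(finding: dict) -> str:
    """Return a plain-text report, or raise if required fields are absent."""
    # Treat empty values the same as absent ones so a human is asked to fill them in.
    missing = [name for name in REQUIRED_FIELDS if not finding.get(name)]
    if missing:
        raise MissingInformationError(
            "cannot generate report; missing: " + ", ".join(missing)
        )

    lines = [
        f"Case {finding['case_id']}",
        f"Analyst: {finding['analyst']}",
        f"Tool: {finding['tool']}",
        "",
        finding["summary"],
    ]
    # Indicators are optional; each one becomes a list item in the report.
    for indicator in finding.get("indicators", []):
        lines.append(f"- {indicator}")
    return "\n".join(lines)
```

Because every report passes through the same function, a correction to render_report applies to each report generated from then on, and a finding with missing fields stops the process instead of producing an incomplete document.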
Collaboration and Communication
As report-generation tasking is automated, notification and delivery of the artifacts generated by the new processes can also be enhanced. Detected security concerns can trigger notifications to the proper teams, such as operations, development, business units, or legal. Automation can generate a ticket in the enterprise issue tracker to patch infrastructure or to alert development teams to the code changes needed for a specific vulnerability. Legal teams can receive updates on ongoing investigations that directly concern them. Teams can automate notifications to deliver the information each party needs to perform its job function and ultimately deliver business value.
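One way to picture this kind of automated notification is a small routing table that maps a category of finding to the teams that need to hear about it. The sketch below is only an illustration: the category names, the ROUTES table, and the route_findings function are assumptions made for the example, and delivery to a chat channel or issue tracker is left out because it depends on the systems an organization actually uses.

```python
# Hypothetical sketch: decide which teams should be notified about each finding.
from collections import defaultdict

# Assumed categories and team names, chosen for illustration only.
ROUTES = {
    "unpatched_host": ["operations"],
    "vulnerable_code": ["development"],
    "data_exposure": ["legal", "business"],
}
# Findings with an unrecognized category fall back to the security team.
DEFAULT_TEAMS = ["security"]


def route_findings(findings: list) -> dict:
    """Group notification messages by the team that should receive them."""
    outbox = defaultdict(list)
    for finding in findings:
        teams = ROUTES.get(finding["category"], DEFAULT_TEAMS)
        message = (
            f"[{finding['case_id']}] {finding['category']}: {finding['summary']}"
        )
        for team in teams:
            outbox[team].append(message)
    return dict(outbox)
```

Keeping the routing rules in data rather than in people's heads means a new team or category can be added in one place, and each party receives only the information relevant to its job function.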
The ability to analyze the requirements of various departments and align objectives along common business goals enables organizations to deliver the right information to everyone as early in the process as possible, that is, to shift left. In the same way that infrastructure as code establishes environment parity throughout the software development and delivery process, standardized documentation and reporting help incident response teams reach the right people much faster and with more clarity. For incident response teams, building automated documentation and notifications into their tasks allows analysts to keep working rather than context switching between analysis and documentation. Instead of waiting for the full narrative, organizations can begin responding to actionable intelligence as the analysis tasks are completed and committed. Shifting incident response left allows the risk mitigations and lessons learned by the response team to be incorporated as an integral part of the development process rather than as an afterthought.
Wrapping Up and Looking Ahead
Adapting and applying DevOps principles will allow incident responders to better support the organization. Automation gives organizations a better understanding of how to use the skill sets that incident response teams provide. Applying these principles empowers the organization to approach incident response in a fashion similar to its approach to tracking other metrics: it can quickly identify outliers and focus effort where it will yield a higher return on investment.
Every two weeks, the SEI will publish a new blog post that offers technical guidelines and practical advice for DevOps in practice. We welcome your feedback on this series, as well as suggestions for future content. Please leave feedback in the comments section below.
To view the webinar DevOps Security: Ignore It As Much As You Would Ignore Regular Security by Chris Taschner and Tim Palko, please click here.
To view the webinar Culture Shock: Unlocking DevOps with Collaboration and Communication with Aaron Volkmann and Todd Waits, please click here.
To view the webinar What DevOps is Not! with Hasan Yasar and C. Aaron Cois, please click here.
To listen to the podcast DevOps--Transform Development and Operations for Fast, Secure Deployments featuring Gene Kim and Julia Allen, please click here.
To read all of the blog posts in our DevOps series, please click here.