Deep Learning, Agile-DevOps, and Cloud Security: The Top 10 Blog Posts of 2018
Every January on the SEI Blog, we present the 10 most-visited posts of the previous year. This year's top 10, which features posts published between January 1, 2018, and December 31, 2018, brought an ever-increasing number of visitors to the blog.
10. Why You Should Apply Agile-DevOps Earlier in the Lifecycle
9. Best Practices and Considerations in Egress Filtering
8. Deep Learning: Going Deeper toward Meaningful Patterns in Complex Data
7. Why Does Software Cost So Much?
6. Revealing True Emotions through Micro-Expressions: A Machine Learning Approach
5. Translating Between Statistics and Machine Learning
4. Best Practices for Cloud Security
3. Security Begins at the Home Router
2. 10 Types of Application Security Testing Tools: When and How to Use Them
1. 12 Risks, Threats, and Vulnerabilities in Moving to the Cloud
10. Why You Should Apply Agile-DevOps Earlier in the Lifecycle
by Hasan Yasar and Suzanne Miller
DevOps is a set of development practices that emphasizes collaboration, communication, and automation throughout the application lifecycle. In DevOps, all stakeholders--including IT operations staff, testers, developers, customers, and security personnel--are embedded from the inception of the project to its end. This blog post describes SEI research and customer engagements aimed at applying DevOps practices that are typically used at the end of the lifecycle to automate governance at the beginning of the development timeline.
DevOps works best in organizations that attempt to follow the Agile tenets and principles found in the Agile Manifesto. Such organizations value iterative, incremental progress and strong collaboration with users and customers. Development cycles in Agile organizations are based on short (1- to 3-week) iterations and small teams that focus on delivering working software to engaged users who can provide ongoing feedback as the solution evolves. This process is compatible with DevOps, which involves the crossing of organizational boundaries and collaborating with stakeholders outside of the development organization.
In its role as a DoD federally funded research and development center, the SEI focuses on how to improve and streamline acquisition practices for software-reliant systems; specifically, we are researching how Agile and DevOps principles, which were originally developed for industry, can be applied to speed introduction of new technology in government settings.
9. Best Practices and Considerations in Egress Filtering
by Rachel Kartch
When considering best practices in egress filtering, it is important to remember that egress filtering is not focused on protecting your network, but rather on protecting other organizations' networks. For example, the May 2017 WannaCry ransomware attack is believed to have exploited an exposed vulnerability in the server message block (SMB) protocol and was rapidly spread via communications over port 445. Egress and ingress filtering of port 445 would have helped limit the spread of WannaCry. In this post--a companion piece to Best Practices for Network Border Protection, which highlighted best practices for filtering inbound traffic--I explore best practices and considerations for egress filtering.
Egress filtering can be a time-consuming practice with few immediate incentives; had it been in place, however, this practice might have mitigated damage in a number of malicious incidents. The 2016 Dyn attack involved infected devices sending large amounts of traffic over port 53 (DNS) to target Dyn's infrastructure. If your organization runs its own DNS servers for use by internal clients, there should be no reason to allow all internal devices to send outbound traffic over port 53. This simple practice might limit the impact of similar attacks.
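As an illustrative sketch (my own, not from the original post), the policy described above--allow outbound DNS only from the organization's own resolvers--can be expressed as a simple rule check. The IP addresses here are hypothetical:

```python
# Toy egress-policy check: permit outbound DNS (port 53) only from
# designated internal resolvers. Addresses are illustrative assumptions.
INTERNAL_DNS_SERVERS = {"10.0.0.53", "10.0.1.53"}

def allow_egress(src_ip: str, dst_port: int) -> bool:
    """Return True if an outbound packet should be permitted."""
    if dst_port == 53:  # DNS traffic
        return src_ip in INTERNAL_DNS_SERVERS
    return True  # other traffic: defer to other rules

# A workstation sending DNS directly to the Internet is blocked,
# while the internal resolver is allowed through.
print(allow_egress("10.0.5.17", 53))   # False
print(allow_egress("10.0.0.53", 53))   # True
```

In a real deployment this logic would live in border firewall rules rather than application code, but the decision being made is the same.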
8. Deep Learning: Going Deeper toward Meaningful Patterns in Complex Data
by Carson Sestili
In a previous blog post, we addressed how machine learning is becoming ever more useful in cybersecurity and introduced some basic terms, techniques, and workflows that are essential for those who work in machine learning. Although traditional machine learning methods are already successful for many problems, their success often depends on choosing and extracting the right features from a dataset, which can be hard for complex data. For instance, what kinds of features might be useful, or possible to extract, in all the photographs on Google Images, all the tweets on Twitter, all the sounds of a spoken language, or all the positions in the board game Go? This post introduces deep learning, a popular and quickly-growing subfield of machine learning that has had great success on problems about these datasets, and on many other problems where picking the right features for the job is hard or impossible.
Complexity out of simplicity
A mathematical description of a dataset, with certain properties that can be adjusted to make the description more accurately represent the data, is called a model. All machine learning techniques have the goal of finding the best model, where the meaning of "best" can vary by context (easiest to understand, most accurate representation, or least likely to make costly mistakes, to name a few). Observable characteristics in the data, which can be given as input to a model, are called features, and a model's ability to perform well always depends on finding features that represent the data well.
Deep learning refers to a family of machine learning techniques whose models extract important features by iteratively transforming the data, "going deeper" toward meaningful patterns in the dataset with each transformation. Unlike traditional machine learning methods, in which the creator of the model has to choose and encode features ahead of time, deep learning enables a model to automatically learn features that matter. In this way, a deep learning model learns a representation of the dataset, making deep learning part of the larger field of representation learning.
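To make "iteratively transforming the data" concrete, here is a minimal NumPy sketch (my own illustration, not from the original post) of a forward pass through a small network: each layer applies a linear map followed by a nonlinearity, so every layer re-represents its input, ending in a compact learned representation.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One transformation: linear map followed by a ReLU nonlinearity."""
    return np.maximum(0.0, x @ w + b)

# Toy input: 4 examples with 8 raw features each.
x = rng.normal(size=(4, 8))

# Three stacked layers, each "going deeper" toward a new representation.
shapes = [(8, 16), (16, 16), (16, 3)]
for n_in, n_out in shapes:
    w = rng.normal(scale=0.5, size=(n_in, n_out))
    b = np.zeros(n_out)
    x = layer(x, w, b)

# The final representation describes each example with 3 derived features.
print(x.shape)  # (4, 3)
```

In actual deep learning the weights `w` and biases `b` are not random; they are adjusted during training so that the final representation makes the task (classification, prediction, and so on) easy.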
7. Why Does Software Cost So Much?
by Robert Stoddard
Cost estimation was cited by the Government Accountability Office (GAO) as one of the top two reasons that DoD programs continue to have cost overruns. How can we better estimate and manage the cost of systems that are increasingly software intensive? To contain costs, it is essential to understand the factors that drive costs and which ones can be controlled. Although we understand the relationships between certain factors, we do not yet separate the causal influences from non-causal statistical correlations. In this blog post, we explore how the use of an approach known as causal learning can help the DoD identify factors that actually cause software costs to soar and therefore provide more reliable guidance as to how to intervene to better control costs.
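A toy example of why correlation alone can mislead cost analysis (my own illustration, not taken from the SEI study): a hidden confounder, such as program size, can make two observed factors look related even though neither drives the other.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Hidden confounder: program size (arbitrary standardized units).
size = rng.normal(size=n)

# Two observed quantities, each driven by size but not by each other.
team_count = 2.0 * size + rng.normal(scale=0.5, size=n)
defect_rate = 1.5 * size + rng.normal(scale=0.5, size=n)

# They are strongly correlated...
r = np.corrcoef(team_count, defect_rate)[0, 1]
print(round(r, 2))  # high, around 0.9

# ...but conditioning on the confounder removes the association:
# regress out size, then correlate the residuals.
res_team = team_count - 2.0 * size
res_defect = defect_rate - 1.5 * size
r_partial = np.corrcoef(res_team, res_defect)[0, 1]
print(round(r_partial, 2))  # near 0
```

Causal learning techniques aim to distinguish exactly these cases: intervening on `team_count` here would not change `defect_rate`, even though the raw correlation suggests otherwise.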
As the role of software in the DoD continues to increase so does the need to control the cost of software development and sustainment. Consider the following trends cited in a March 2017 report from the Institute for Defense Analysis:
- The National Research Council (2010) wrote that "The extent of the DoD code in service has been increasing by more than an order of magnitude every decade, and a similar growth pattern has been exhibited within individual, long-lived military systems."
- The Aerospace Vehicle Systems Institute (2017) states that source lines of code (SLOC) in aircraft (both military and commercial) have been doubling approximately every four years. This trend corresponds to an annual growth rate of close to 18 percent.
- The Army (2011) estimated that the volume of code under Army depot maintenance (either post-deployment or post-production support) had increased from 5 million to 240 million SLOC between 1980 and 2009. This trend corresponds to approximately 15 percent annual growth.
- A December 2017 Selected Acquisition Report (SAR) showed cost growth in large-scale DoD programs is common, with a $91 billion cost growth to-date (engineering and estimating) in the DoD portfolio. Poor cost estimation, including early lifecycle estimates, represents almost $8 billion of the $91 billion.
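The growth rates quoted in the bullets above can be checked with simple compound-growth arithmetic; this sketch merely verifies the figures and is not from the report:

```python
# Doubling every 4 years implies an annual growth rate of 2**(1/4) - 1.
aircraft_rate = 2 ** (1 / 4) - 1
print(f"{aircraft_rate:.1%}")  # about 18.9%, consistent with "close to 18 percent"

# Army depot code: 5 million -> 240 million SLOC over 1980-2009 (29 years).
army_rate = (240 / 5) ** (1 / 29) - 1
print(f"{army_rate:.1%}")  # about 14.3%, consistent with "approximately 15 percent"
```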
The SEI has a long track record of cost-related research to help the DoD manage costs. In 2012, we introduced Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE) as a method for improving pre-Milestone-A software cost estimates through research designed to improve judgment regarding uncertainty in key assumptions (which we term "program change drivers"), the relationships among the program change drivers, and their impact on cost.
6. Revealing True Emotions through Micro-Expressions: A Machine Learning Approach
by Satya Venneti
Micro-expressions--involuntary, fleeting facial movements that reveal true emotions--hold valuable information for scenarios ranging from security interviews and interrogations to media analysis. They occur on various regions of the face, last only a fraction of a second, and are universal across cultures. In contrast to macro-expressions like big smiles and frowns, micro-expressions are extremely subtle and nearly impossible to suppress or fake. Because micro-expressions can reveal emotions people may be trying to hide, recognizing micro-expressions can aid DoD forensics and intelligence mission capabilities by providing clues to predict and intercept dangerous situations. This blog post, the latest highlighting research from the SEI Emerging Technology Center in machine emotional intelligence, describes our work on developing a prototype software tool to recognize micro-expressions in near real-time.
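As a rough illustration of the timing property that separates micro- from macro-expressions (my own sketch, not the ETC prototype), one could flag expression events whose duration falls under roughly half a second in a per-frame intensity signal. The frame rate and thresholds below are illustrative assumptions:

```python
# Toy detector: given per-frame expression-intensity scores from a video,
# flag above-threshold events shorter than ~0.5 s as candidate micro-expressions.
FPS = 30
MAX_MICRO_FRAMES = int(0.5 * FPS)   # micro-expressions last < ~0.5 s
THRESHOLD = 0.5                     # intensity above which a face is "expressive"

def find_micro_expressions(intensity):
    """Return (start, end) frame spans of short above-threshold runs."""
    spans, start = [], None
    for i, v in enumerate(intensity):
        if v >= THRESHOLD and start is None:
            start = i
        elif v < THRESHOLD and start is not None:
            if i - start <= MAX_MICRO_FRAMES:
                spans.append((start, i))
            start = None
    if start is not None and len(intensity) - start <= MAX_MICRO_FRAMES:
        spans.append((start, len(intensity)))
    return spans

# A 6-frame flash of expression (0.2 s) is flagged; a 60-frame smile is not.
signal = [0.0] * 10 + [0.9] * 6 + [0.0] * 10 + [0.9] * 60 + [0.0] * 10
print(find_micro_expressions(signal))  # [(10, 16)]
```

A real system would first extract per-region facial features from video frames before any duration-based filtering, which is the hard part the SEI prototype addresses.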
Macro-Expressions vs. Micro-Expressions: Tools like Affectiva detect macro-expressions (left). Micro-expressions (right) are subtle and difficult even for humans to recognize.
5. Translating Between Statistics and Machine Learning
by Zachary T. Kurtz
Statistics and machine learning often use different terminology for similar concepts. I recently confronted this when I began reading about maximum causal entropy as part of a project on inverse reinforcement learning. Many of the terms were unfamiliar to me, but as I read closer, I realized that the concepts had close relationships with statistics concepts. This blog post presents a table of connections between terms that are standard in statistics and their related counterparts in machine learning.
Understanding a result in machine learning can help to avoid reinventing the wheel in statistics and vice versa. My ability to understand inverse reinforcement learning benefited from my training in statistics because I was able to translate machine learning terminology into statistical terminology. Translation takes effort, however, and my research would have proceeded even more smoothly if translation were not required. This experience motivated me to compile a table of common statistics and machine learning terms and connections between them.
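A few commonly cited term pairs of the kind such a table contains (these examples are my own and are not necessarily the pairs in Kurtz's table):

```python
# Commonly noted statistics <-> machine learning terminology pairs.
STATS_TO_ML = {
    "covariate / independent variable": "feature / input",
    "response / dependent variable": "label / target",
    "estimation / fitting": "learning / training",
    "coefficient / parameter": "weight",
    "regression or classification model": "supervised learner",
}

for stats_term, ml_term in STATS_TO_ML.items():
    print(f"{stats_term:40s} <-> {ml_term}")
```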
4. Best Practices for Cloud Security
by Tim Morrow and Donald Faatz
As detailed in last week's post, SEI researchers recently identified a collection of vulnerabilities and risks faced by organizations moving data and applications to the cloud. In this blog post, we outline best practices that organizations should use to address the vulnerabilities and risks in moving applications and data to cloud services.
These practices are geared toward small and medium-sized organizations; however, organizations of any size can use them to improve the security of their cloud usage. It is important to note that this list of best practices is not exhaustive: it should be supplemented with practices provided by cloud service providers, general cybersecurity best practices, regulatory compliance requirements, and practices defined by cloud trade associations, such as the Cloud Security Alliance.
As we stressed in our previous post, organizations must perform due diligence before moving data or applications to the cloud. Cloud service providers (CSPs) use a shared responsibility model for security. The CSP accepts responsibility for some aspects of security. Other aspects of security are shared between the CSP and the consumer or remain the sole responsibility of the consumer.
This post details four important practices, along with specific actions under each, that organizations can use to feel secure in the cloud.
3. Security Begins at the Home Router
by Vijay Sarvepalli
In recent days, the VPNFilter malware has attracted attention, much of it in the wake of a May 25 public service announcement from the FBI, as well as a number of announcements from vendors and security companies. In this blog post, I examine the VPNFilter malware attack by analyzing the vulnerabilities at play, how they were exploited, and the impact on the Internet. I also outline recommendations for the next generation of small Internet of Things (IoT) device manufacturers, including home routers, which were the target of VPNFilter malware. Because this post also emphasizes the prioritization of vulnerabilities that have significant or large-scale impact, I will recap recommendations made in the March 2017 blog post on the Mirai botnet.
A Tale of Vulnerabilities
Cisco's threat blog post on VPNFilter provides details on the devices impacted by this vulnerability, which affected "at least 500K networking devices worldwide." VPNFilter is in some ways like Mirai in that it targets IoT devices, specifically home routers. It is also now known that the Mirai botnet was powered by four zero-day exploits in addition to its typical password-guessing method of compromising small IoT devices. Palo Alto Networks recently discovered a newer version of the Mirai botnet that targets home routers.
Among the reported router vendors impacted by VPNFilter, Linksys and Netgear caught my attention because, according to Statista, these popular home routers make up 77 percent of the home router market. By exploring typical vulnerabilities in these two highly popular devices, I was able to define some broad recommendations that will reduce the risk from such unmanaged and unnoticed devices.
2. 10 Types of Application Security Testing Tools: When and How to Use Them
by Tom Scanlon
Bugs and weaknesses in software are common: 84 percent of software breaches exploit vulnerabilities at the application layer. The prevalence of software-related problems is a key motivation for using application security testing (AST) tools. With a growing number of application security testing tools available, it can be confusing for information technology (IT) leaders, developers, and engineers to know which tools address which issues. This blog post, the first in a series on application security testing tools, will help to navigate the sea of offerings by categorizing the different types of AST tools available and providing guidance on how and when to use each class of tool.
See the second post in this series, Decision-Making Factors for Selecting Application Security Testing Tools.
Application security is not a simple binary choice, whereby you either have security or you don't. Application security is more of a sliding scale where providing additional security layers helps reduce the risk of an incident, hopefully to an acceptable level of risk for the organization. Thus, application security testing reduces risk in applications, but cannot completely eliminate it. Steps can be taken, however, to remove those risks that are easiest to remove and to harden the software in use.
1. 12 Risks, Threats, and Vulnerabilities in Moving to the Cloud
by Donald Faatz and Tim Morrow
Organizations continue to develop new applications in, or migrate existing applications to, cloud-based services. The federal government recently made cloud adoption a central tenet of its IT modernization strategy. An organization that adopts cloud technologies and/or chooses cloud service providers (CSPs) and services or applications without becoming fully informed of the risks involved exposes itself to a myriad of commercial, financial, technical, legal, and compliance risks. In this blog post, we outline 12 risks, threats, and vulnerabilities that organizations face when moving applications or data to the cloud. In our follow-up post, Best Practices for Cloud Security, we explore a series of best practices aimed at helping organizations securely move data and applications to the cloud.
We would like to note that the threats and vulnerabilities involved in migrating to the cloud are ever-evolving, and the ones listed here are by no means exhaustive. Organizations should also consider challenges and risks specific to their own missions, systems, and data.
Looking Ahead in 2019
In the coming months, look for posts highlighting our work in deep learning, threat modeling, and secure coding. A new post is published on the SEI Blog every Monday morning.
Download the latest publications from SEI researchers at our digital library.