Subject: Artificial Intelligence and Machine Learning

Detecting Mismatches in Machine-Learning Systems

• SEI Blog
Grace Lewis

The use of machine learning (ML) could improve many business functions and meet many needs for organizations. For example, ML capabilities can be used to suggest products to users based on purchase history; provide image recognition for video surveillance; identify spam email messages; and predict courses of action, routes, or diseases, among others. However, in most organizations today (with the exception of large high-tech companies, such as Google and Microsoft), development of ML capabilities is...

Three Risks in Building Machine Learning Systems

• SEI Blog
Benjamin Cohen

Machine learning (ML) systems promise disruptive capabilities in multiple industries. Building ML systems can be complicated and challenging, however, especially since best practices in the nascent field of AI engineering are still coalescing. Consequently, a surprising fraction of ML projects fail or underwhelm. Behind the hype, there are three essential risks to analyze when building an ML system: 1) poor problem-solution alignment, 2) excessive time or monetary cost, and 3) unexpected behavior once deployed....

Automatically Detecting Technical Debt Discussions with Machine Learning

• SEI Blog
Robert Nord

Technical debt (TD) refers to choices made during software development that achieve short-term goals at the expense of long-term quality. Since developers use issue trackers to coordinate task priorities, issue trackers are a natural focal point for discussing TD. In addition, software developers use preset issue types, such as feature, bug, and vulnerability, to differentiate the nature of the task at hand. We have recently started seeing developers explicitly use the phrase "technical debt" or...

Using Machine Learning to Detect Design Patterns

• SEI Blog
Robert Nord

This post was co-written by Zachary Kurtz. Software increasingly serves core DoD functions, such as ship and plane navigation, supply logistics, and real-time situational awareness. The complexity of software, however, makes it hard to evaluate software quality. The ability to evaluate software is critical both for software developers and for DoD program managers who are responsible for software acquisitions. The quality of software can make or break a program budget. Quality attributes such as reliability,...

Summarizing and Searching Video with Machine Learning

• SEI Blog
Edwin Morris

The U.S. relies on surveillance video to determine when activities of interest occur in a location that is under surveillance. Yet, because automated tools are not available to help analysts monitor real-time video or analyze archived video, analysts must dedicate full attention to video data streams to avoid missing important information about ongoing activities and patterns of life. In tactical settings, warfighters miss critical information that would improve situational awareness because dedicating full attention to...

Artificial Intelligence in Practice: Securing Your Code Using Natural Language Processing

• SEI Blog
Eliezer Kanal

Many techniques are available to help developers find bugs in their code, but none are perfect: an adversary needs only one to cause problems. In this post, I'll discuss how a branch of artificial intelligence called natural language processing, or NLP, is being applied to computer code and cybersecurity. NLP is how machines extract information from naturally occurring language, such as written prose or transcribed speech. Using NLP, we can gain insight into the code...

The Latest Work from the SEI: AI, Deepfakes, Automated Alert Handling, and Cyber Intelligence

• SEI Blog
Douglas C. Schmidt

As part of an ongoing effort to keep you informed about our latest work, this blog post summarizes some recently published SEI reports, podcasts, and presentations highlighting our work in artificial intelligence, STEM careers, deepfakes, automated alert handling, systems and software engineering, and cyber intelligence. These publications highlight the latest work of SEI technologists in these areas. The SEI also made available an online version of the 2018 Year in Review, which...

The Promise of Deep Learning on Graphs

• SEI Blog
Oren Wright

A growing number of Department of Defense (DoD) data problems are graph problems: the data from sources such as sensor feeds, web traffic, and supply chains are full of irregular relationships that require graphs to represent explicitly and mathematically. For example, modern test and evaluation produces massive, heterogeneous datasets, and analysts can use graphs to reveal otherwise hidden patterns in these data, affording the DoD a far more complete understanding of a system's effectiveness, survivability,...

The Vectors of Code: On Machine Learning for Software

• SEI Blog
Zachary Kurtz

This blog post provides a light technical introduction to machine learning (ML) for problems of computer code, such as detecting malicious executables or vulnerabilities in source code. Code vectors enable ML practitioners to tackle code problems that were previously approachable only with highly specialized software engineering knowledge. Conversely, code vectors can help software analysts leverage general, off-the-shelf ML tools without needing to become ML experts. In this post, I introduce some use cases for ML...

Deep Learning and Satellite Imagery: DIUx Xview Challenge

• SEI Blog
Ritwik Gupta

In 2017 and 2018, the United States witnessed a milestone year of climate and weather-related disasters from droughts and wildfires to cyclones and hurricanes. Increasingly, satellites are playing an important role in helping emergency responders assess the damage of a weather event and find victims in its aftermath. Most recently, satellites have tracked from space the devastation wrought by the California wildfires. The United States military, which is often the first on the scene of...

Improving Assessments for Cybersecurity Training

• SEI Blog
April Galyardt

The CERT Cyber Workforce Development Directorate conducts training in cyber operations for the DoD and other government customers as part of its commitment to strengthen the nation's cybersecurity workforce. A part of this work is to develop capabilities that better enable DoD cyber forces to "train as you fight," such as setting up high-fidelity simulation environments for cyber forces to practice skills including network defense, incident response, and digital forensics. However, cybersecurity is a...

Translating Between Statistics and Machine Learning

• SEI Blog
Zachary Kurtz

Statistics and machine learning often use different terminology for similar concepts. I recently confronted this when I began reading about maximum causal entropy as part of a project on inverse reinforcement learning. Many of the terms were unfamiliar to me, but as I read closer, I realized that the concepts had close relationships with statistics concepts. This blog post presents a table of connections between terms that are standard in statistics and their related counterparts...

Deep Learning: Going Deeper toward Meaningful Patterns in Complex Data

• SEI Blog
Carson Sestili

In a previous blog post, we addressed how machine learning is becoming ever more useful in cybersecurity and introduced some basic terms, techniques, and workflows that are essential for those who work in machine learning. Although traditional machine learning methods are already successful for many problems, their success often depends on choosing and extracting the right features from a dataset, which can be hard for complex data. For instance, what kinds of features might be...

Video Summarization: Using Machine Learning to Process Video from Unmanned Aircraft Systems

• SEI Blog
Kevin Pitstick

As the use of unmanned aircraft systems (UASs) increases, the volume of potentially useful video data that UASs capture on their missions is straining the resources of the U.S. military that are needed to process and use this data. This publicly released video is an example of footage captured by a UAS in Iraq. The video shows ISIS fighters herding civilians into a building. U.S. forces did not fire on the building because of the...

Revealing True Emotions Through Micro-Expressions: A Machine Learning Approach

• SEI Blog
Satya Venneti

Micro-expressions--involuntary, fleeting facial movements that reveal true emotions--hold valuable information for scenarios ranging from security interviews and interrogations to media analysis. They occur on various regions of the face, last only a fraction of a second, and are universal across cultures. In contrast to macro-expressions like big smiles and frowns, micro-expressions are extremely subtle and nearly impossible to suppress or fake. Because micro-expressions can reveal emotions people may be trying to hide, recognizing micro-expressions can...

Machine Learning in Cybersecurity

• SEI Blog
Eliezer Kanal

The year 2016 witnessed advancements in artificial intelligence in self-driving cars, language translation, and big data. That same time period, however, also witnessed the rise of ransomware, botnets, and attack vectors as popular forms of malware attack, with cybercriminals continually expanding their methods of attack (e.g., attaching scripts to phishing emails and randomization), according to Malwarebytes' State of Malware report. To complement the skills and capacities of human analysts, organizations are turning to machine...

What Ant Colonies Can Teach Us About Securing the Internet

• SEI Blog
William Casey

In cyber systems, the identities of devices can easily be spoofed and are frequent targets of cyber-attacks. Once an identity is fabricated, stolen, or spoofed, it may be used as a nexus to systems, thus forming a Sybil Attack. To address these and other problems associated with identity deception, researchers at the Carnegie Mellon University Software Engineering Institute, New York University's Tandon School of Engineering and Courant Institute of Mathematical Sciences, and the University of...

Using Machine Learning to Detect Malware Similarity

• SEI Blog
Sagar Chaki

Malware, which is short for "malicious software," consists of programming aimed at disrupting or denying operation, gathering private information without consent, gaining unauthorized access to system resources, and other inappropriate behavior. Malware infestation is of increasing concern to government and commercial organizations. For example, according to the Global Threat Report from Cisco Security Intelligence Operations, there were 287,298 "unique malware encounters" in June 2011, double the number of incidents that occurred in March. To help...
