Out of Distribution Detection: Knowing When AI Doesn’t Know
Imagine a military surveillance system trained to identify specific vehicles in desert environments. One day, this system is deployed in a snowy mountain region and begins misidentifying civilian vehicles as military targets. Or consider an artificial intelligence (AI) medical diagnosis system for battlefield injuries that encounters a novel type of wound it was never trained on, but it confidently—and incorrectly—recommends a standard treatment protocol.
These scenarios highlight a critical challenge in artificial intelligence: how do we know when an AI system is operating outside its intended knowledge boundaries? This is the critical domain of out-of-distribution (OoD) detection—identifying when an AI system is facing situations it wasn't trained to handle. Through our work here in the SEI’s AI Division, particularly in collaborating with the Office of the Under Secretary of Defense for Research and Engineering (OUSD R&E) to establish the Center for Calibrated Trust Measurement and Evaluation (CaTE), we’ve seen firsthand the critical challenges facing AI deployment in defense applications.
The two scenarios detailed above aren’t hypothetical—they represent the kind of challenges we encounter regularly in our work helping the Department of Defense (DoD) ensure AI systems are safe, reliable, and trustworthy before being fielded in critical situations. As this post details, this is why we’re focusing on OoD detection: the crucial capability that allows AI systems to recognize when they’re operating outside their knowledge boundaries.
Why Out-of-Distribution Detection Matters
For defense applications, where decisions can have life-or-death consequences, knowing when an AI system might be unreliable is just as important as its accuracy when it is working correctly. Consider these scenarios:
- autonomous systems that need to recognize when environmental conditions have changed significantly from their training data
- intelligence analysis tools that should flag unusual patterns, not force-fit them into known categories
- cyber defense systems that must identify novel attacks, not just those seen previously
- logistics optimization algorithms that should detect when supply chain conditions have fundamentally changed
In each case, failing to detect OoD inputs could lead to silent failures with major consequences. As the DoD continues to incorporate AI into mission-critical systems, OoD detection becomes a cornerstone of building trustworthy AI.
What Does Out-of-Distribution Really Mean?
Before diving into solutions, let's clarify what we mean by out-of-distribution. Distribution refers to the statistical distribution of the data the model was trained on. However, it's not always clear what makes a new input fall outside that distribution.
In the simplest case, we might say new input data is OoD if it would have zero probability of appearing in our training data. But this definition rarely works in practice because most commonly used statistical distributions, such as the normal distribution, technically allow for any value, however unlikely. In other words, they have infinite support.
Out-of-distribution typically means one of two things:
- The new input comes from a fundamentally different distribution than the training data. Here, fundamentally different means there is a way of measuring the two distributions as not being the same. In practice, though, a more useful definition is that a model trained on one distribution performs unexpectedly, typically worse, on data from the other.
- The probability of seeing this input in the training distribution is extremely low.
For example, a facial recognition system trained on images of adults might consider a child's face to be from a different distribution entirely. Or an anomaly detection system might flag a tank moving at 200 mph as having an extremely low probability in its known distribution of vehicle speeds.
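To make the low-probability definition concrete, here is a minimal sketch that fits a Gaussian to hypothetical vehicle-speed data and scores how improbable a new observation is. The numbers, the Gaussian assumption, and the two-sided tail score are illustrative choices, not a prescribed method.

```python
import numpy as np
from scipy import stats

# Hypothetical training data: observed ground-vehicle speeds in mph (illustrative only).
rng = np.random.default_rng(0)
train_speeds = rng.normal(loc=35.0, scale=12.0, size=5000)

# Fit a simple Gaussian to the training distribution.
mu, sigma = train_speeds.mean(), train_speeds.std()

def tail_probability(speed_mph: float) -> float:
    """Two-sided tail probability of seeing a speed at least this far from the mean."""
    z = abs(speed_mph - mu) / sigma
    return 2 * stats.norm.sf(z)

print(tail_probability(45.0))   # a plausible speed: moderate probability
print(tail_probability(200.0))  # the "tank at 200 mph": vanishingly small probability
```

In practice, real sensor data is rarely this well behaved, but the same idea—score a new input against a model of the training distribution and flag the improbable—underlies many OoD detection techniques.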
Three Approaches to OoD Detection
Techniques for OoD detection can be broadly categorized in three ways:
1. Data-Only Techniques: Anomaly Detection and Density Estimation
These approaches try to model what normal data looks like without necessarily connecting it to a specific prediction task. This modeling is typically done using methods from one of two sub-domains:
1) Anomaly detection aims to identify data points that deviate significantly from the majority of previously observed ("normal") data. These techniques can be categorized by their data requirements: supervised approaches that use labeled examples of both normal and anomalous data, semi-supervised methods that learn primarily from normal data with perhaps a few anomalies, and unsupervised techniques that must distinguish anomalies without any explicit labels. What counts as deviating significantly is often left up to the assumptions of the technique used.
2) Density estimation involves learning a probability density function of training data that can then be used to assign a probability to any new instance of data. When a new input receives a very low probability, it's flagged as OoD. Density estimation is a classic problem in statistics.
While these approaches are conceptually straightforward and offer several mature techniques for use with low-dimensional, tabular data, they present challenges with the high-dimensional data that can be common in defense applications, such as images or sensor arrays. They also require somewhat arbitrary decisions about thresholds: how “unusual” does something need to be before we call it OoD?
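As a rough illustration of both sub-domains, the sketch below uses scikit-learn's IsolationForest for anomaly detection and KernelDensity for density estimation on synthetic tabular data. The bandwidth, the bottom-1-percent density threshold, and the toy data are arbitrary assumptions for demonstration, not recommended settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 4))                    # "normal" training data, 4 features
X_new = np.vstack([rng.normal(size=(5, 4)),             # in-distribution samples
                   rng.normal(loc=8.0, size=(5, 4))])   # shifted samples, likely OoD

# Anomaly detection: Isolation Forest flags points that are easy to isolate from the rest.
iso = IsolationForest(random_state=0).fit(X_train)
print(iso.predict(X_new))          # +1 = inlier, -1 = anomaly

# Density estimation: kernel density estimate; low log-density suggests OoD.
kde = KernelDensity(bandwidth=0.5).fit(X_train)
log_density = kde.score_samples(X_new)
threshold = np.percentile(kde.score_samples(X_train), 1)  # bottom 1% of training density
print(log_density < threshold)     # True = flagged as OoD
```

Note that the density threshold itself is one of the "somewhat arbitrary decisions" mentioned above: choosing it too loosely misses OoD inputs, while choosing it too tightly floods operators with false alarms.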
2. Building OoD Awareness into Models
An alternative to the data-only approach is to train a supervised model that builds OoD awareness directly into the prediction task. There are two popular techniques:
1) Learning with rejection trains models to output a special "I don't know" or “reject” response when they are uncertain. This is similar to how a human analyst might flag a case for further review rather than make a hasty judgment.
2) Uncertainty-aware models like Bayesian neural networks and ensembles explicitly model their own uncertainty. If the model shows high uncertainty about its parameters for a given input, that input is likely OoD.
While these approaches are theoretically appealing, they often require more complex training procedures and greater computational resources, which can be challenging for deployed systems with size, weight, and power constraints. Such constraints are common in edge environments such as front-line deployments.
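The following is a minimal sketch of one uncertainty-aware approach: a small ensemble of identically structured classifiers whose averaged predictive entropy serves as an uncertainty signal. The scikit-learn models, the toy data, and the entropy score are illustrative assumptions rather than a recommended design, and high entropy is only one possible signal that an input may be OoD.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Toy in-distribution classification data (illustrative only).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# A small ensemble: same architecture, different random initializations.
ensemble = [
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed).fit(X, y)
    for seed in range(5)
]

def predictive_entropy(x: np.ndarray) -> np.ndarray:
    """Entropy of the averaged ensemble prediction; higher values indicate more uncertainty."""
    probs = np.mean([m.predict_proba(x) for m in ensemble], axis=0)
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

x_in = X[:3]          # inputs resembling the training data
x_shifted = X[:3] + 10.0  # inputs shifted far from the training data
print(predictive_entropy(x_in))       # typically low for familiar inputs
print(predictive_entropy(x_shifted))  # if members disagree, entropy rises: a possible OoD flag
```

Training and running several models in parallel is exactly the kind of added computational cost that can be prohibitive under size, weight, and power constraints.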
3. Adding OoD Detection to Existing Models
Rather than having to train a new model from scratch, the third approach takes advantage of models that have already been trained for a specific task and augments them with OoD detection capabilities.
The simplest version involves thresholding the confidence scores that models already output. If a model's confidence falls below a certain threshold, the input is flagged as potentially OoD. More sophisticated techniques might analyze patterns in the model's internal representations.
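A minimal sketch of this confidence thresholding might look like the following. The 0.8 threshold and the example probabilities are arbitrary placeholders and would need to be calibrated for a real system.

```python
import numpy as np

def flag_low_confidence(probs: np.ndarray, threshold: float = 0.8) -> np.ndarray:
    """Flag inputs whose maximum class probability falls below a threshold.

    probs: array of shape (n_inputs, n_classes) from an already-trained classifier.
    Returns a boolean array where True means the input is potentially OoD.
    """
    max_conf = probs.max(axis=1)
    return max_conf < threshold

# Example: three predictions from a hypothetical 4-class model.
probs = np.array([
    [0.95, 0.02, 0.02, 0.01],   # confident prediction, not flagged
    [0.40, 0.30, 0.20, 0.10],   # low confidence, flagged for review
    [0.85, 0.05, 0.05, 0.05],   # above threshold, not flagged
])
print(flag_low_confidence(probs))  # [False  True  False]
```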
These approaches are practical because they work with existing models, but they're somewhat heuristic and may make implicit assumptions that don't hold for all applications.
DoD Applications and Considerations
For defense applications, OoD detection is particularly valuable in several contexts:
- mission-critical autonomy: Autonomous systems operating in contested environments need to recognize when they've encountered conditions they weren't trained for, potentially falling back to more conservative behaviors.
- intelligence processing: Systems analyzing intelligence data need to flag unusual patterns that human analysts should examine, rather than force-fitting them into known categories.
- cyber operations: Network defense systems need to identify novel attacks that don't match patterns of previously seen threats.
- supply chain resilience: Logistics systems need to detect when patterns of demand or supply have fundamentally changed, potentially triggering contingency planning.
For the DoD, several additional considerations come into play:
- resource constraints: OoD detection methods must be efficient enough to run on edge devices with limited computing power.
- limited training data: Many defense applications have limited labeled training data, making it difficult to precisely define the boundaries of the training distribution.
- adversarial threats: Adversaries might deliberately create inputs designed to fool both the main system and its OoD detection mechanisms.
- criticality: Incorrect predictions made by machine learning (ML) models that are presented as confident and correct may have severe consequences in high-stakes missions.
A Layered Approach to Verifying Out-of-Distribution Detection
While OoD detection methods provide a powerful means to assess whether ML model predictions can be unreliable, they come with one important caveat. Any OoD detection technique, either implicitly or explicitly, makes assumptions about what is “normal” data and what is “out-of-distribution" data. These assumptions are often very difficult to verify in real-world applications for all possible changes in deployment environments. It is likely that no OoD detection method will always detect an unreliable prediction.
As such, OoD detection should be considered a last line of defense in a layered approach to assessing the reliability of ML models during deployment. Developers of AI-enabled systems should also perform rigorous test and evaluation, build monitors for known failure modes into their systems, and perform comprehensive analysis of the conditions under which a model is designed to perform versus conditions in which its reliability is unknown.
Looking Forward
As the DoD continues to adopt AI systems for critical missions, OoD detection will be an essential component of ensuring these systems are trustworthy and robust. The field continues to evolve, with promising research directions including
- methods that can adapt to gradually shifting distributions over time
- techniques that require minimal additional computational resources
- approaches that combine multiple detection strategies for greater reliability
- integration with human-AI teaming to ensure appropriate handling of OoD cases
- algorithms based on practically verifiable assumptions about real-world shifts
By understanding when AI systems are operating outside their knowledge boundaries, we can build more trustworthy and effective AI capabilities for defense applications—knowing not just what our systems know, but also what they don't know.
Additional Resources
GitHub - cmu-sei/portend: The Portend Toolset helps generate simulated drift on ML model data and figure out proper metrics to detect it in operation.
Center for Calibrated Trust Measurement and Evaluation (CaTE)—Guidebook for the Development and TEVV of LAWS to Promote Trustworthiness: This guidebook supports personnel in the development and testing of autonomous weapon systems that employ ML, focusing on system reliability and operator trust.