Automating Mismatch Detection in ML Systems
Created October 2022 • Updated April 2025
Organizations often face challenges in deploying mission-critical systems, such as those for national security, that rely on Machine Learning (ML) capabilities. One of those challenges is ML Mismatch, which occurs in the development, deployment, or operation of an ML-enabled system when different stakeholders make incorrect assumptions. ML Mismatch can result in unnecessary rework, project delays, poor ML component performance, incompatible interfaces between components, system failure, or other negative consequences.
The SEI TEC ML Mismatch Detection Tool (TEC tool) allows users to spot and avoid ML Mismatch early in the ML system development cycle.
Teams in ML System Development Face Communication Difficulties
In ML system development, three disciplines with different perspectives and workflows share responsibility for developing and deploying the system. Data science or ML engineering teams build the model; software engineering teams integrate the model into an ML system; and operations staff deploy, operate, and monitor the system in production. When these teams lack a shared system context, their differing assumptions allow ML Mismatch to occur.
An ML Mismatch can be traced back to information that, had it been shared among stakeholders, would have avoided the problem, and the later a mismatch is discovered in development or after deployment, the more it costs in lost time and rework. In previous work, the SEI team conducted an empirical study to identify and characterize ML Mismatch, collecting 140 mismatch examples that yielded 232 instances of information that was not communicated and led to a mismatch. From these, the SEI team identified eight groups of mismatches (e.g., training data, operational environment). For each group, SEI researchers specified an extensible, machine-readable descriptor using JSON Schema, a widely used standard for describing and validating JSON data.
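To illustrate the idea, a descriptor for the trained-model group might capture attributes such as input format and resource requirements. The sketch below shows how such a descriptor could be expressed and validated with the Python jsonschema library; the field names are hypothetical assumptions for illustration, not the actual TEC descriptor schema.

```python
# Hypothetical sketch of a machine-readable descriptor and its JSON Schema.
# Field names are illustrative assumptions, not the actual TEC descriptor schema.
from jsonschema import validate

TRAINED_MODEL_SCHEMA = {
    "type": "object",
    "properties": {
        "model_name": {"type": "string"},
        "input_format": {"type": "string"},
        "required_memory_gb": {"type": "number"},
        "inference_latency_ms": {"type": "number"},
    },
    "required": ["model_name", "input_format", "required_memory_gb"],
}

# Descriptor filled in by the data science / ML engineering team.
trained_model_descriptor = {
    "model_name": "image-classifier-v2",
    "input_format": "224x224 RGB tensor",
    "required_memory_gb": 8,
    "inference_latency_ms": 35,
}

# Raises jsonschema.ValidationError if the descriptor is malformed or incomplete.
validate(instance=trained_model_descriptor, schema=TRAINED_MODEL_SCHEMA)
```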
TEC ML Mismatch Detection Tool Enables ML System Teams to Avoid Costly Mismatches
The SEI developed the TEC tool, which enables different stakeholders to record descriptor information and run an analysis that automatically detects potential mismatches. As a result, all stakeholders make decisions based on the same set of descriptors and share their expectations, such as required computing resources or runtime metrics, with the other teams.
Armed with the TEC tool's analysis, stakeholders can identify, early in the process, the system attributes (e.g., available computing resources) that need to align to prevent mismatches, and then resolve problematic differences. The result is less rework, fewer delays, and better ML system performance.
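A minimal sketch of the kind of cross-descriptor check such an analysis performs appears below. It compares a hypothetical trained-model descriptor against an operational-environment descriptor and flags mismatched or missing attributes; the field names and comparison rules are assumptions for illustration, not the TEC tool's actual logic.

```python
# Minimal sketch of a cross-descriptor mismatch check.
# Field names and rules are illustrative assumptions, not the TEC tool's actual logic.

def detect_mismatches(trained_model: dict, operational_env: dict) -> list[str]:
    """Compare two descriptors and return human-readable mismatch findings."""
    findings = []

    # Missing information is itself reported, since unshared assumptions cause mismatches.
    for field in ("required_memory_gb", "inference_latency_ms"):
        if field not in trained_model:
            findings.append(f"trained-model descriptor is missing '{field}'")

    # Computing resources: the model must fit within what operations provides.
    if trained_model.get("required_memory_gb", 0) > operational_env.get("available_memory_gb", 0):
        findings.append("model requires more memory than the operational environment provides")

    # Runtime metrics: inference latency must meet the system's latency budget.
    if trained_model.get("inference_latency_ms", 0) > operational_env.get("latency_budget_ms", float("inf")):
        findings.append("model inference latency exceeds the system latency budget")

    return findings


# Descriptors filled in independently by the ML engineering and operations teams.
trained_model = {"required_memory_gb": 8, "inference_latency_ms": 35}
operational_env = {"available_memory_gb": 4, "latency_budget_ms": 50}

for finding in detect_mismatches(trained_model, operational_env):
    print("MISMATCH:", finding)
```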
Software and Tools
TEC ML Mismatch Detection Tool
The TEC tool compares information across descriptors and flags any mismatches or missing information.
Looking Ahead
The SEI invites users to download the TEC tool, apply it in an ML system development project, and provide feedback. Also contact us if your organization
- wants to incorporate the TEC tool and its accompanying best practices into its ML system development workflows
- identifies important features to add to the tool. Users are free to modify the open source TEC software and share those modifications with the SEI
Learn More
Software Engineering for Machine Learning: Characterizing and Detecting Mismatch in Machine-Learning Systems
Blog Post
This post describes how we are creating and assessing empirically validated practices to guide the development of machine-learning-enabled systems.
Detecting Mismatches in Machine-Learning Systems
Blog Post
The use of machine learning (ML) could improve many business functions and meet many needs for organizations. For example, ML capabilities can be used to suggest products to users based on purchase history....