
Automating Mismatch Detection in ML Systems

Created October 2022 • Updated April 2025

Organizations often face challenges in deploying mission-critical systems, such as those for national security, that rely on machine learning (ML) capabilities. One of those challenges is ML Mismatch, which occurs during the development, deployment, or operation of an ML-enabled system when different stakeholders make incorrect assumptions about the same system. ML Mismatch can result in unnecessary rework, project delays, poor ML component performance, incompatible interfaces between components, system failure, or other negative consequences.

The SEI TEC ML Mismatch Detection Tool (the TEC tool) allows users to spot and avoid ML Mismatch early in the ML system development cycle.

Teams in ML System Development Face Communication Difficulties

In ML system development, three disciplines with different perspectives and workflows share responsibility for developing and deploying the system. Data science or ML engineering teams build the model; software engineering teams integrate the model into an ML system; and operations staff deploy, operate, and monitor the system in production. When these teams lack a shared view of the system context, their differing assumptions open the door to ML Mismatch.

An ML Mismatch can be traced back to information that, had it been shared between stakeholders, would have avoided the problem. Mismatches discovered late in development or after deployment are more expensive to fix, in both lost time and added cost. In previous work, the SEI team conducted an empirical study to identify and characterize ML Mismatch, collecting 140 mismatch examples that yielded 232 instances of information that was not communicated and led to a mismatch. From these, the SEI team identified eight groups of mismatches (e.g., training data, operational environment). For each group, SEI researchers specified an extensible, machine-readable descriptor using JSON Schema, a popular format for describing and validating structured data.
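As a rough illustration of the idea, the sketch below shows how a machine-readable descriptor might be expressed and checked against a JSON Schema from Python. It is not the TEC tool's actual descriptor definition; the field names here are hypothetical.

```python
# Hypothetical sketch: a simplified "training data" descriptor checked
# against a JSON Schema. Field names are illustrative, not the TEC tool's.
from jsonschema import validate, ValidationError

TRAINING_DATA_SCHEMA = {
    "type": "object",
    "properties": {
        "dataset_name": {"type": "string"},
        "num_samples": {"type": "integer", "minimum": 1},
        "features": {"type": "array", "items": {"type": "string"}},
        "collection_date": {"type": "string", "format": "date"},
    },
    "required": ["dataset_name", "num_samples", "features"],
    "additionalProperties": True,  # descriptors are meant to be extensible
}

descriptor = {
    "dataset_name": "sensor-logs-v2",
    "num_samples": 120000,
    "features": ["timestamp", "sensor_id", "reading"],
}

try:
    validate(instance=descriptor, schema=TRAINING_DATA_SCHEMA)
    print("Descriptor is well formed.")
except ValidationError as err:
    print(f"Descriptor problem: {err.message}")
```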


TEC ML Mismatch Detection Tool Enables ML System Teams to Avoid Costly Mismatches

The SEI developed the TEC tool, which enables different stakeholders to record descriptor information and run an analysis that automatically detects potential mismatches. As a result, all stakeholders make decisions based on the same set of descriptors and share their expectations, such as required computing resources or runtime metrics, with the other teams.

Armed with the TEC tool’s analysis, stakeholders can identify early in the process which system attributes (e.g., available computing resources) need to align to prevent mismatches, and they can resolve problematic differences before they become costly. The result is less rework, fewer delays, and better ML system performance.
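The following minimal sketch illustrates the kind of cross-descriptor comparison such an analysis performs. It is not the TEC tool's implementation; the descriptor fields and the simple equality check are assumptions made for the example.

```python
# Hypothetical sketch of cross-descriptor mismatch detection.
# Each team fills in a descriptor; shared attributes are compared and
# any disagreement or missing value is flagged. Field names are illustrative.

def detect_mismatches(descriptors: dict[str, dict], shared_attrs: list[str]) -> list[str]:
    """Flag attributes that are missing or that differ across descriptors."""
    findings = []
    for attr in shared_attrs:
        values = {}
        for team, desc in descriptors.items():
            if attr not in desc:
                findings.append(f"MISSING: '{attr}' not provided by {team}")
            else:
                values[team] = desc[attr]
        if len(set(values.values())) > 1:
            findings.append(f"MISMATCH: '{attr}' differs across teams: {values}")
    return findings


descriptors = {
    "ml_component": {"gpu_required": True, "latency_ms": 50},
    "production_env": {"gpu_required": False},  # no latency expectation recorded
}

for finding in detect_mismatches(descriptors, ["gpu_required", "latency_ms"]):
    print(finding)
```

Running the example flags the disagreement about GPU availability and the latency expectation that one team never recorded, which is the kind of early signal the analysis is meant to surface.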


Software and Tools

TEC ML Mismatch Detection Tool

The TEC tool compares information across descriptors and flags any mismatches or missing information.


Looking Ahead

The SEI invites users to download the TEC tool, apply it in an ML system development project, and provide feedback. Also contact us if your organization

  • wants to incorporate the TEC tool and its accompanying best practices into its ML system development workflows
  • identifies important features to add to the tool; users are free to modify the open source TEC software and share those modifications with the SEI

Learn More

Software Engineering for Machine Learning: Characterizing and Detecting Mismatch in Machine-Learning Systems

Blog Post

This post describes how we are creating and assessing empirically validated practices to guide the development of machine-learning-enabled systems.


Detecting Mismatches in Machine-Learning Systems

Blog Post

The use of machine learning (ML) could improve many business functions and meet many needs for organizations. For example, ML capabilities can be used to suggest products to users based on purchase history....
