Blog Posts
Beyond Capable: Accuracy, Calibration, and Robustness in Large Language Models
For any organization seeking to responsibly harness the potential of large language models, we present a holistic approach to LLM evaluation that goes beyond accuracy.
By Matthew Walsh, David Schulker, Shing-hon Lau
In Artificial Intelligence Engineering
GenAI for Code Review of C++ and Java
Would ChatGPT-3.5 and ChatGPT-4o correctly identify errors in noncompliant code and correctly recognize compliant code as error-free?
By David Schulker
In Artificial Intelligence Engineering
Introduction to MLOps: Bridging Machine Learning and Operations
Machine learning operations (MLOps) has emerged as a critical discipline in artificial intelligence and data science. This post introduces MLOps and its applications.
By Daniel DeCapria
In Artificial Intelligence Engineering
Measuring AI Accuracy with the AI Robustness (AIR) Tool
Understanding your artificial intelligence (AI) system’s predictions can be challenging. In this post, SEI researchers discuss a new tool to help improve AI classifier performance.
By Michael D. Konrad, Nicholas Testa, Linda Parker Gates, Crisanne Nolan, David James Shepard, Julie B. Cohen, Andrew O. Mellinger, Suzanne Miller, Melissa Ludwick
In Artificial Intelligence Engineering
Generative AI and Software Engineering Education
Educators have had to adapt to rapid developments in generative AI to give their students a realistic perspective. In this post, experts discuss generative AI and software engineering education.
By Ipek Ozkaya, Douglas Schmidt (Vanderbilt University)
In Artificial Intelligence Engineering
3 Recommendations for Machine Unlearning Evaluation Challenges
Machine unlearning (MU) aims to develop methods that remove data points efficiently and effectively from a model without the need for extensive retraining. This post details our work to address machine unlearning evaluation challenges.
By Keltin Grimes, Collin Abidi, Cole Frank, Shannon Gallagher
In Artificial Intelligence Engineering
Weaknesses and Vulnerabilities in Modern AI: AI Risk, Cyber Risk, and Planning for Test and Evaluation
Modern AI systems pose consequential, poorly understood risks. This blog post explores strategies for framing test and evaluation practices based on a holistic approach to AI risk.
By Bill Scherlis
In Artificial Intelligence Engineering
Weaknesses and Vulnerabilities in Modern AI: Integrity, Confidentiality, and Governance
In the rush to develop AI, it is easy to overlook factors that increase risk. This post explores AI risk through the lens of confidentiality, governance, and integrity.
By Bill Scherlis
In Artificial Intelligence Engineering
Weaknesses and Vulnerabilities in Modern AI: Why Security and Safety Are So Challenging
This post explores concepts of security and safety for neural-network-based AI, including ML and generative AI, as well as AI-specific challenges in developing safe and secure systems.
By Bill Scherlis
In Artificial Intelligence Engineering
Auditing Bias in Large Language Models
This post discusses recent research that uses a role-playing scenario to audit ChatGPT, an approach that opens new possibilities for revealing unwanted biases.