Blog Posts
Measuring AI Accuracy with the AI Robustness (AIR) Tool
Understanding your artificial intelligence (AI) system’s predictions can be challenging. In this post, SEI researchers discuss a new tool to help improve AI classifier performance.
• By Michael D. Konrad, Nicholas Testa, Linda Parker Gates, Crisanne Nolan, David James Shepard, Julie B. Cohen, Andrew O. Mellinger, Suzanne Miller, Melissa Ludwick
In Artificial Intelligence Engineering
Generative AI and Software Engineering Education
Educators have had to adapt to rapid developments in generative AI to provide a realistic perspective to their students. In this post, experts discuss generative AI and software engineering education.
• By Ipek Ozkaya, Douglas Schmidt (Vanderbilt University)
In Artificial Intelligence Engineering
3 Recommendations for Machine Unlearning Evaluation Challenges
Machine unlearning (MU) aims to develop methods to remove data points efficiently and effectively from a model without the need for extensive retraining. This post details our work to address …
• By Keltin Grimes, Collin Abidi, Cole Frank, Shannon Gallagher
In Artificial Intelligence Engineering
Weaknesses and Vulnerabilities in Modern AI: AI Risk, Cyber Risk, and Planning for Test and Evaluation
Modern AI systems pose consequential, poorly understood risks. This blog post explores strategies for framing test and evaluation practices based on a holistic approach to AI risk.
• By Bill Scherlis
In Artificial Intelligence Engineering
Weaknesses and Vulnerabilities in Modern AI: Integrity, Confidentiality, and Governance
In the rush to develop AI, it is easy to overlook factors that increase risk. This post explores AI risk through the lens of confidentiality, governance, and integrity.
• By Bill Scherlis
In Artificial Intelligence Engineering
Weaknesses and Vulnerabilities in Modern AI: Why Security and Safety Are so Challenging
This post explores concepts of security and safety for neural-network-based AI, including ML and generative AI, as well as AI-specific challenges in developing safe and secure systems.
• By Bill Scherlis
In Artificial Intelligence Engineering
Auditing Bias in Large Language Models
This post discusses recent research that uses a role-playing scenario to audit ChatGPT, an approach that opens new possibilities for revealing unwanted biases.
• By Katherine-Marie Robinson, Violet Turri
In Artificial Intelligence Engineering
Cost-Effective AI Infrastructure: 5 Lessons Learned
This post details the challenges and state of the art of cost-effective AI infrastructure, along with five lessons learned for standing up an LLM.
• By William Nichols, Bryan Brown
In Artificial Intelligence Engineering
Applying Large Language Models to DoD Software Acquisition: An Initial Experiment
This SEI Blog post presents examples of using LLMs for software acquisition in the context of a document summarization experiment and codifies the lessons learned from this experiment and related …
• By Douglas Schmidt (Vanderbilt University), John E. Robert
In Artificial Intelligence Engineering
OpenAI Collaboration Yields 14 Recommendations for Evaluating LLMs for Cybersecurity
This SEI Blog post summarizes 14 recommendations to help assessors accurately evaluate LLM cybersecurity capabilities.