
Amateur Hour to Alerting Power: Overcoming Challenges in Constructing a PII Observability Pipeline

Presentation
This session was presented by Malav Modi of WeightWatchers at DevSecOps Days Washington D.C., held on Thursday, October 12.
Publisher

Software Engineering Institute

Abstract

In today's big-data-driven world, ensuring the confidentiality, integrity, and availability of sensitive information across the many media it passes through is extremely important. In recent years, PII identification and triage at scale have advanced notably, but the available approaches still show flaws when applied to different business contexts.

This talk tells the story of the trials and tribulations of developing an internal pipeline designed to identify Personally Identifiable Information (PII) in voice transcriptions. It also explores the current state of data loss prevention techniques, covering both AI-based and traditional pattern-based methodologies.

Within the context of a relatable business scenario, I will take a deep dive from problem to solution, walking through the iterations of the pipeline's architecture. This includes, but is not limited to, addressing scalability, flexibility, and alerting considerations. The talk also emphasizes compliance with regulatory requirements and the importance of proactive analysis and triage once PII is discovered.

Attendees will leave this presentation with valuable insights into the intricacies of PII identification and data loss prevention. Through a deep dive into a small-scale project, participants will gain practical knowledge and a comprehensive understanding of the challenges, solutions, and trade-offs involved.

In conclusion, this presentation serves as a TL;DR for attendees who want to enhance their understanding of PII identification and data loss prevention. By delving into a small-scale project and examining the intricacies of data handling, the integration of advanced techniques, and the broader ecosystem of business context factors, participants will come away with both practical knowledge and a comprehensive perspective on tackling an industry problem in a unique way.