
Self-Modulating Endpoint Observability

Frederico Araujo and Teryl Taylor delivered this presentation at FloCon 2022 on January 13, 2022. Watch the video and download the slides.

Software Engineering Institute




Modern microservices leverage cloud-native capabilities to automatically manage workload deployment and scalability. In this model, end-to-end visibility becomes more complex, endpoint-centric, and difficult to achieve, especially when deployment and upgrades are continuous. Endpoint monitors, therefore, need to adapt to unpredictable workloads without jeopardizing the performance and stability of the production environment.

Typically, these monitors collect system telemetry data, such as application logs or system calls. Collecting system call data is advantageous because it can provide a detailed view of each process running on a host. Unfortunately, this data is often too large to store and analyze in any meaningful way, forcing practitioners to write complex filtering rules to make the telemetry footprint more manageable. Such filters require constant tuning as new applications and updates are deployed, and often result in important contextual attack data being filtered out needlessly. Telemetry formats such as SysFlow make data collection more palatable by lifting raw system call information into a more semantic summarization of system behaviors. However, SysFlow does not maintain enough system state to further reduce noisy, redundant process behaviors that can occlude security analyses and create long-term storage headaches.
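To make the idea of lifting raw system calls into summarized behaviors concrete, here is a minimal sketch in Python. It is not SysFlow's schema or API: the event fields (`pid`, `op`, `path`, `nbytes`) and the function `summarize` are hypothetical names chosen for illustration. The sketch simply folds many per-call events into one record per process and resource.

```python
def summarize(events):
    """Fold raw per-call events into one summary record per (pid, path).

    Hypothetical event shape (an assumption, not a real telemetry format):
        {"pid": int, "op": str, "path": str, "nbytes": int}
    """
    flows = {}
    for ev in events:
        key = (ev["pid"], ev["path"])
        flow = flows.setdefault(key, {
            "pid": ev["pid"],
            "path": ev["path"],
            "ops": set(),          # distinct operations seen on this resource
            "bytes_read": 0,
            "bytes_written": 0,
            "num_events": 0,       # how many raw events were folded in
        })
        flow["ops"].add(ev["op"])
        flow["num_events"] += 1
        if ev["op"] == "read":
            flow["bytes_read"] += ev["nbytes"]
        elif ev["op"] == "write":
            flow["bytes_written"] += ev["nbytes"]
    return list(flows.values())


raw_events = [
    {"pid": 100, "op": "open", "path": "/var/log/app.log", "nbytes": 0},
    {"pid": 100, "op": "write", "path": "/var/log/app.log", "nbytes": 512},
    {"pid": 100, "op": "write", "path": "/var/log/app.log", "nbytes": 256},
    {"pid": 200, "op": "read", "path": "/etc/passwd", "nbytes": 1024},
]
summary = summarize(raw_events)
```

Four raw events collapse into two summary records here. The reduction is real but stateless across batches, which is the limitation the abstract points to: repeated, redundant behaviors still produce new records every time they recur.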

In this talk, we describe a new system graph data model that encodes process behaviors into hierarchical, summarized views of system activity. We then demonstrate how this data structure can be used to implement a self-modulating telemetry stream that adapts to the monitored environment, drastically reduces event fatigue, optimizes storage, and provides important contextual information for security investigations.
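One way to picture a self-modulating stream is sketched below in Python. This is an illustrative assumption, not the presenters' data model: the hierarchy (container, executable, behavior), the class names `BehaviorNode` and `SelfModulatingStream`, and the power-of-two emission rule are all hypothetical. The sketch keeps per-behavior state in a small tree and emits a repeated behavior less and less often, while a never-before-seen behavior is always emitted.

```python
from dataclasses import dataclass, field


@dataclass
class BehaviorNode:
    """One node of the hierarchy: root, container, process, or behavior."""
    count: int = 0
    children: dict = field(default_factory=dict)

    def child(self, key):
        # Create the child on first sight, otherwise return the existing one.
        return self.children.setdefault(key, BehaviorNode())


class SelfModulatingStream:
    """Hypothetical sketch: suppress redundant events using graph state."""

    def __init__(self):
        self.root = BehaviorNode()

    def observe(self, container, exe, op, resource):
        """Record one event; return True if it should be emitted downstream."""
        node = self.root
        node.count += 1
        # Walk container -> executable -> behavior, counting at every level
        # so each level doubles as a summarized view of activity below it.
        for key in (container, exe, (op, resource)):
            node = node.child(key)
            node.count += 1
        # Emit on the 1st, 2nd, 4th, 8th, ... occurrence of this behavior
        # (assumed backoff rule: count is a power of two).
        n = node.count
        return (n & (n - 1)) == 0


stream = SelfModulatingStream()
emitted = [
    stream.observe("web", "nginx", "open", "/etc/nginx/nginx.conf")
    for _ in range(10)
]
novel = stream.observe("web", "nginx", "exec", "/bin/sh")
```

In this run, ten identical events yield four emissions, and the new `exec` behavior is emitted immediately. The counts retained at the container and process levels give the summarized context that a stateless filter would discard.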

Attendees Will Learn:

  • Challenges in system telemetry for distributed cloud endpoints
  • A data modeling approach to address event fatigue in system call monitoring
  • Design principles for cloud-native observability

Dr. Frederico Araujo is a Research Scientist at IBM Research, where he leads the team's efforts on cloud-native security. He is an active contributor to open source, a maintainer of the SysFlow project, and a contributor to CNCF's Falco project. His work has been featured in top academic and industry security conferences and has won several best paper awards, including the Best Applied Security Research Award at CSAW. Fred has over 15 years of professional experience and holds a Ph.D. from the University of Texas at Dallas, where he was awarded the Best Dissertation Award for his research on engineering cyber-deceptive software.

Dr. Teryl Taylor is a Research Staff Member in the Cognitive Cybersecurity Intelligence Group at IBM Research. He has ten years of experience in cybersecurity-related research, including NetFlow-based analytics, system telemetry and analytics, security visualization, and cyber deception. Currently, Teryl is a lead researcher on the SysFlow project and innovates in security techniques for the cloud. He has presented at numerous top security and telemetry conferences, including NDSS, CODASPY, USENIX ATC, BigData, FloCon, and VizSec.


Part of a Collection

FloCon 2022 Assets

This content was created for a conference series or symposium and does not necessarily reflect the positions and views of the Software Engineering Institute.