Guppy: A Scalable Security Data Lake
Software Engineering Institute
This presentation was given at FloCon 2023, an annual conference that focuses on applying any and all collected data to defend enterprise networks.
Managing security big data is an emerging challenge for large-scale enterprises worldwide. Security data generated by enterprise assets is usually needed by multiple enterprise security platforms (including SIEM, analytic, forensic, governance, and compliance tools). A Security Data Lake (SDL) can meet these requirements by facilitating seamless security data collection, analysis, and information sharing between different enterprise security functions. Adopting an SDL will be a key requirement for threat hunters who need to run machine learning and deep learning algorithms on security datasets in support of their threat hunting operations. This paper describes a generic and scalable Security Data Lake framework, named Guppy, that is tuned to handle all types of security data (events, logs, alerts) using available open-source industry tools and that can integrate with leading ML/AI-enabled security platforms.
Attendees will learn why they should consider replacing their basic log management solutions with modern security data lakes to supercharge machine-learning-based threat hunting operations.