The Security Wolf of Wall Street: Fighting Crime with High-Frequency Classification and Natural Language Processing
In a world where threat actors move fast and the Internet evolves in a nondeterministic fashion, turning threat intelligence into automated protection has proven to be a challenge for the information security industry. While traditional threat research methods will never go away, there is an increasing need for powerful decision models that can process data in real time and scale to incorporate increasingly rich sources of threat intelligence. In this presentation, given at FloCon 2016, the authors focus on one way to build a scalable machine learning infrastructure that operates in real time on a massive volume of DNS data (approximately 80 billion queries per day). The authors offer a sneak peek into how OpenDNS does scalable data science and touch on two core components, Big Data engineering and Big Data science, discussing specifically how they are used to implement real-time threat detection systems for large-scale network traffic.