When Threat Hunting Fails: Identifying Malvertising Domains Using Lexical Clustering

January 30, 2018 • Presentation

By

Matt Foley (Cisco Systems, Inc.), David Rodriguez (Cisco Systems, Inc.), and Dhia Mahjoub (OpenDNS)

In this presentation, the authors discuss the current malvertising threat landscape: ad networks, exchanges, exploits, and popular infection points.

Publisher

Cisco Systems, Inc.

Topic or Tag

Situational Awareness

Abstract

In this presentation, the authors introduce a real-time streaming pipeline built on Kafka that aims to stem the initial attack observable in DNS logs. The pipeline applies a scalable clustering technique, locality-sensitive hashing (LSH), to hostnames in order to identify permutations of words and characters such as “software”, “update”, “tech”, and “support”. The authors then discuss a novel belief propagation algorithm over a client-hostname bipartite graph that surfaces the related file hosts that lie behind malicious advertisements. Finally, they disclose the anatomy of a malicious advertising campaign and show how file hosts are often reused across malvertising campaigns.
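To make the lexical clustering step concrete, the following is a minimal sketch of MinHash-based LSH over hostnames. It is not the authors' pipeline: the character trigram shingling, the 32 hash functions, the 16-band layout, the function names, and the `.example` hostnames are all assumptions chosen for illustration, and the Kafka streaming layer and the belief propagation step are omitted.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

# Assumed parameters (not from the presentation): 32 hashes in 16 bands of 2 rows.
NUM_HASHES = 32
BANDS = 16
ROWS = NUM_HASHES // BANDS


def shingles(hostname, n=3):
    """Return the set of lowercase character n-grams in a hostname."""
    s = hostname.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)} or {s}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token, salted by the hash-function index."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(tokens):
    """MinHash signature: the minimum hash value under each salted hash."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES))


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)


def candidate_pairs(hostnames):
    """Return pairs of hostnames that share at least one LSH band bucket."""
    buckets = defaultdict(set)
    for host in hostnames:
        sig = minhash_signature(shingles(host))
        for band in range(BANDS):
            key = (band, sig[band * ROWS:(band + 1) * ROWS])
            buckets[key].add(host)
    pairs = set()
    for members in buckets.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical hostnames for illustration only.
    hosts = [
        "software-update-tech.example",
        "tech-software-update.example",
        "support-update-software.example",
        "weather-forecast.example",
    ]
    for a, b in sorted(candidate_pairs(hosts)):
        print(a, b, round(jaccard(shingles(a), shingles(b)), 2))
```

Banding trades precision for recall: fewer rows per band makes lexically similar hostnames more likely to collide in some bucket, at the cost of more false candidates, which is why the sketch rechecks each pair with exact Jaccard similarity. In a streaming setting, the bucket table would presumably live in a shared store rather than in process memory.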

Software Engineering Institute