Collaborative Botnet Detection through Large-scale Network Traffic

January 31, 2022 • Video

Fabien Bignon, Bo Hu, and Karel Mittig delivered this presentation at FloCon 2022 on January 11, 2022.

Publisher

Software Engineering Institute

Topic or Tag

FloCon

Abstract

Botnets have become a serious threat to the whole Internet. Many network operators and ISPs share the goal of detecting them and have developed their own techniques to do so. Comprehensive botnet detection, however, requires a collaborative approach involving multiple network operators and ISPs. This is difficult because the techniques and intelligence used by each operator and ISP differ and are hard to align. Existing collaborations are limited to information exchanges.

This presentation shares new approaches to cybersecurity collaboration. We hope that, through such collaborations, attendees can work together to enrich and extend threat intelligence for more comprehensive detection.

A botnet is a group of malware-infected devices which are used for various cyber attacks. Recently, botnets have evolved to employ globally distributed architectures. To comprehensively detect botnets, collaborations among multiple ISPs are required. However, this is challenging, since each ISP has different techniques and threat intelligence (TI). Existing collaborations are limited to information exchanges.

In joint experiments, NTT and Orange introduce new approaches for deeper collaboration among ISPs. With these approaches, each party enriches and extends the other's TI using its own traffic data, while preserving the privacy and confidentiality of communications.

At NTT, machine learning (ML) has shown promise for traffic analysis and botnet detection. However, when applying conventional techniques to real-world traffic, we faced two challenges: the sheer amount of input data and the wide variety of output features needed to represent global access patterns. To address these challenges, NTT has developed Piper, an ML pipeline that consolidates diversified ML applications efficiently. With its proposed sampling techniques and global traffic features, Piper can process traffic data faster and detect malicious hosts more accurately.

At Orange, TI has become a major concern, necessary to protect the core network as well as enterprise and consumer markets. While big data offers major advances in this area, it also raises key issues regarding the ability to accurately contextualize threats and calculate their lifespan for real-time detection. To cope with these issues, the Voodoo project proposes a novel hybrid approach that combines several techniques (fast-data analysis, machine/deep learning, expert systems, and reinforcement learning). Using historical and real-time threat data, Voodoo trains models that are used to track malicious activity and proactively discover malicious servers on the Internet. The hybrid AI then filters and produces valuable TI, which can be exported as a network indicator threat stream.

The new approaches are as follows.
- Exchange: Orange shares Voodoo’s malicious IP addresses with NTT.
- Enrich: After checking third-party reputations, NTT uses Piper to extract traffic-based behavioral features (e.g., bytes/packet) for the shared malicious IP addresses and sends the features back to Voodoo for enrichment.
- Predict: NTT uses Voodoo’s malicious IP addresses as labels and extracts traffic-based features to build a classifier and predict unknown IP addresses. After double-checking, NTT shares the new malicious IP addresses back with Voodoo.
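
The Enrich and Predict steps can be illustrated with a minimal Python sketch. It is not the Piper or Voodoo implementation, which the abstract does not describe. Everything in it is an assumption made for illustration: the `Flow` record layout, the two features (bytes per packet and packets per flow), the use of a separate benign-labeled IP list as negative examples, and the nearest-centroid classifier standing in for whatever model NTT actually trains.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; dst_ip is the remote host being profiled."""
    src_ip: str
    dst_ip: str
    packets: int
    byte_count: int


def extract_features(flows):
    """Enrich step (sketch): aggregate flows per remote IP and return
    {ip: (bytes_per_packet, packets_per_flow)}."""
    totals = defaultdict(lambda: [0, 0, 0])  # [flows, packets, bytes]
    for flow in flows:
        entry = totals[flow.dst_ip]
        entry[0] += 1
        entry[1] += flow.packets
        entry[2] += flow.byte_count
    # Skip hosts with zero packets to avoid dividing by zero.
    return {
        ip: (total_bytes / packets, packets / n_flows)
        for ip, (n_flows, packets, total_bytes) in totals.items()
        if packets > 0
    }


def centroid(vectors):
    """Component-wise mean of a non-empty collection of feature vectors."""
    vectors = list(vectors)
    if not vectors:
        raise ValueError("no labeled feature vectors available")
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def predict_malicious(features, malicious_ips, benign_ips):
    """Predict step (sketch): flag unlabeled IPs whose features lie closer
    to the malicious centroid than to the benign centroid."""
    mal_c = centroid(features[ip] for ip in malicious_ips if ip in features)
    ben_c = centroid(features[ip] for ip in benign_ips if ip in features)
    labeled = set(malicious_ips) | set(benign_ips)
    return sorted(
        ip
        for ip, vector in features.items()
        if ip not in labeled and dist(vector, mal_c) < dist(vector, ben_c)
    )
```

In this sketch the shared malicious IP list plays the role of the positive labels, and the returned candidates correspond to the addresses that would still be double-checked before being shared back. The features are left unscaled for brevity; a real pipeline would likely normalize them so that no single feature dominates the distance.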

Voodoo will use the feedback from NTT, through reinforcement learning, to improve the models it generates and the data it produces, leading to a virtuous circle of collaboration.
