Using Generative Adversarial Networks to Improve Phishing Domain Classifiers
As machine learning classifiers are increasingly deployed for defensive cybersecurity, interest is growing in adversarial machine learning as a way to make these classifiers safe to use. One area of focus is building classifiers that are robust to evasion attacks, in which adversarial examples are specifically crafted to defeat a machine learning model.
In this presentation, the authors explore the use of generative adversarial networks (GANs) to construct synthetic phishing domains as potential evasion attacks, and test the value of including these generated domains in the training set of a machine learning classifier designed to distinguish phishing from non-phishing domains. Specifically, the authors test the hypothesis that training a classifier on an augmented data set that includes generated domains yields a more robust classifier for identifying phishing domains. To test this hypothesis, the authors construct several random forest classifiers, all of which use the same set of hand-engineered features.
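The experimental setup described above can be illustrated with a minimal sketch. The abstract does not list the hand-engineered features, so the five used here (length, label count, digit count, hyphen count, character entropy) are assumptions chosen for illustration, as are the function names and the sample domains. The sketch only shows how a baseline training set and an augmented one could share a single feature extractor; it is not the authors' implementation.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> List[float]:
    """Map a domain name to a fixed-length feature vector.

    These features are hypothetical; the source does not say which
    hand-engineered features the authors used.
    """
    name = domain.lower().strip(".")
    labels = name.split(".")
    return [
        float(len(name)),                         # total length
        float(len(labels)),                       # number of dot-separated labels
        float(sum(ch.isdigit() for ch in name)),  # digit count
        float(name.count("-")),                   # hyphen count
        shannon_entropy(name),                    # character entropy
    ]


def build_training_set(
    benign: Iterable[str],
    phishing: Iterable[str],
    generated: Iterable[str] = (),
) -> Tuple[List[List[float]], List[int]]:
    """Return (X, y) with label 0 for benign and 1 for phishing.

    Generated domains, if supplied, are appended with the phishing label,
    which is the augmentation step assumed in this sketch.
    """
    X: List[List[float]] = []
    y: List[int] = []
    for d in benign:
        X.append(domain_features(d))
        y.append(0)
    for d in list(phishing) + list(generated):
        X.append(domain_features(d))
        y.append(1)
    return X, y


# Placeholder domains for illustration only.
benign = ["example.com", "example.org"]
phishing = ["examp1e-login.com"]
generated = ["exarnple-secure.net"]  # stands in for GAN output

X_base, y_base = build_training_set(benign, phishing)
X_aug, y_aug = build_training_set(benign, phishing, generated)
```

Because both training sets pass through the same `domain_features` function, any difference between a classifier fitted on `(X_base, y_base)` and one fitted on `(X_aug, y_aug)` can be attributed to the added generated domains rather than to the features. Either pair could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`.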