Posted by Network Situational Awareness
One of my responsibilities on the Situational Awareness Analysis team is to create analytics for various purposes. For the past few weeks, I've been working on some anomaly detection analytics for hunting in the network flow traffic of common network services. I decided to start with a very simple approach using mean and standard deviation for a historical period to create a profile that I could compare against current volumes. To do this, I planned on binning network traffic by some length of time to find time periods with anomalous volumes. The question I then had to answer was, "How should I define the historical period?" In this post, I explain the process I used to answer that question.
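To define the historical period, I needed to answer several questions: how long each time bin should be, how many bins the history should contain, and how those bins should be chosen.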
The range of possible time bin lengths depends on how finely your network flow collector records time, and runs from milliseconds up to weeks, months, or even a year. However, none of those extremes makes much sense in practice. I decided to look at two lengths for time bins: an hour and a day.
In general, shorter time bins are better suited to detecting short-duration anomalies of high magnitude, while longer time bins are better suited to detecting long-duration anomalies of low magnitude.
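A small worked example may make this trade-off concrete. The sketch below uses entirely made-up flow counts (a baseline near 100 flows per hour with an alternating pattern and a hypothetical per-day offset) and scores two invented anomalies by how many standard deviations they sit from the historical mean. The `z_score` helper and all numbers are assumptions for illustration, not values from a real network.

```python
from statistics import mean, stdev

def z_score(history, current):
    """Number of sample standard deviations `current` sits from the history mean."""
    return (current - mean(history)) / stdev(history)

# Hypothetical per-day offsets so that daily totals are not all identical.
DAY_OFFSETS = [-5, 3, 0, 4, -2, -4, 4]

# Seven days of made-up hourly flow counts: 90/110 alternating, plus the day offset.
hourly_history = [
    100 + (10 if hour % 2 else -10) + offset
    for offset in DAY_OFFSETS
    for hour in range(24)
]
# The same traffic re-binned into seven daily totals.
daily_history = [sum(hourly_history[d * 24:(d + 1) * 24]) for d in range(7)]

normal_day = [100 + (10 if hour % 2 else -10) for hour in range(24)]

# Case 1: a short, large anomaly (200 extra flows in a single hour).
spike_day = list(normal_day)
spike_day[13] += 200
spike_hourly_z = z_score(hourly_history, spike_day[13])
spike_daily_z = z_score(daily_history, sum(spike_day))

# Case 2: a long, small anomaly (15 extra flows in every hour of the day).
slow_day = [volume + 15 for volume in normal_day]
slow_hourly_z = max(z_score(hourly_history, volume) for volume in slow_day)
slow_daily_z = z_score(daily_history, sum(slow_day))

print(f"spike: hourly z = {spike_hourly_z:.1f}, daily z = {spike_daily_z:.1f}")
print(f"slow:  hourly z = {slow_hourly_z:.1f}, daily z = {slow_daily_z:.1f}")
```

With these invented numbers, the one-hour spike scores far above 3 standard deviations in its hourly bin but below 3 in the daily bin, where it is diluted by 23 ordinary hours. The day-long increase does the opposite: no single hour reaches 3, but the daily total does.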
To determine how many bins I needed for the mean and standard deviation calculations, I took three things into consideration:
The concept of seasonality made me realize that it may be possible to improve anomaly detection, even in a simple mean and standard deviation analytic, by thinking about how I chose the time bins for the history. I could choose a history made up of consecutive time bins, or I could choose time bins that correspond in some manner to the current time bin I wanted to evaluate.
For the corresponding method, I considered two options each for hourly and daily time bins. For hourly, I considered using the same hour of the day for some number of consecutive days and using the same hour for the same day of the week for some number of weeks. For daily, I considered using the same day of the week for some number of weeks, and the same day of the month for some number of months.
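The history options described above can be sketched as functions that return the start times of the bins to include. This is only an illustration: the function names are hypothetical, timestamps are assumed to be naive datetimes in UTC (so daylight saving shifts are ignored), and skipping months that lack the requested day is an assumption about how to handle, say, the 31st.

```python
from datetime import datetime, timedelta

def consecutive_hours(current, n_bins):
    """Start times of the n_bins hourly bins immediately before `current`."""
    return [current - timedelta(hours=i) for i in range(n_bins, 0, -1)]

def same_hour_each_day(current, n_days):
    """Same hour of the day on each of the previous n_days days."""
    return [current - timedelta(days=i) for i in range(n_days, 0, -1)]

def same_hour_same_weekday(current, n_weeks):
    """Same hour on the same day of the week for the previous n_weeks weeks.

    Passing a midnight timestamp gives the same-day-of-week option for daily bins.
    """
    return [current - timedelta(weeks=i) for i in range(n_weeks, 0, -1)]

def same_day_of_month(current, n_months):
    """Same day of the month for n_months earlier months (daily bins).

    Assumption: months that do not contain that day are skipped.
    """
    starts = []
    year, month = current.year, current.month
    while len(starts) < n_months:
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        try:
            starts.append(current.replace(year=year, month=month))
        except ValueError:
            continue  # this month has no such day (for example, February 31)
    return starts[::-1]

# Example: history bins for the hour starting 2024-03-13 14:00.
current_bin = datetime(2024, 3, 13, 14)
for start in same_hour_same_weekday(current_bin, 3):
    print(start.isoformat())
```

Each function returns bin start times oldest first, so the result can be used to look up volumes in whatever structure holds the binned counts.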
The same-day-of-month option for daily time bins does not seem like a good choice for most network services and networks. Networks change so rapidly that a history whose newest value is already a month old is unlikely to produce a mean and standard deviation that reflect the current state of the network or the current behavior of its users and services.
The other history options each have their uses. For network services that exhibit little to no seasonality on the network of interest, a consecutive history works well. A consecutive history also works for services that exhibit marked seasonality, but anomalies would need to be of greater magnitude to be detected than if a corresponding method were used. Looking at the same hour for every day works well when a network service has daily seasonality but little to no weekly seasonality. If the service has both daily and weekly seasonality, looking at the same hour of the day for the same day of the week will be most sensitive to anomalies.
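To illustrate why a consecutive history is less sensitive for a strongly seasonal service, here is a minimal sketch on synthetic data. The traffic model is an assumption made up for the example: about 1000 flows per hour during a 09:00 to 18:00 working day, about 100 otherwise, with a small deterministic jitter, and no weekly pattern. The anomaly is a midnight bin with 400 flows.

```python
from statistics import mean, stdev

HOURS_PER_DAY = 24
HISTORY_DAYS = 14

def synthetic_volume(day, hour):
    """Made-up hourly flow count with strong daily seasonality."""
    base = 1000 if 9 <= hour < 18 else 100
    jitter = (day * 7 + hour * 3) % 11 - 5  # deterministic noise in [-5, 5]
    return base + jitter

def z_score(history, current):
    """Number of sample standard deviations `current` sits from the history mean."""
    return (current - mean(history)) / stdev(history)

# 28 days of hourly bins; the bin under evaluation is midnight of the next day.
series = [synthetic_volume(d, h) for d in range(28) for h in range(HOURS_PER_DAY)]
current_hour = 0
current_volume = 400  # four times the usual midnight volume

# Consecutive history: every hourly bin in the previous 14 days.
consecutive = series[-HISTORY_DAYS * HOURS_PER_DAY:]
# Corresponding history: the midnight bin from each of the previous 14 days.
same_hour = series[current_hour::HOURS_PER_DAY][-HISTORY_DAYS:]

z_consecutive = z_score(consecutive, current_volume)
z_same_hour = z_score(same_hour, current_volume)

print(f"consecutive history: z = {z_consecutive:.2f}")
print(f"same-hour history:   z = {z_same_hour:.2f}")
```

Against the consecutive history, the midnight volume of 400 sits close to the overall mean, because the daytime peak inflates both the mean and the standard deviation. Against the same-hour history, the same value is many standard deviations out.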
After exploring different parameters for the history, I decided on hourly bins for a 14-day consecutive history as a quick and simple hunting analytic.
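A minimal sketch of what such an analytic could look like follows. It is not the implementation described in this post: the record format (a start time and a byte count), the function names, the treatment of hours with no records as zero volume, and the threshold of 3 standard deviations are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import mean, stdev

HISTORY_BINS = 14 * 24  # 14 days of hourly bins = 336 bins

def hourly_volumes(records):
    """Sum byte counts into hourly bins keyed by the start of each hour.

    `records` is an iterable of (start_time, byte_count) pairs (assumed format).
    """
    volumes = Counter()
    for start_time, byte_count in records:
        hour_start = start_time.replace(minute=0, second=0, microsecond=0)
        volumes[hour_start] += byte_count
    return volumes

def score_hour(volumes, current, threshold=3.0):
    """Compare one hourly bin against the 336 hourly bins preceding it.

    Returns (z, flagged), or None when the history has zero variance.
    Hours with no records are treated as zero volume (an assumption).
    """
    history = [
        volumes.get(current - timedelta(hours=i), 0)
        for i in range(1, HISTORY_BINS + 1)
    ]
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return None  # a flat history gives no scale to measure deviation
    z = (volumes.get(current, 0) - mu) / sigma
    return z, abs(z) > threshold

# Example with made-up records: a steady history followed by a burst.
now = datetime(2024, 3, 15, 12)
records = [
    (now - timedelta(hours=i) + timedelta(minutes=7), 1000 + (i % 5) * 10)
    for i in range(1, HISTORY_BINS + 1)
]
records += [(now + timedelta(minutes=1), 5000), (now + timedelta(minutes=30), 5000)]

result = score_hour(hourly_volumes(records), now)
print(result)
```

The absolute value in the comparison flags unusually low volumes as well as unusually high ones. Whether both directions matter depends on the service being hunted.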