Smart Collection and Storage Method for Network Traffic Data

Hi, this is Angela Horneman from the CERT Situational Awareness Analysis team. Recently, Nathan Dell and I were asked to explore ways to improve network traffic data storage by determining what data to store to meet organizational needs. Our research, brainstorming, and discussions led us to create a methodology to help organizations determine what types of traffic to collect and what parts of the collected traffic to keep.

All organizations are different. They have different network architectures; different concerns for potential cyber-attacks; different types of information that they permit to be sent, retrieved, and stored; and different applications that use their network. One thing that is constant is that organizations benefit from knowing what is going on in their networks. This is the purpose of intrusion detection systems, firewall logs, and--what we are talking about today--network traffic capture solutions.

Network traffic capture solutions record data about network communications that traverse different points in a network. This data can be very detailed, capturing a complete copy of every packet. This is commonly called full packet capture, or Pcap, and is useful for determining not only who was communicating but also the purpose of the communication and what was communicated. The data can instead be a summary that simply tells which IP addresses communicated, when, and how much data they exchanged. This is commonly termed network flow. The data can also fall somewhere in between, telling not only who communicated and when but also what application was used, or any number of other augmenting details. This is commonly called metadata but, for technical and semantic reasons, we refer to it as augmented flow.
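To make the three levels of detail concrete, here is a minimal Python sketch of how the same packets could be reduced to flow or augmented flow records. The `Packet` class, its `app` label, and the `to_flows` function are hypothetical names invented for illustration; real flow collectors typically also key on ports and protocol, which this sketch leaves out to stay close to the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Full packet capture level: everything, including the payload."""
    timestamp: float
    src_ip: str
    dst_ip: str
    payload: bytes
    app: str = "unknown"  # hypothetical application label


def to_flows(packets, augmented=False):
    """Summarize packets into one record per (source, destination) pair.

    With augmented=False the record holds only who, when, and how much.
    With augmented=True it also keeps the set of application labels seen.
    """
    flows = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip)
        rec = flows.setdefault(
            key,
            {"start": p.timestamp, "end": p.timestamp, "bytes": 0, "packets": 0},
        )
        rec["start"] = min(rec["start"], p.timestamp)
        rec["end"] = max(rec["end"], p.timestamp)
        rec["bytes"] += len(p.payload)
        rec["packets"] += 1
        if augmented:
            rec.setdefault("apps", set()).add(p.app)
    return flows
```

The payload, which usually dominates storage, is dropped at both summary levels; only its length survives in the byte count.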

From a security perspective, capturing all network communication as full packet capture is the gold standard--it supports the most in-depth study of exploits, both those that originate outside the organization and those that result from a malicious insider. Unfortunately, full packet capture collects a lot of data. It is not uncommon for multiple terabytes or even petabytes of data to pass through the networks of larger organizations in one day. Storing this data is costly and, in some cases, technically infeasible. Many organizations have trouble storing more than a day or two of data. This means that when successful attacks are not discovered immediately, which is almost always the case, the network traffic data that could be used to investigate the attack has likely been overwritten.
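The storage problem comes down to simple arithmetic: the retention window is the available storage divided by the volume written per day. The sketch below shows that calculation. The function name, the `kept_fraction` parameter, and all figures in the usage note are hypothetical and are not measurements from any particular network.

```python
def retention_days(storage_tb, daily_volume_tb, kept_fraction=1.0):
    """Estimate how many days of traffic data fit in the available storage.

    storage_tb      -- capacity set aside for traffic data, in terabytes
    daily_volume_tb -- traffic crossing the capture point per day, in terabytes
    kept_fraction   -- share of that volume actually stored (1.0 means
                       everything is kept as full packet capture)
    """
    if storage_tb < 0:
        raise ValueError("storage_tb must not be negative")
    if daily_volume_tb <= 0 or not 0 < kept_fraction <= 1:
        raise ValueError("daily volume must be positive and kept_fraction in (0, 1]")
    return storage_tb / (daily_volume_tb * kept_fraction)
```

With hypothetical numbers, `retention_days(20, 10)` gives a two-day window, while storing only a quarter of the volume, `retention_days(20, 10, kept_fraction=0.25)`, stretches the same storage to eight days.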

To help organizations move to a network traffic capture program that provides the information they need for a useful time period, we have released Smart Collection and Storage Method for Network Traffic Data. This report provides a methodology that walks organizations through the process of evaluating what network traffic packets they need, what information from those packets is most useful, and how long they need to keep each level of detail.
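One way to picture the outcome of such an evaluation is a tiered retention policy, in which each level of detail is kept for a different length of time. The following is only an illustrative sketch and is not taken from the report: the tier names follow the terms used in this post, but the retention periods and the `available_detail` function are hypothetical.

```python
# Hypothetical policy, ordered from most to least detailed.
# The retention periods are placeholders, not recommendations.
RETENTION_POLICY = [
    ("full_packet", 2),       # days
    ("augmented_flow", 30),
    ("flow", 365),
]


def available_detail(age_days, policy=RETENTION_POLICY):
    """Return the detail levels still retained for traffic of a given age,
    in the order the policy lists them (most detailed first)."""
    return [level for level, keep_days in policy if age_days <= keep_days]
```

Under this hypothetical policy, an analyst investigating traffic from ten days ago would call `available_detail(10)` and find that augmented flow and flow records remain, while the full packets have already aged out.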

What do you think? Could this methodology benefit your organization? What do you perceive as the benefits of Pcap over network flow, or vice versa? Are there other capture- or storage-related aspects that you would like to see addressed? Let us know.