Best Practices in Network Traffic Analysis: Three Perspectives
In July of this year, a major overseas shipping company had its U.S. operations disrupted by a ransomware attack, one of the latest attacks to disrupt the daily operation of a major, multinational organization. Computer networks are complex, often tightly coupled systems; operators of such systems need to maintain awareness of the system status or disruptions will occur. In today's operational climate, threats and attacks against network infrastructures have become far too common. On the SEI's CERT Division Situational Awareness team, we work with organizations and large enterprises, many of which analyze their network traffic data for ongoing status, attacks, or potential attacks. Through this work we have observed both challenges and best practices as network traffic analysts examine incoming contacts to the network, including packets, traces, or flows. In this post, the latest in our series highlighting best practices in network security, we present common questions that we have encountered about network traffic analysis, along with our answers.
How critical is the role of the network traffic analyst in an organization's security operations center (SOC)?
Angela: A network traffic analyst looks at communications between devices. In a security context, they do it to detect threats, such as undetected malware infections, data exfiltration, denial of service (DoS) attempts, unauthorized device access, etc. Network traffic analysis is one part of security analysis; it provides insight into communications between technological assets, including how they are being used and how they can go wrong. The more assets talk to each other, the more important network analysis becomes. Some types of misuse (e.g., DoS, malware signaling) are much easier to find by looking at communications than by looking at events captured on a host (e.g., login attempts, virus detections).
Not all organizations need a full-time network analyst. Smaller organizations may have a security team where everyone handles all aspects of security. The larger or more complex an organization's network becomes, however, the more important it is to have one or more people whose primary responsibility is to protect from, detect, and respond to network-involving events.
There are two views applied to network analysis that are not mutually exclusive: analysis as a set of rules, playbook, or scripted (automated) workflow versus analysis as a hunting, awareness, or exploratory process. Both views are needed. The first view is appropriate for handling common threats--spam carrying malicious attachments, virus detections, etc. The second view is needed to handle activity that is harder to detect, such as advanced persistent threats, data exfiltration, etc. The first view is a rote activity. The second view is very creative.
Tim: The network traffic analyst is the one who watches what is happening on the network as opposed to on its hosts. The network traffic analysts tend to look at the wide scope of the activity, as opposed to specific changes on hosts.
Despite that wide scope, network analysts work in partnership with people who do host forensics and examine what is happening on the computers on the network, as opposed to what is happening on the network itself. The analysts provide an unbiased look at the information moving across the network, whether malicious or not. This unbiased view also lets analysts operate in partnership with network traffic engineers, who examine whether things are happening that are supposed to be happening. Traffic analysts work on one side or the other: Are things being blocked that are supposed to be blocked, or are things happening that are supposed to be happening?
Timur: The analyst is the one who understands how things work on the network, and when they aren't working, why they aren't working. This analysis includes understanding at a deep level how things on the network work together. Analysts also look at the utilization of the network between different devices, to determine if there is enough capacity to let the applications run with optimal performance. Analysts monitor what applications run on the network, and how the applications are communicating with each other. The environment that allows attackers to impact networks is often unknown. Network traffic analysis supports network situational awareness in understanding the baseline of the environment being defended.
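As a rough illustration of the utilization and baseline ideas above, the following Python sketch computes link utilization from a byte count and flags a measurement that departs from recent history. The function names, the sample values, and the three-standard-deviation threshold are assumptions made for this example, not a method prescribed in this post.

```python
from statistics import mean, pstdev


def utilization(bytes_transferred, seconds, link_bits_per_second):
    """Fraction of link capacity used over an observation window."""
    return (bytes_transferred * 8) / (seconds * link_bits_per_second)


def deviates_from_baseline(history, current, threshold=3.0):
    """Return True when `current` lies more than `threshold` standard
    deviations from the mean of `history` (an assumed, simple heuristic)."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical utilization samples for one link between two devices.
    history = [0.10, 0.12, 0.11, 0.09, 0.10]
    current = utilization(625_000_000, 10, 1_000_000_000)
    print(current, deviates_from_baseline(history, current))
```

A real baseline would likely account for time of day, day of week, and per-application behavior; this sketch only shows the basic comparison of current activity against what is normal for the environment.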
What are some of the challenges that network traffic analysts face?
Timur: Although networking is about communications, defending the network is not just about keeping the lights blinking; it is about understanding the mission of the components on the network. Network traffic analysts must work with application owners to make sure that the dependencies are understood and not impacting other parts of the organization. Only by understanding the needs of the enterprise can analysts effectively support efforts in its defense.
Tim: The first challenge that I see is dealing with the myriad of data that is available. Network traffic analysts must review log entries, packet captures, firewall or intrusion detection system (IDS) alerts, logs on affected systems, plus routing information or passive domain name system (pDNS) resolution records. To gain a better understanding of network status or malicious activity on the network, a network traffic analyst must understand the role that each of these plays in completing a picture of the activity on the network.
The malicious activity could be a security event. The network status could involve answering questions such as
- (If we were proposing to make a change) What would this change do to affected assets?
- (If a particular computer were overloaded) Why did the overload occur, and what impacts might occur if we relieved it?
In each of these cases, the analyst would be integrating a variety of network information to build a consistent picture from the network traffic. The analyst would also be looking to defend the conclusions inherent in that picture.
Angela: I think that one of the biggest challenges faced by network analysts is a lack of clear missions and priorities. This problem can arise when organizational technology and cyber usage policies are missing or lack specific detail. For example, a policy may state that devices must be kept up to date but not include any timeline, such as requiring patches to be applied within one week of availability. When it is unclear what an organization allows, it is hard to figure out what constitutes a security event. When an organization's policies are permissive (e.g., the organization is unwilling to block access to some websites or allows peer-to-peer file sharing), it is hard to find threats in all the noise that users generate.
Unclear missions and priorities can also arise from poorly defined analysis processes, analyst roles and authority, and tools and data available for analysts. In other words, security operations center (SOC) management and resources can make it hard for analysts to focus on threats that would have high impact. For example, if the analysis tools, data, and/or processes only exist for virus detection or suspicious URLs sent in emails, but the organization has important intellectual property, protecting that intellectual property will take a backseat to dealing with filtering email and cleaning up potential virus infections.
In addition, unclear priorities can arise from an analyst's lack of understanding of existing assets, how they are used, and work products of various organizational departments (e.g., the work product of a finance department may be payroll).
What best practices in network traffic analysis have you observed?
- Have security policies that define what is allowed on the network and who is allowed to access what systems. This practice is context-dependent and thus varies from organization to organization. An example of a security policy that meets this requirement would be the following: File transfer protocols (FTP, SSH, etc.) are not permitted to external endpoints, with the following exceptions: web developers may upload to our hosting provider, and accounting personnel may upload and download documents from our payroll vendor.
- Know your assets. Know what devices, people, policies, vendor integrations, etc. exist in the organization.
- Know your security appliances. These are dedicated hardware devices that monitor your network and regulate its traffic. This knowledge includes understanding security appliance locations/vantage points (e.g., what the appliances can see) and configurations (e.g., what they detect/block).
- Be smart about sensor placement. A sensor is a security appliance that monitors network traffic and generates data describing that traffic. Sensors at the border only provide insight into what is coming into or, probably more importantly, leaving the network. Sensors between business units or locations give more insight, but may still not show you adversary movements (e.g., compromises) within the network (also known as "lateral movement").
- Give analysts well-defined missions. If analysts don't know what they are expected or authorized to do and do not understand the priority rankings of assets, services, or business functions, they will make it up as they go. In the absence of a well-defined mission, analysts will rely on previous experience, which means that their efforts may not adequately support organizational goals and needs.
For example, "find bad stuff" is a poorly defined mission statement that we have encountered through our work. In response to this mission statement, analysts tend to focus on easy-to-detect threats, e.g., reports of malicious email, much of which turns out to be spam that was already detected and mitigated by security appliances. As a result, analysts spend less time searching for high-impact threats, such as data exfiltration.
Ignoring network traffic often means that attacks that might have been easily remediated go undetected. After an attack is eventually discovered, remediation may take longer or be incomplete, and network traffic analysts may have a hard time determining the root cause.
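To make the first practice above concrete, the example file-transfer policy could be written down as a check that analysts or tools apply to observed traffic. The following Python sketch is one possible encoding; the port list, group names, host names, and the `Flow` record layout are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumed ports for the protocols named in the policy:
# 21 = FTP control, 22 = SSH (which also carries SFTP and SCP).
FILE_TRANSFER_PORTS = {21, 22}

# Hypothetical (user group, external destination) pairs that the
# policy explicitly permits.
ALLOWED_EXCEPTIONS = {
    ("web_developers", "hosting-provider.example"),
    ("accounting", "payroll-vendor.example"),
}


@dataclass(frozen=True)
class Flow:
    """A simplified, hypothetical summary of one observed connection."""
    user_group: str
    dest_host: str
    dest_port: int
    dest_is_external: bool


def violates_policy(flow: Flow) -> bool:
    """True when the flow is an external file transfer with no exception."""
    if not flow.dest_is_external:
        return False  # the policy only restricts external endpoints
    if flow.dest_port not in FILE_TRANSFER_PORTS:
        return False  # not a file transfer protocol covered by the policy
    return (flow.user_group, flow.dest_host) not in ALLOWED_EXCEPTIONS


if __name__ == "__main__":
    observed = [
        Flow("web_developers", "hosting-provider.example", 22, True),
        Flow("sales", "files.example", 21, True),
    ]
    for flow in observed:
        print(flow.dest_host, "violation" if violates_policy(flow) else "ok")
```

The point of the sketch is that a policy with explicit exceptions can be tested against traffic, whereas a vague policy cannot. Identifying protocols by port alone is a simplification.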
Timur: Avoid getting too vendor-driven. While tools and building skills with tools are important, analysts need to keep the perspective that the function is important rather than buying into the mindset that their job is to use specific tools.
Tim: Here are some effective best practices that I have observed:
- Work to isolate traffic to the timeframe of the activity, and focus on those data that provide relevant details. For example, if you are concerned about a web server becoming overloaded, you might exclude file transfer protocol (FTP) activity from your analysis.
- Work to identify whether an activity supports or challenges conclusions about the activity. For a web server overload, are there attempted contacts that never complete? Is the response time increasing for attempted connections? Is this happening because of increased numbers of contacts (e.g., a flash crowd or a denial-of-service attempt)? Is this happening because of equipment failure or upgrade activity (e.g., reflected in shifts in the IP addresses used by a load-balancing web site)? Analysts need to consider many possible options and then see which ones the data most clearly support.
- Finally, consider how to clearly present the conclusions--in graphs, in tables, and in prose descriptions using terminology relevant to the audience.
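The first two practices in this list can be sketched in a few lines of Python: isolate records to the timeframe and services of interest, then summarize the evidence that bears on the overload questions. The `FlowRecord` fields (including the `completed` flag and `response_ms` measurement), the port set, and the function names are assumptions for illustration; real flow or packet data would need to be mapped onto something similar.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical summary of one contact with the web server."""
    start: datetime
    src_ip: str
    dest_port: int
    completed: bool      # assumed flag: did the connection complete?
    response_ms: float   # assumed measurement of server response time


WEB_PORTS = {80, 443}  # HTTP and HTTPS; FTP (port 21) is excluded


def isolate(records, begin, end, ports=WEB_PORTS):
    """Keep records that start in [begin, end) and target the given ports."""
    return [r for r in records
            if begin <= r.start < end and r.dest_port in ports]


def summarize(records):
    """Collect simple evidence for or against an overload hypothesis."""
    total = len(records)
    incomplete = sum(1 for r in records if not r.completed)
    times = [r.response_ms for r in records if r.completed]
    return {
        "contacts": total,
        "distinct_sources": len({r.src_ip for r in records}),
        "incomplete_fraction": incomplete / total if total else 0.0,
        "mean_response_ms": sum(times) / len(times) if times else None,
    }


if __name__ == "__main__":
    records = [
        FlowRecord(datetime(2017, 7, 1, 12, 0), "192.0.2.1", 443, True, 120.0),
        FlowRecord(datetime(2017, 7, 1, 12, 5), "192.0.2.2", 80, False, 0.0),
        FlowRecord(datetime(2017, 7, 1, 12, 6), "192.0.2.3", 21, True, 40.0),
    ]
    window = isolate(records,
                     datetime(2017, 7, 1, 12, 0),
                     datetime(2017, 7, 1, 13, 0))
    print(summarize(window))
```

Comparing these summaries across adjacent time windows is one way to see whether the data support a flash crowd, a denial-of-service attempt, or an equipment problem.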
How do you see the role of network traffic analysts evolving? What challenges do you see network traffic analysts facing in the next five years?
Tim: Network traffic analysis has historically been an ad hoc activity, requiring high expertise and intense effort. We are going to see more regularization of analysis, based on formalisms that are being developed now. These improvements will allow more clarity and traceability in the analysis process, which are often lacking in common practice. They will also simplify management of analysis activity and make a stronger association between data and conclusions suitable for particular network issues.
Timur: Networks are constantly evolving and the demands on resources are increasing at a steady pace. What network traffic analysts used to manage is no longer as simple. The increased use of encryption decreases visibility into network traffic, and the volume of enterprise applications now outsourced to vendors and platforms "in the cloud" makes defending the enterprise more challenging. What security controls that exist within the enterprise are not replicated in those externally sourced solutions? How does an analyst defend an application on a cloud service provider that uses a multi-tenant architecture, has an oversubscription model, and is encountering resource contention because of issues with a different tenant?
Angela: An upcoming challenge I see arises as organizations acquire new products, many of which are beginning to incorporate machine learning (ML) and artificial intelligence (AI). Analysts will need to understand how to validate the results these products produce and adjust their workflows accordingly.
What resources are out there for network traffic analysts?
As outlined in a previous blog post, there are a number of resources available to network analysts and security defenders as they contend with rapid-fire increases in global internet protocol traffic:
- CERT's 2019 FloCon conference provides a forum for exploring large-scale, next-generation data analytics in support of security operations. FloCon is geared toward operational analysts, tool developers, researchers, and security professionals interested in analyzing and visualizing large data sets to protect and defend network systems.
- The Network Situational Awareness (NetSA) group at CERT has developed SiLK, the System for Internet-Level Knowledge, a collection of traffic analysis tools to facilitate security analysis of large networks. The SiLK tool suite supports the efficient collection, storage, and analysis of network flow data, enabling network security analysts to rapidly query large historical traffic data sets.
- CERT researchers have also published a series of case studies that are available as technical reports. In particular, the report Network Profiling Using Flow provides a step-by-step guide for profiling and discovering public-facing assets on a network using network flow data.
We welcome your feedback about this work in the comments section below.
Learn more about FloCon.
Read Tim Shimeall's blog post, Traffic Analysis for Network Security: Two Approaches for Going Beyond Network Flow Data.
Read other blog posts in the ongoing series from CERT researchers, Best Practices in Network Security.