TL;DR: This document specifies the data export format for version 9 of Cisco Systems' NetFlow services, for use by implementations on the network elements and/or matching collector programs.
Abstract: This document specifies the data export format for version 9 of Cisco
Systems' NetFlow services, for use by implementations on the
network elements and/or matching collector programs. The version 9
export format uses templates to provide access to observations of IP
packet flows in a flexible and extensible manner. A template defines a
collection of fields, with corresponding descriptions of structure and
semantics. This memo provides information for the Internet community.
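As an illustration of the template idea in the abstract above, the following is a minimal sketch of decoding a version 9 Template FlowSet into a mapping from template ID to its list of (field type, field length) pairs. The function name `parse_template_flowset` and the returned dictionary layout are assumptions made for this example, not part of the specification; the sketch assumes a well-formed FlowSet whose first four bytes are the FlowSet ID (0 for templates) and the total length.

```python
import struct


def parse_template_flowset(flowset: bytes) -> dict:
    """Decode one Template FlowSet into {template_id: [(field_type, field_length), ...]}.

    Hypothetical helper for illustration; assumes well-formed input.
    """
    # FlowSet header: 16-bit FlowSet ID and 16-bit total length, network byte order.
    flowset_id, length = struct.unpack_from("!HH", flowset, 0)
    if flowset_id != 0:
        raise ValueError("not a Template FlowSet")

    templates = {}
    offset = 4
    # Each template record starts with a 16-bit template ID and a 16-bit field count.
    while offset + 4 <= length:
        template_id, field_count = struct.unpack_from("!HH", flowset, offset)
        offset += 4
        fields = []
        for _ in range(field_count):
            # Each field is described by a (type, length) pair of 16-bit integers.
            fields.append(struct.unpack_from("!HH", flowset, offset))
            offset += 4
        templates[template_id] = fields
    return templates


# Example: template 256 with two 4-byte fields (types 8 and 12).
example = struct.pack("!HHHH", 0, 16, 256, 2) + struct.pack("!HHHH", 8, 4, 12, 4)
templates = parse_template_flowset(example)
```

A collector would keep such a mapping and use it to interpret later data records that refer to the same template ID.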
TL;DR: This paper describes two novel and scalable schemes for measuring only "heavy" flows whose traffic is above some threshold such as 1% of the link, and proposes a new form of accounting called threshold accounting, which generalizes the familiar notions of usage-based and duration-based pricing.
Abstract: Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping a counter for each flow is too expensive (using SRAM) or slow (using DRAM). The current state-of-the-art methods (Cisco's sampled NetFlow), which log periodically sampled packets, are slow, inaccurate and resource-intensive. Previous work showed that at different granularities a small number of "heavy hitters" accounts for a large share of traffic. Our paper introduces a paradigm shift for measurement by concentrating only on large flows, those above some threshold such as 0.1% of the link capacity. We propose two novel and scalable algorithms for identifying the large flows: sample and hold and multistage filters, which take a constant number of memory references per packet and use a small amount of memory. If $M$ is the available memory, we show analytically that the errors of our new algorithms are proportional to $1/M$; by contrast, the error of an algorithm based on classical sampling is proportional to $1/\sqrt{M}$, thus providing much less accuracy for the same amount of memory. We also describe optimizations such as early removal and conservative update that further improve the accuracy of our algorithms, as measured on real traffic traces, by an order of magnitude. Our schemes allow a new form of accounting called threshold accounting in which only flows above a threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes usage-based and duration-based pricing.
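The sample and hold idea named in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes that each byte of a packet from an untracked flow is sampled with a small probability, and that once a flow has an entry every later packet of that flow is counted in full. The names `sample_and_hold`, `byte_sampling_prob` and the `(flow_id, size)` packet representation are assumptions made for this example.

```python
import random


def sample_and_hold(packets, byte_sampling_prob, rng=None):
    """Return {flow_id: bytes counted} for flows that were sampled at least once.

    Hypothetical sketch: `packets` is an iterable of (flow_id, size_in_bytes).
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    flow_memory = {}
    for flow_id, size in packets:
        if flow_id in flow_memory:
            # "Hold": a tracked flow has every subsequent packet counted.
            flow_memory[flow_id] += size
        else:
            # "Sample": probability that at least one of the packet's bytes is sampled.
            p_packet = 1.0 - (1.0 - byte_sampling_prob) ** size
            if rng.random() < p_packet:
                flow_memory[flow_id] = size
    return flow_memory


packets = [("a", 100), ("b", 40), ("a", 100), ("b", 40)]
estimate = sample_and_hold(packets, byte_sampling_prob=0.01)
```

Because a flow is only counted from the packet at which it is first sampled, the reported byte count never exceeds the flow's true size; large flows are likely to be caught early, so their undercount tends to be small.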
TL;DR: Two novel and scalable algorithms for identifying the large flows are proposed: sample and hold and multistage filters, which take a constant number of memory references per packet and use a small amount of memory, and a new form of accounting called threshold accounting in which only flows above a threshold are charged by usage while the rest are charged a fixed fee.
Abstract: Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping a counter for each flow is too expensive (using SRAM) or slow (using DRAM). The current state-of-the-art methods (Cisco's sampled NetFlow), which count periodically sampled packets, are slow, inaccurate and resource-intensive. Previous work showed that at different granularities a small number of "heavy hitters" accounts for a large share of traffic. Our paper introduces a paradigm shift by concentrating the measurement process on large flows only, those above some threshold such as 0.1% of the link capacity. We propose two novel and scalable algorithms for identifying the large flows: sample and hold and multistage filters, which take a constant number of memory references per packet and use a small amount of memory. If M is the available memory, we show analytically that the errors of our new algorithms are proportional to 1/M; by contrast, the error of an algorithm based on classical sampling is proportional to 1/√M, thus providing much less accuracy for the same amount of memory. We also describe optimizations such as early removal and conservative update that further improve the accuracy of our algorithms, as measured on real traffic traces, by an order of magnitude. Our schemes allow a new form of accounting called threshold accounting in which only flows above a threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes usage-based and duration-based pricing.
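The multistage filter named in the abstract above can be sketched along the following lines. This is a simplified illustration under stated assumptions, not the authors' implementation: each stage is an array of byte counters indexed by an independent hash of the flow ID, and a flow is admitted to flow memory only when its counter in every stage has reached the threshold. The names `multistage_filter` and `_bucket`, the use of salted BLAKE2b as the hash family, the default sizes, and the choice to bypass the filter for flows already in flow memory are all assumptions made for this example.

```python
import hashlib


def _bucket(flow_id: str, stage: int, buckets: int) -> int:
    """Hypothetical per-stage hash: BLAKE2b salted with the stage number."""
    digest = hashlib.blake2b(
        flow_id.encode(), digest_size=8, salt=stage.to_bytes(2, "big")
    ).digest()
    return int.from_bytes(digest, "big") % buckets


def multistage_filter(packets, threshold, stages=4, buckets=1024):
    """Return {flow_id: bytes counted since admission} for flows that passed every stage."""
    counters = [[0] * buckets for _ in range(stages)]
    flow_memory = {}
    for flow_id, size in packets:
        if flow_id in flow_memory:
            # Assumption: tracked flows are counted directly and skip the filter.
            flow_memory[flow_id] += size
            continue
        slots = [_bucket(flow_id, s, buckets) for s in range(stages)]
        for s, slot in enumerate(slots):
            counters[s][slot] += size
        # Admit the flow only if all of its stage counters reached the threshold.
        if all(counters[s][slot] >= threshold for s, slot in enumerate(slots)):
            flow_memory[flow_id] = size
    return flow_memory


mixed = [("heavy", 100)] * 10 + [(f"small-{i}", 40) for i in range(20)]
large_flows = multistage_filter(mixed, threshold=500)
```

A flow that sends at least `threshold` bytes always passes, since its own traffic alone raises each of its counters to the threshold. A small flow can pass only if it shares a heavily loaded counter in every stage, which becomes less likely as stages are added.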
TL;DR: This paper presents a new publicly available dataset from GÉANT, the European Research and Educational Network, which consists of traffic matrices built using full IGP routing information, sampled NetFlow data and BGP routing data of the GÉANT network, one per 15-minute interval for several months.
Abstract: This paper presents a new publicly available dataset from GEANT, the European Research and Educational Network. This dataset consists of traffic matrices built using full IGP routing information, sampled NetFlow data and BGP routing information of the GEANT network, one per 15-minute interval for several months. Potential benefits of publicly available traffic matrices include improving our understanding of real traffic matrices and their dynamics, and making possible the benchmarking of intradomain traffic engineering methods.
TL;DR: This paper is the first of its kind to provide an integrated tutorial on all stages of a flow monitoring setup, and shows, for example, how the previously opposing approaches of deep packet inspection and flow monitoring have been united into novel monitoring approaches.
Abstract: Flow monitoring has become a prevalent method for monitoring traffic in high-speed networks. By focusing on the analysis of flows, rather than individual packets, it is often said to be more scalable than traditional packet-based traffic analysis. Flow monitoring embraces the complete chain of packet observation, flow export using protocols such as NetFlow and IPFIX, data collection, and data analysis. In contrast to what is often assumed, all stages of flow monitoring are closely intertwined. Each of these stages therefore has to be thoroughly understood, before being able to perform sound flow measurements. Otherwise, flow data artifacts and data loss can be the consequence, potentially without being observed. This paper is the first of its kind to provide an integrated tutorial on all stages of a flow monitoring setup. As shown throughout this paper, flow monitoring has evolved from the early 1990s into a powerful tool, and additional functionality will certainly be added in the future. We show, for example, how the previously opposing approaches of deep packet inspection and flow monitoring have been united into novel monitoring approaches.