Background: Know: RMON, traffic matrices; Prereq: Wireshark, TCP, HTTP; Recognize: aggregation
Why consider flows?
Network operators today often rely on external means to obtain performance statistics. For instance, ISPs routinely monitor their networks with measurement boxes that inject “active probes” between routers. From the path properties observed by these probes, operators typically use inference algorithms to isolate the location of a problem. A major limitation of this approach is that the probes provide only aggregate statistics, which are generally not sufficient to diagnose and debug flow-specific network problems. For example, a flow belonging to a given customer’s application may have passed through a router exactly when a busy period started in the router’s queue, so that this particular flow experienced high latency (which in turn reduced its throughput). However, when packet delays are averaged over an interval, the forwarding path within the router may appear to be functioning normally. Thus, while ensemble averages may be sufficient for detecting generic SLA violations, they may not be enough for diagnosing violations on a per-customer basis.
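The masking effect described above can be illustrated with a minimal sketch. The flow names and delay values are made up for illustration; they are not measurements from any real router.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-packet delays at one router during one interval:
# (flow_id, delay_ms). Flow "C" is assumed to have crossed the router
# just as a queue busy period started.
samples = (
    [("A", 1.0)] * 50
    + [("B", 1.0)] * 45
    + [("C", 20.0)] * 5
)

def aggregate_mean(samples):
    """Mean delay over all packets in the interval, ignoring flow identity."""
    return mean(delay for _, delay in samples)

def per_flow_mean(samples):
    """Mean delay computed separately for each flow."""
    by_flow = defaultdict(list)
    for flow_id, delay in samples:
        by_flow[flow_id].append(delay)
    return {flow_id: mean(delays) for flow_id, delays in by_flow.items()}

overall = aggregate_mean(samples)   # looks healthy when viewed alone
flows = per_flow_mean(samples)      # exposes the one affected flow

print(f"aggregate mean delay: {overall:.2f} ms")
for flow_id, delay in sorted(flows.items()):
    print(f"flow {flow_id}: {delay:.2f} ms")
```

The aggregate mean stays below 2 ms even though every packet of flow C saw 20 ms, which is why an interval average can hide a per-customer problem.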
Methods used for per-flow analysis
Because we want to analyze per-flow statistics instead of aggregate statistics, sampling is needed. Sampling is a traffic-analysis method in which 1 in every N packets passing through a device interface is sent to an analyzer tool (a sampling rate of 1 in N). Based on the information in this one packet, the traffic pattern for the rest of the packets in the sample group is reconstructed. Two technologies are commonly used for this:
- NetFlow: Developed by Cisco Systems, NetFlow aggregates each conversation between hosts (i.e., a flow), potentially comprising thousands of packets, into a single entry, and a single NetFlow v5 export packet carries up to 30 such entries. In other words, a single NetFlow packet can represent tens of thousands of packets between over two dozen hosts. However, most of the packet data is lost in the aggregation: only the source and destination IP addresses, protocol, type of service (QoS), autonomous system numbers, and a few other fields are saved. NetFlow v5, which accounts for over 80% of the market, discards the rest of the packet. NetFlow v9 can save the first 1200 bytes of the packet; however, few if any collectors can report on that data intuitively.
- sFlow: Developed by InMon, sFlow is a packet-sampling technology in which the switch captures every 100th packet (configurable) per interface and sends it to the collector. The sFlow specification does not preclude "sampling" every packet, which corresponds to a sampling rate of 1 in 1; it is up to the specific chip vendor and sFlow implementation to limit the maximum sampling frequency. Because of sFlow's sampling nature, accurate readings of traffic volume per host are nearly impossible without complicated algorithms that attempt to estimate the actual byte volume of each conversation. Unlike NetFlow, which is normally software based, sFlow requires support in the switch chip. The sFlow.org consortium includes most of the leading network equipment and network traffic analysis vendors, who have contributed to the specification of the standard.
- Comparison between NetFlow and sFlow
- sFlow can work on both Layer 2 and Layer 3 interfaces and, unlike NetFlow, does not need Layer 3 routing or a next hop. sFlow capture can therefore be done on Layer 2 interfaces, covering almost all of the network's traffic. As for the type of traffic captured, NetFlow can capture only IP-based traffic information, not non-IP protocol traffic such as IPX, AppleTalk, or XNS. If a network runs many non-IP protocols, only sFlow is capable of capturing those packets. Accuracy should also be considered. Because sFlow is sampling based, it may miss some of the traffic. This can happen when packets belonging to a large conversation are not sampled, so that a large volume of network data goes unaccounted for, or when the network carries many small conversations whose packets never fall into the sample. In the long run, some of the actual top talkers may therefore be missing from the reports. Here NetFlow has the advantage, as it captures 100% of IP traffic.
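The accuracy trade-off between the two approaches can be sketched as follows. This is a simplified illustration under stated assumptions, not either protocol's actual implementation: packets are hypothetical `(src, dst, size)` tuples, flows are keyed only on the address pair (real NetFlow keys on more fields), and sampling is deterministic every n-th packet, whereas real sFlow agents typically randomize the sampling interval.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """NetFlow-style accounting: every packet updates its flow's counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, size in packets:
        record = flows[(src, dst)]
        record["packets"] += 1
        record["bytes"] += size
    return dict(flows)

def sample_and_estimate(packets, n):
    """sFlow-style accounting (simplified): keep every n-th packet and
    scale its size by n to estimate each conversation's byte volume."""
    estimate = defaultdict(int)
    for index, (src, dst, size) in enumerate(packets, start=1):
        if index % n == 0:
            estimate[(src, dst)] += size * n
    return dict(estimate)

# Hypothetical traffic: one large conversation and one short one.
BIG = ("10.0.0.1", "10.0.0.2")
SMALL = ("10.0.0.3", "10.0.0.4")
packets = (
    [(*BIG, 1000)] * 12      # packets 1-12
    + [(*SMALL, 100)] * 5    # packets 13-17, between two sampling points
    + [(*BIG, 1000)] * 183   # packets 18-200
)

exact = aggregate_flows(packets)
estimate = sample_and_estimate(packets, n=10)

print("exact bytes, large flow:    ", exact[BIG]["bytes"])
print("estimated bytes, large flow:", estimate[BIG])
print("small flow seen by sampling:", SMALL in estimate)
```

In this run the large conversation is estimated reasonably well, while the short conversation falls entirely between sampling points and never appears in the sampled view, which is the kind of loss the comparison above describes.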