Monitoring modern network links is not a trivial task. With commercial IP services now running at 40Gbps, and 10Gbps becoming commonplace, the computational demands of "keeping up" can be high. For Argus to perform well, it needs good computational resources. This section presents what we know about hardware and configurations that should get you going.
Let's start at the link being monitored and talk about getting access to the data on the wire, preferably without impacting that data :-). The options are (more or less from cheapest/worst to best/most expensive):
Hubs, by definition, only work on half-duplex links, are hard to find these days (switches, which are much more common, won't work for this), and are limited to 10/100. A hub works fine on an ADSL line, and at low link speeds it can be used to tap the argus sensor to archive host data flow for troubleshooting (a copper tap will do this better, but then you need an fdx sniffer).
The major advantage of span ports is that most network switches support them, so they may be present for "free" (for some value of free) and may be easier to arrange than inserting a network tap (which will cause a network outage).
As noted above I dislike this solution. It has a number of major flaws: it operates in the production network switch and thus can affect the production network traffic (which the solutions below can't), and unless the destination port is one speed grade above the monitored port (e.g. a 10 meg monitored link into a 100 meg monitor port), the link utilization must stay below 50% to avoid packet loss, because both the transmit and receive data are sent down the span port's transmit line (i.e. potentially 200 megabits per second into a 100 megabit per second port). This will at best result in 50% packet loss on a busy or bursting link, and at worst run the switch out of memory buffers and affect the production network (been there, done that, speaking from painful experience :-). The arithmetic is sketched below.
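To make that oversubscription arithmetic concrete, here is a back-of-the-envelope sketch in C (the numbers are illustrative, not from any switch spec):

    #include <stdio.h>

    /* Back-of-the-envelope span port oversubscription check.
     * A span port must carry rx + tx of the monitored link, so a
     * full-duplex link can offer up to 2x its rate to the span port. */
    int main(void)
    {
        double link_mbps = 100.0;   /* monitored full-duplex link             */
        double span_mbps = 100.0;   /* span (monitor) port capacity           */
        double util      = 1.0;     /* fraction of link in use, each direction */

        double offered = 2.0 * link_mbps * util;   /* rx + tx combined */
        double loss = offered > span_mbps
                    ? (offered - span_mbps) / offered : 0.0;

        printf("offered %.0f Mbps into a %.0f Mbps span port -> %.0f%% loss (best case)\n",
               offered, span_mbps, loss * 100.0);
        return 0;
    }

At full utilization in both directions this prints a best-case 50% loss, which is exactly the busy-link case described above.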
The right answer (if you can't afford one of the options below, of course). These devices, made by lots of people (I prefer Netoptics from long experience with them), are a set of (typically 4) ethernet PHY chips that terminate both sides of the monitored connection and pass that data from one link port to the other. In addition, the receive side of the monitored connection goes to the 3rd PHY chip and the transmit side goes to the 4th PHY chip (receive data on these two monitor ports is discarded, other than for autonegotiation, so the monitor device can't affect the monitored link). There is a variation on this from Netoptics called a regen tap, which increases the number of monitor PHYs (in pairs) from 2 up to 16 monitor ports. This allows multiple devices (argus sensors, sniffers, IDS systems) to get the same line data at the same time. Very useful, but a lot more money than a single port tap. Some taps (Finisar, and those OEMed by Fluke, for one) have a relay that bypasses the tap function on the monitored link so a power failure on the tap doesn't take down the production link. Some (Netoptics gig taps, for instance) don't. A risk to be aware of. If this is an issue, look at passive optical taps below.
If your link is optical then you have the option of passive optical taps (my favorite device for this purpose!). Since they are passive, they are not a threat to the production network (other than being physically damaged, the same as the fibre). Unfortunately nothing is free. Being passive, they operate by stealing some amount of optical power from the monitored link (they are available in split ratios from 90/10 to 50/50). That has several consequences. First, it erodes your optical power budget on the monitored link (by 10% for a 90/10 tap up to 50% for a 50/50), so you need to take this into account and make sure you have sufficient optical power for the link to work, and that the 90% reduction out the monitor port (for a 90/10 tap) still leaves enough signal for your analyser! I tend to like 80/20 taps and have never had a problem with the monitor ports on those taps. Always check the power levels with an optical power meter and make sure the monitor ports are above the minimum level for the NIC card, as otherwise you can get unreported packet loss here!
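For those who think in dB rather than percentages, here is a rough conversion sketch (idealized: real taps add connector and excess loss on top of the pure split, which is why you still need the power meter):

    #include <stdio.h>
    #include <math.h>

    /* Convert an optical tap split ratio to approximate insertion loss.
     * Real taps add a little extra (connector/excess) loss on top of
     * this, so always verify with an optical power meter. */
    static double ratio_to_db(double fraction) { return -10.0 * log10(fraction); }

    int main(void)
    {
        const double splits[][2] = { {0.90, 0.10}, {0.80, 0.20}, {0.50, 0.50} };
        for (int i = 0; i < 3; i++)
            printf("%2.0f/%2.0f tap: network port ~%.2f dB loss, monitor port ~%.2f dB loss\n",
                   splits[i][0] * 100, splits[i][1] * 100,
                   ratio_to_db(splits[i][0]), ratio_to_db(splits[i][1]));
        return 0;
    }

An 80/20 tap, for example, costs the production link only about 1 dB but delivers the monitor signal down about 7 dB, which is why receiver sensitivity on the NIC matters.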
There is another, perhaps not obvious, limitation here relating to long haul links. Assuming your tap is at one end or the other of such a link (i.e. not in the middle), you will very likely have a problem on the receive side. The receive signal has already suffered the attenuation of the fibre from the far end. While 80% of that power will probably drive the production link just fine, an 80% attenuation of the already attenuated signal likely won't work into the monitor port. The solution is to tap the link on a short local connection without much attenuation (perhaps using a media converter, such as one from MRV, in line before the LR GBIC). Unfortunately at 10 gigs this can get expensive (but everything about 10 gigs is expensive :-)).
While mentioning 10 gigs, here is another interesting point: a 10 gig optical tap will work just fine on a 1 gig link (and is about twice the price of the 1 gig tap). If you are likely to move up to 10 gigs, it may be a good plan to buy 10 gig capable taps for your 1 gig links.
Another limitation: make sure the connections from the monitor ports to the NIC cards are short. The power out the monitor port (unless you are using 50/50 taps) is small, and the attenuation of a long fibre run can reduce the power level below that required by the NIC card and cause invisible packet loss. Measure the output power with an optical power meter and make sure the power level is within spec for the NIC cards in use. If the power is too low, consider a regen tap, which will boost the monitor port power back up to normal levels after the tap (and you can thus use long patch cables).
While Netoptics makes optical regen taps (with up to 16 monitor ports again), if you only need a two port unit, two 80/20 optical taps can do the trick. The two taps connect in series along the monitored link, introducing a 36% power attenuation (80% of 80% leaves 64%). This additional "hit" is significant, so do watch the power budget on all the links! The two monitor ports will end up with 20% and 16% of the power; the arithmetic is sketched below. A real regen tap is a better (but more expensive) bet, as it will regenerate the signals and give you full strength out the monitor ports.
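The series-tap arithmetic above, as a quick sketch:

    #include <stdio.h>

    /* Power fractions for two 80/20 taps in series on one link.
     * Tap 1's monitor port sees 20% of the original light; tap 2 sits
     * behind tap 1's 80% network leg, so its monitor port sees 80% * 20%. */
    int main(void)
    {
        double net = 0.80, mon = 0.20;

        double monitor1 = mon;          /* 0.20 -> 20%                        */
        double monitor2 = net * mon;    /* 0.16 -> 16%                        */
        double through  = net * net;    /* 0.64 -> 64%, i.e. 36% attenuation  */

        printf("monitor port 1: %.0f%%, monitor port 2: %.0f%%, link through-power: %.0f%%\n",
               monitor1 * 100, monitor2 * 100, through * 100);
        return 0;
    }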
Now we are in big boy country :-). These devices can take a 10/40 gig link and distribute the traffic to multiple 1/10 gig ports, or take multiple 1 gig ports and aggregate them into a 10 gig monitor port feeding a 10 gig capture box (the case I have used a Net Director for, feeding an Endace Ninja 10 gig capture appliance). They are powered, and thus a potential threat to the production network if they are connected inline (in our case passive optical taps feed the Net Director, so this isn't a concern). They will partition the traffic according to filter rules (bpf syntax). I believe this technology is going to be the answer for wire speed at 10, 40 and 100 gigabit links. It isn't there yet, but a scheme that associates a 5-tuple or n-tuple (adding VLAN / MPLS tags in the n-tuple case) with an output interface, so that the load can be evenly distributed among a farm of sensor machines each running at a gig or so, is likely the only way to succeed at the higher line rates. A sketch of that idea follows.
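To illustrate the n-tuple idea, here is a minimal sketch of a symmetric 5-tuple hash mapping flows onto a sensor farm (my own illustration, not any vendor's implementation; the XOR folding makes the hash direction-independent so both halves of a flow land on the same sensor):

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch of symmetric 5-tuple load distribution across N sensors.
     * Real aggregation switches do this in hardware; this just shows
     * the mapping idea. */
    struct five_tuple {
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
        uint8_t  proto;
    };

    static unsigned pick_sensor(const struct five_tuple *t, unsigned nsensors)
    {
        /* XORing src with dst gives the same value for both directions */
        uint32_t h = (t->src_ip ^ t->dst_ip)
                   ^ ((uint32_t)(t->src_port ^ t->dst_port) << 16)
                   ^ t->proto;
        h = (h * 2654435761u) >> 16;    /* mix the bits a little */
        return h % nsensors;
    }

    int main(void)
    {
        struct five_tuple fwd = { 0x0a000001, 0x0a000002, 12345, 80, 6 };
        struct five_tuple rev = { 0x0a000002, 0x0a000001, 80, 12345, 6 };
        printf("forward -> sensor %u, reverse -> sensor %u\n",
               pick_sensor(&fwd, 8), pick_sensor(&rev, 8));
        return 0;
    }

Both directions of the example flow hash to the same sensor, which is essential if each sensor is to see complete bidirectional flows.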
Having safely got the data from the monitored link, we now need to get it into the sensor without losing it, and preferably (though perhaps not affordably) with accurate time stamps. The best way to do this is with Endace DAG capture cards. While they are 10 times the cost of an Intel server NIC (at gig), they have pretty much been the gold standard of capture cards for many years. I believe (if someone has access to Endace tech support, since I'm no longer a customer, please ask!) that the Ethernet MAC is implemented in the on-card FPGA, along with the counter that keeps ntp-synced time. That means when the MAC detects sync (i.e. the preamble has synced the clocks and the 1st bit of the destination MAC is being received), it can (and I hope does) record the timestamp as packet arrival. There is an interesting project relating to this, against commodity NIC time stamping, in the research topics section below for someone interested :-). In addition there is a CPU (it used to be an Intel i960) and a 4 megabit SRAM buffer on the card. That means the host CPU has a bit more time than with an Intel NIC (48 Kbyte fifo) to extract the data from the card to main memory before packet loss occurs. Endace also advertises a zero-copy software solution (I expect their version of the pcap library) to the application software. As we will see in a bit, when we look at kernel bpf processing, this is an important advantage at high link speeds.

In the likely event that you can't afford DAG cards, your next best option is an Intel Pro/1000 server NIC (note: server NIC, not workstation NIC). I can attest from experience that a Pro/1000 server NIC in a sufficiently fast machine can do line speed at gig (I don't have a lot of experience at 10 gig; some of the rest of you will need to pipe up there :-)). SysKonnect is another brand that will do wire speed gig. Other brands may do so, but you need to test with card and drivers, as some of them cannot keep up at wire speed. As noted above, the RX fifo in the NIC is 48 Kbytes; that means the host CPU needs to be able to move the packets out of the fifo before it fills. At line rate that isn't very long: 48 Kbytes is around 384,000 bits, or about 384 microseconds of grace at a 1 gig line rate.
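The fifo arithmetic, as a sketch:

    #include <stdio.h>

    /* How long a NIC's receive fifo lasts at line rate, per the text:
     * 48 Kbytes * 8 = 384,000 bits; at 1 Gbit/s that is ~384
     * microseconds of grace before an untouched fifo overflows. */
    int main(void)
    {
        double fifo_bits = 48.0 * 1000 * 8;     /* Intel NIC rx fifo      */
        double rates[]   = { 1e8, 1e9, 1e10 };  /* 100M, 1G, 10G bits/sec */

        for (int i = 0; i < 3; i++)
            printf("%6.0f Mbps line rate: fifo fills in %8.1f microseconds\n",
                   rates[i] / 1e6, fifo_bits / rates[i] * 1e6);
        return 0;
    }

Note how the window shrinks to under 40 microseconds at 10 gig, which is why commodity NICs struggle there.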
If the CPU doesn't drain the fifo before it fills, newly arriving packets will be lost, and the interface statistics should report NIC overrun errors to indicate the loss. If you see large numbers of overrun errors on the NIC(s) monitoring your link, you can be pretty sure you are losing packets.
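Besides the interface overrun counters, libpcap itself keeps drop statistics you can check from the capture application; a minimal sketch ("eth1" is a placeholder for your monitor NIC):

    #include <pcap/pcap.h>
    #include <stdio.h>

    /* Minimal check for capture drops via libpcap.  ps_drop counts
     * packets the kernel dropped because the capture buffer was full;
     * ps_ifdrop (where supported) counts drops by the interface itself. */
    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_open_live("eth1", 65535, 1, 1000, errbuf);
        if (!p) { fprintf(stderr, "pcap_open_live: %s\n", errbuf); return 1; }

        /* ... capture for a while with pcap_dispatch()/pcap_loop() ... */

        struct pcap_stat st;
        if (pcap_stats(p, &st) == 0)
            printf("received %u, dropped %u (kernel), dropped %u (interface)\n",
                   st.ps_recv, st.ps_drop, st.ps_ifdrop);
        pcap_close(p);
        return 0;
    }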
There is also a source of inaccuracy (but not packet loss) when using a commodity NIC card. It revolves around the libpcap packet arrival time stamp. As I noted in the DAG discussion, the correct place to take this stamp is in the NIC MAC layer when sync is detected. On a commodity NIC this doesn't happen. The packet isn't time stamped at all; it is merely grabbed, crc checked, and, if the crc is good, copied into the receive fifo. The libpcap time stamp in this case is inserted by the CPU when it services the interrupt from the NIC card. Typically the ethernet driver reads all the packets that the card has available and uses the current system time as the arrival time of each packet. This is of course inaccurate: the packet actually arrived some time ago (due to interrupt latency), and subsequent packets get a timestamp that is a DMA (packet length dependent) time later than the first one, which isn't indicative of when the packet really arrived.

In addition, in the case of an fdx link being captured by two NICs in the same machine (as alluded to before), it is possible that the receive card is serviced first, so the receive packets get time stamps of n to (n + # rx packets * dma time); then the transmit NIC gets serviced and its packets get timestamped starting at (n + # rx packets * dma time + delay), which is later than any of the receive timestamps. This is the source of the problem where a response packet can have a time stamp before the request packet, which used to confuse argus (argus now treats rx and tx timestamps as possibly offset from each other and deals with it :-)). As well as causing reassembly issues (where we get the ACK before apparently seeing the SYN), this also causes jitter inaccuracy, since the packet time stamps aren't necessarily accurate. A sketch of the kind of skew tolerance involved follows.
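Here is a sketch of the kind of skew tolerance I mean (my own illustration with a made-up 500 microsecond window, not argus's actual code):

    #include <sys/time.h>
    #include <stdio.h>

    /* Compare two packet timestamps from separate rx/tx NICs while
     * allowing for a service-order skew between the cards.  Within the
     * skew window the ordering is treated as unknown rather than
     * trusted.  Illustrative only; the real argus logic is more
     * involved. */
    #define SKEW_USEC 500   /* assumed worst-case inter-card skew */

    /* -1 if a is clearly earlier, 1 if clearly later, 0 if ambiguous */
    static int ts_compare(const struct timeval *a, const struct timeval *b)
    {
        long diff = (a->tv_sec - b->tv_sec) * 1000000L
                  + (a->tv_usec - b->tv_usec);
        if (diff < -SKEW_USEC) return -1;
        if (diff >  SKEW_USEC) return  1;
        return 0;   /* inside the skew window: don't infer ordering */
    }

    int main(void)
    {
        struct timeval syn = { 100, 400 }, ack = { 100, 150 };
        /* The ACK appears 250us "before" the SYN, but that is within
         * the skew window, so we refuse to call it out of order. */
        printf("order: %d (0 == ambiguous)\n", ts_compare(&ack, &syn));
        return 0;
    }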
- Peter Van Epp, Aug 2009