Mismatch between SNMP and packet-based counters
We recently encountered an interesting problem with a leading cloud-based financial services provider that uses Trisul for deep monitoring of application and bandwidth usage. We hope this helps other folks facing a similar problem.
The problem
Our customer is a cloud-based financial services company that buys bandwidth in bulk from an upstream provider. The upstream provider gives them rough SNMP-based traffic reports for each day. On their end, the financial services company uses Trisul to break up the traffic and provide much more fine-grained reports to their end users. The problem was that the SNMP-based numbers did not match Trisul's packet-based numbers. Trisul always showed a daily total that was 3% to 4% smaller. For their volumes this added up to $80 daily.
What happened to the missing packets
We banged our heads for a couple of days trying to figure out what was going on. We confirmed the following:
- Trisul wasn't dropping any packets (even at a moderately high sustained rate of 30,000-40,000 pps), and neither was the kernel.
- Trisul was in inline mode with hardware bypass, not hanging off a tap or SPAN port.
- A single cable ran directly from the upstream-facing switch to Trisul, with nothing in between.
- The cable was brand new in a high quality data center
It turns out the SNMP stats were produced by polling the ifHCInOctets/ifHCOutOctets counters in the MIB-2 interfaces group. For Ethernet interfaces, this is what RFC 3635 says:
The Interface MIB octet counters, ifInOctets, ifOutOctets, ifHCInOctets and ifHCOutOctets, MUST include all octets in valid frames sent or received on the interface, including the MAC header and FCS, but not the preamble, start of frame delimiter, or extension octets.
Note that the 7-byte preamble and the 1-byte SFD are not counted, but the 4-byte FCS (the CRC at the end of every frame) is.
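To make the accounting concrete, here is a minimal Python sketch of which octets of one frame end up in which count. The function names are ours, purely for illustration. It assumes an untagged Ethernet frame (14-byte MAC header, no VLAN tag) and a capture path that delivers frames without the FCS.

```python
# Field sizes of an Ethernet frame, in octets.
PREAMBLE = 7      # not counted by the IF-MIB octet counters
SFD = 1           # start of frame delimiter, not counted either
MAC_HEADER = 14   # dst MAC + src MAC + EtherType (assumes no VLAN tag)
FCS = 4           # frame check sequence (CRC), counted per RFC 3635


def snmp_counted_octets(payload_len: int) -> int:
    """Octets the interface counters attribute to one frame."""
    return MAC_HEADER + payload_len + FCS


def captured_octets(payload_len: int) -> int:
    """Octets seen by a capture that receives the frame without its FCS."""
    return MAC_HEADER + payload_len


def on_wire_octets(payload_len: int) -> int:
    """Everything transmitted for the frame, preamble and SFD included."""
    return PREAMBLE + SFD + MAC_HEADER + payload_len + FCS


if __name__ == "__main__":
    payload = 46  # smallest Ethernet payload (already padded)
    print(snmp_counted_octets(payload), captured_octets(payload), on_wire_octets(payload))
```

Whatever the payload size, the two counts differ by exactly 4 octets per frame, so the gap grows with the number of packets rather than with the byte volume.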
Trisul of course did not see the CRC because we were relying on the PF_PACKET mechanism to supply packets, and the NIC typically strips the FCS before the frame reaches the kernel. Once we added the 4 bytes to every packet, we found that the numbers tallied.
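The correction itself is simple arithmetic. The sketch below is not Trisul's code, just an illustration with hypothetical helper names and made-up traffic figures. It also shows why a 3% to 4% gap is plausible: with an average captured packet size of roughly 100 bytes, 4 missing bytes per packet is about 4% of the total.

```python
ETHERNET_FCS_LEN = 4  # octets of FCS per Ethernet frame


def fcs_adjusted_bytes(captured_bytes: int, packet_count: int,
                       fcs_len: int = ETHERNET_FCS_LEN) -> int:
    """Add the per-frame FCS back to a byte total measured without it."""
    return captured_bytes + packet_count * fcs_len


def shortfall_fraction(captured_bytes: int, packet_count: int,
                       fcs_len: int = ETHERNET_FCS_LEN) -> float:
    """Fraction by which the captured total undercounts the FCS-inclusive total."""
    adjusted = fcs_adjusted_bytes(captured_bytes, packet_count, fcs_len)
    if adjusted == 0:
        return 0.0
    return (adjusted - captured_bytes) / adjusted


if __name__ == "__main__":
    # Hypothetical figures: 1,000,000 packets averaging 96 captured bytes each.
    packets = 1_000_000
    captured = 96 * packets
    print(fcs_adjusted_bytes(captured, packets))
    print(f"{shortfall_fraction(captured, packets):.1%}")
```

Traffic dominated by small packets shows a larger percentage gap, and traffic dominated by full-size frames shows a much smaller one.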
How to compensate for FCS
We find that only folks who use Trisul to check or implement billing care about matching the SNMP counters. So we introduced a new per-interface option called AddEthernetFCS. It is disabled by default. When enabled, it adds 4 bytes to the reported length of every packet. To use this option, log in as admin and select Manage Capture Profiles > Select an interface > Enable the AddEthernetFCS option.