
openstack team mailing list archive

Re: Strange network behavior

 

On 11/09/2012 09:14 AM, Joe Warren-Meeks wrote:
What I am seeing in Tcpdump is a lot of incorrect cksums. This happens
with all Tcp connections.

17:12:38.539784 IP (tos 0x0, ttl 64, id 53611, offset 0, flags [DF],
proto TCP (6), length 60)
     10.0.0.240.56791 > 10.0.41.3.22: Flags [S], cksum 0x3e21 (incorrect
-> 0x6de2), seq 2650163743, win 14600, options [mss 1460,sackOK,TS val
28089204 ecr 0,nop,wscale 6], length 0


17:12:38.585279 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
TCP (6), length 60)
     10.0.41.3.22 > 10.0.0.240.56791: Flags [S.], cksum 0x3e21
(incorrect -> 0xe5c5), seq 1530502549, ack 3098447117, win 14480,
options [mss 1460,sackOK,TS val 340493 ecr 28089204,nop,wscale 3], length 0

Anyone come across this before?

When a Network Interface Card (NIC) offers ChecKsum Offload (CKO) in the outbound/transmit direction, the computation of the Layer 4 (e.g. TCP, UDP) checksum is deferred to the NIC. You can see whether a given interface/NIC has checksum offload, or other offloads, enabled via "ethtool -k <interface>".

When the packet passes the promiscuous tap on the way down the stack to a NIC offering CKO, the packet is in essence "unchecksummed", so tcpdump reports an incorrect checksum. It is therefore possibly a false positive. I say possibly because I just ran a quick netperf test on my Ubuntu 11.04 workstation to see what the SYNs looked like there, and tcpdump did not report an incorrect checksum even though I know the egress interface is offering outbound CKO. That makes me think TCP may not bother with CKO for small segments such as SYNchronize segments. One way to check whether the incorrect checksum report is valid would be to run tcpdump on 10.0.41.3 as well, and/or to disable CKO if ethtool shows it is enabled.
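To make the "unchecksummed" idea concrete, here is a rough Python sketch that rebuilds the first SYN from the fields printed in the trace and runs the standard 16-bit ones'-complement Internet checksum over it. The helper name ones_complement_sum and the byte layout of the options are my own reconstruction from the tcpdump line, not something taken from the original capture. Under that reconstruction, the reported 0x3e21 equals the sum over the TCP pseudo-header alone, and the full computation gives the 0x6de2 that tcpdump says it expected. That would be consistent with the stack leaving a partial sum in the field for the NIC to finish, but treat that reading as an assumption rather than a confirmed diagnosis.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding carries back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# Values copied from the first segment in the trace (the SYN).
src, dst = "10.0.0.240", "10.0.41.3"

# Options in the order tcpdump printed them (assumed wire order).
options = (
    struct.pack("!BBH", 2, 4, 1460)              # mss 1460
    + struct.pack("!BB", 4, 2)                   # sackOK
    + struct.pack("!BBII", 8, 10, 28089204, 0)   # TS val 28089204 ecr 0
    + struct.pack("!B", 1)                       # nop
    + struct.pack("!BBB", 3, 3, 6)               # wscale 6
)

# 20-byte fixed header with the checksum field zeroed, data offset 10 words, SYN flag.
tcp_header = struct.pack(
    "!HHIIHHHH",
    56791, 22,            # source port, destination port
    2650163743, 0,        # sequence number, acknowledgement number
    (10 << 12) | 0x002,   # data offset and flags
    14600, 0, 0,          # window, checksum (zero for now), urgent pointer
) + options

# TCP pseudo-header: source address, destination address, zero, protocol 6, TCP length.
pseudo = socket.inet_aton(src) + socket.inet_aton(dst) + struct.pack(
    "!BBH", 0, 6, len(tcp_header)
)

pseudo_sum = ones_complement_sum(pseudo)
full_checksum = ~ones_complement_sum(pseudo + tcp_header) & 0xFFFF

print(f"pseudo-header sum: {pseudo_sum:#06x}")     # value tcpdump saw in the field
print(f"full checksum:     {full_checksum:#06x}")  # value tcpdump expected
```

If a trace taken on 10.0.41.3 shows the full value in the checksum field, that would support the false-positive reading on the sending side.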

I would not have expected tcpdump to report invalid checksums for an "inbound" packet, though. It might be good to cross-check with the netstat statistics (e.g. the TCP counters in "netstat -s").

There appears to be an inconsistency between those two TCP segments. The sequence number of the SYNchronize (the 'S' in flags) segment from 10.0.0.240.56791 to 10.0.41.3.22 is 2650163743. The SYN from 10.0.41.3.22 to 10.0.0.240.56791 has the ACK flag set ('.'), but its ACKnowledgement number is 3098447117 rather than the 2650163744 I would have expected (the SYN's sequence number plus one).
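The expected value is just the SYN's sequence number plus one, wrapped to 32 bits. A minimal sketch of that check, using the numbers from the trace (the function name expected_ack is only illustrative):

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_ack(syn_seq: int) -> int:
    """A SYN consumes one sequence number, so the SYN-ACK should acknowledge seq + 1."""
    return (syn_seq + 1) % SEQ_SPACE


syn_seq = 2650163743       # seq of the SYN from 10.0.0.240.56791
observed_ack = 3098447117  # ack carried by the SYN-ACK from 10.0.41.3.22

want = expected_ack(syn_seq)
print("expected ack:", want)
print("observed ack:", observed_ack)
print("match:", observed_ack == want)
```

A mismatch like this would fit the SYN-ACK answering a different SYN than the one shown, or something rewriting the segment in between, which is another reason to compare traces from both ends.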

FWIW, the fact that a SYN-ACK was sent in response to the SYN in the first place suggests that 10.0.41.3 received what it thought was a properly checksummed SYN segment. All the more reason, I suspect, to take traces at both ends and compare the packets byte by byte.

rick jones



