yahoo-eng-team team mailing list archive

[Bug 1504265] [NEW] OpenStack SDN corrupts a few networking packets under load

 

Public bug reported:

It appears that OpenStack SDN, under load, can silently corrupt exactly
50 bytes of TCP payload data. This has only been observed with VMs that
run on the OpenStack control node, which is apparently out of line with
OpenStack best practices for production.

Environment was RDO OpenStack Kilo, CentOS 7, Neutron/Openvswitch.
Openvswitch Neutron plugin version is 2015.1.0-1.el7; Linux is SLES11SP3
(kernel 3.0.76), 64-bit, on VMware.

Steps To Reproduce:

Generate high network traffic.

From the sender:

<pre>
sender:~ $ dd if=/dev/zero count=1000000 bs=1000 | netcat 172.16.12.100 139
1000000+0 records in
1000000+0 records out
1000000000 bytes (1.0 GB) copied, 12.5841 s, 79.5 MB/s
^C

From the receiver:

The test uses the SMB port (139) purely for convenience, so shut down
Samba on the receiver before reproducing.

If OpenStack does not corrupt the networking stream:

receiver:~ $ netcat -l -p 139 > out.bin; hexdump -C out.bin
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
1dcd6500

If OpenStack *does* corrupt the packets over the network, the corruption
will be shown in the hexdump. Out of 10 tries, a few attempts yielded
corruption, so it is reproducible, though not on every run. Two
examples:

receiver:~ $ netcat -l -p 139 > out.bin; hexdump -C out.bin
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0ac9db60 00 00 00 00 00 00 61 35 61 34 39 30 34 34 62 66 |......a5a49044bf|
0ac9db70 33 30 37 33 32 36 36 35 31 31 37 36 61 38 31 66 |307326651176a81f|
0ac9db80 32 35 32 61 33 35 35 39 31 34 30 31 39 63 33 33 |252a355914019c33|
0ac9db90 32 35 31 38 38 64 62 61 00 00 00 00 00 00 00 00 |25188dba........|
0ac9dba0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
3b9aca00

receiver:~ $ netcat -l -p 139 > out.bin; hexdump -C out.bin
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
078cf070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70 ce |..............p.|
078cf080 d0 31 40 0f 77 34 f7 23 02 75 10 77 d5 5b 2f 11 |.1@.w4.#.u.w.[/.|
078cf090 b8 61 c3 4f 15 30 7f 10 c0 39 96 b5 bb f1 bc 5c |.a.O.0...9.....\|
078cf0a0 ea d7 2a de 69 80 9c d3 4a b3 24 60 67 03 8e a5 |..*.i...J.$`g...|
078cf0b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
07910130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 36 39 |..............69|
07910140 34 65 37 35 37 65 39 39 63 35 61 63 64 65 36 64 |4e757e99c5acde6d|
07910150 63 34 33 35 34 33 37 66 62 39 39 35 38 63 38 61 |c435437fb9958c8a|
07910160 39 36 66 35 36 30 31 38 63 62 35 34 61 33 65 34 |96f56018cb54a3e4|
07910170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
32a57a20 00 00 00 00 00 00 32 31 33 31 39 61 36 64 33 33 |......21319a6d33|
32a57a30 38 66 34 63 37 35 64 39 63 65 34 39 33 63 32 61 |8f4c75d9ce493c2a|
32a57a40 62 33 35 61 61 33 31 64 61 35 31 31 35 61 30 34 |b35aa31da5115a04|
32a57a50 35 66 62 39 31 34 39 34 00 00 00 00 00 00 00 00 |5fb91494........|
32a57a60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
3b9aca00
</pre>
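
Reading the corruption offsets and lengths out of a hexdump by eye is
error-prone, so here is a minimal sketch of a checker that could be run
on the receiver instead. It is not part of the original repro; the
function name find_nonzero_runs and the chunk_size parameter are made up
for illustration. It assumes the sender only ever transmits zero bytes
(as with dd if=/dev/zero above), so any non-zero byte in out.bin counts
as corruption. One caveat: if a corrupted region happens to contain a
zero byte, it will be reported as two shorter runs.

```python
import sys


def find_nonzero_runs(path, chunk_size=1 << 20):
    """Return a list of (offset, length) for each run of non-zero bytes."""
    runs = []
    run_start = None  # file offset where the current non-zero run began
    offset = 0        # file offset of the first byte of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Fast path: skip chunks that are entirely zero when no run is open.
            if run_start is None and chunk.count(0) == len(chunk):
                offset += len(chunk)
                continue
            for i, byte in enumerate(chunk):
                if byte:
                    if run_start is None:
                        run_start = offset + i
                elif run_start is not None:
                    # A zero byte closes the current run.
                    runs.append((run_start, offset + i - run_start))
                    run_start = None
            offset += len(chunk)
    # A run that extends to the end of the file is closed here.
    if run_start is not None:
        runs.append((run_start, offset - run_start))
    return runs


if __name__ == "__main__" and len(sys.argv) > 1:
    for start, length in find_nonzero_runs(sys.argv[1]):
        print(f"offset 0x{start:08x} length {length}")
```

Run against the out.bin from the first corrupted transfer above, this
should report a single run at offset 0x0ac9db66 with length 50, which is
what the hexdump shows.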

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1504265

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1504265/+subscriptions

