yahoo-eng-team team mailing list archive

[Bug 1831404] [NEW] rarp packet will be dropped in flows cause vm connectivity broken after live-migration

 

Public bug reported:

When a VM is live-migrated, at the moment it starts on the destination
node it sends 5 RARP packets (broadcast) so that the switch relearns its
MAC address and other VMs on the same network can keep sending packets
to it directly. In practice these RARP packets are dropped by the flows
in br-int, so they never trigger MAC learning on the switch.
Connectivity from other VMs to the migrated VM is then broken until a
later broadcast locates the VM at its new position, which costs a few
seconds.

Without live-migration, you can use the nping command inside the VM to simulate the RARP packets it sends:
nping --arp --arp-type rarp  --arp-target-mac <vm-mac> --ether-type rarp  <vm-dhcp-host-ip>
For example:
nping --arp --arp-type rarp  --arp-target-mac fa:16:3e:02:49:d3 --ether-type rarp  192.168.111.2
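
For readers without nping at hand, the frame that the capture further
down shows can be sketched in plain Python. This is only an
illustration of the packet layout, not the code the hypervisor or nping
actually runs. The function names are hypothetical, and the zeroed IP
fields are an assumption, since the capture does not show them.

```python
import struct

RARP_ETHERTYPE = 0x8035   # "ethertype Reverse ARP (0x8035)" in the capture
RARP_OP_REQUEST = 3       # "Reverse Request" opcode defined by RFC 903


def mac_to_bytes(mac):
    """Convert 'fa:16:3e:02:49:d3' into its 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_rarp_request(vm_mac):
    """Build a broadcast RARP request whose sender and target MAC are the VM's."""
    src = mac_to_bytes(vm_mac)
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, ethertype (14 bytes)
    eth_header = broadcast + src + struct.pack("!H", RARP_ETHERTYPE)
    # Assumption: IP fields left as 0.0.0.0; the capture does not show them.
    zero_ip = b"\x00" * 4
    # ARP body: hw type 1 (Ethernet), proto 0x0800 (IPv4), hlen 6, plen 4, opcode
    arp_body = struct.pack("!HHBBH", 1, 0x0800, 6, 4, RARP_OP_REQUEST)
    arp_body += src + zero_ip + src + zero_ip
    return eth_header + arp_body


frame = build_rarp_request("fa:16:3e:02:49:d3")
print(len(frame), hex(struct.unpack("!H", frame[12:14])[0]))  # 42 0x8035
```

The 42-byte total and the 28-byte ARP body agree with the "length 42"
and "length 28" values that tcpdump reports for each packet.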

You can capture these packets with tcpdump and also see that they are
dropped by the flow (table=71, priority=10, in_port=16), where 16 is the
port of the VM's tap device on br-int:

# ovs-ofctl show br-int | grep 02:49:d3
 16(tap80be4c89-6d): addr:fe:16:3e:02:49:d3
#
# ovs-ofctl dump-flows br-int  | grep table=71 | grep in_port=16  | grep priority=10
 cookie=0xbb76a2777919bc71, duration=1037.306s, table=71, n_packets=13, n_bytes=594, idle_age=12, priority=10,ct_state=-trk,reg5=0x10,in_port=16 actions=drop
#
# tcpdump 'rarp or arp' -i tap80be4c89-6d -nne
tcpdump: WARNING: tap80be4c89-6d: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap80be4c89-6d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:11:11.895958 fa:16:3e:02:49:d3 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Reverse Request who-is fa:16:3e:02:49:d3 tell fa:16:3e:02:49:d3, length 28
10:11:12.896280 fa:16:3e:02:49:d3 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Reverse Request who-is fa:16:3e:02:49:d3 tell fa:16:3e:02:49:d3, length 28
10:11:13.897538 fa:16:3e:02:49:d3 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Reverse Request who-is fa:16:3e:02:49:d3 tell fa:16:3e:02:49:d3, length 28
10:11:14.898815 fa:16:3e:02:49:d3 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Reverse Request who-is fa:16:3e:02:49:d3 tell fa:16:3e:02:49:d3, length 28
10:11:15.900093 fa:16:3e:02:49:d3 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Reverse Request who-is fa:16:3e:02:49:d3 tell fa:16:3e:02:49:d3, length 28

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
# ovs-ofctl dump-flows br-int  | grep table=71 | grep in_port=16  | grep priority=10
 cookie=0xbb76a2777919bc71, duration=1063.414s, table=71, n_packets=18, n_bytes=804, idle_age=8, priority=10,ct_state=-trk,reg5=0x10,in_port=16 actions=drop
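
The two dump-flows outputs above can be compared mechanically. The
sketch below parses the counters from the two flow lines exactly as
quoted and computes the difference. The helper name is hypothetical,
and the conclusion assumes no other traffic hit this flow while tcpdump
was running.

```python
import re


def flow_counters(flow_line):
    """Return (n_packets, n_bytes) parsed from one ovs-ofctl dump-flows line."""
    packets = int(re.search(r"n_packets=(\d+)", flow_line).group(1))
    nbytes = int(re.search(r"n_bytes=(\d+)", flow_line).group(1))
    return packets, nbytes


# The drop flow before and after the 5 RARP packets, copied from the report.
before = ("cookie=0xbb76a2777919bc71, duration=1037.306s, table=71, "
          "n_packets=13, n_bytes=594, idle_age=12, "
          "priority=10,ct_state=-trk,reg5=0x10,in_port=16 actions=drop")
after = ("cookie=0xbb76a2777919bc71, duration=1063.414s, table=71, "
         "n_packets=18, n_bytes=804, idle_age=8, "
         "priority=10,ct_state=-trk,reg5=0x10,in_port=16 actions=drop")

p0, b0 = flow_counters(before)
p1, b1 = flow_counters(after)
delta_packets = p1 - p0
delta_bytes = b1 - b0
print(delta_packets, delta_bytes, delta_bytes // delta_packets)  # 5 210 42
```

The drop flow gained 5 packets and 210 bytes, i.e. 42 bytes per packet,
which matches the 5 captured RARP frames of length 42.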

** Affects: neutron
     Importance: Undecided
     Assignee: Yang Li (yang-li)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1831404

Title:
  rarp packet will be dropped in flows cause vm connectivity broken
  after  live-migration

Status in neutron:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1831404/+subscriptions

