yahoo-eng-team team mailing list archive
Message #86094
[Bug 1815989] Re: OVS drops RARP packets by QEMU upon live-migration causes up to 40s ping pause in Rocky
** Changed in: nova/train
Status: Fix Released => New
** Also affects: neutron/ussuri
Importance: Undecided
Assignee: Rodolfo Alonso (rodolfo-alonso-hernandez)
Status: In Progress
** Also affects: neutron/wallaby
Importance: Undecided
Status: New
** Also affects: neutron/train
Importance: Undecided
Status: New
** Also affects: neutron/victoria
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1815989
Title:
OVS drops RARP packets by QEMU upon live-migration causes up to 40s
ping pause in Rocky
Status in neutron:
In Progress
Status in neutron train series:
New
Status in neutron ussuri series:
In Progress
Status in neutron victoria series:
New
Status in neutron wallaby series:
New
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) train series:
New
Status in OpenStack Compute (nova) ussuri series:
New
Status in OpenStack Compute (nova) victoria series:
New
Status in OpenStack Compute (nova) wallaby series:
New
Status in os-vif:
Invalid
Bug description:
This issue is well known, and there were previous attempts to fix it,
like this one
https://bugs.launchpad.net/neutron/+bug/1414559
This issue still exists in Rocky and gets worse. In Rocky, nova compute, nova libvirt and neutron ovs agent all run inside containers.
So far the only simple fix I have is to increase the number of RARP
packets QEMU sends after live-migration from 5 to 10. To be complete,
the nova change (not merged) proposed in the above mentioned activity
does not work.
I am creating this ticket hoping to get up-to-date (for Rocky and
onwards) expert advice on how to fix this in nova-neutron.
For the record, below are the timestamps from my test, comparing when the neutron ovs agent "activates" the VM port with when the RARP packets are seen by tcpdump on the compute node. The (recompiled) QEMU sends 10 RARP packets; tcpdump sees 7 of them, and the second-to-last packet barely made it through.
openvswitch-agent.log:
2019-02-14 19:00:13.568 73453 INFO
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent
[req-26129036-b514-4fa0-a39f-a6b21de17bb9 - - - - -] Port
57d0c265-d971-404d-922d-963c8263e6eb updated. Details: {'profile': {},
'network_qos_policy_id': None, 'qos_policy_id': None,
'allowed_address_pairs': [], 'admin_state_up': True, 'network_id':
'1bf4b8e0-9299-485b-80b0-52e18e7b9b42', 'segmentation_id': 648,
'fixed_ips': [
{'subnet_id': 'b7c09e83-f16f-4d4e-a31a-e33a922c0bac', 'ip_address': '10.0.1.4'}
], 'device_owner': u'compute:nova', 'physical_network': u'physnet0', 'mac_address': 'fa:16:3e:de:af:47', 'device': u'57d0c265-d971-404d-922d-963c8263e6eb', 'port_security_enabled': True, 'port_id': '57d0c265-d971-404d-922d-963c8263e6eb', 'network_type': u'vlan', 'security_groups': [u'5f2175d7-c2c1-49fd-9d05-3a8de3846b9c']}
2019-02-14 19:00:13.568 73453 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-26129036-b514-4fa0-a39f-a6b21de17bb9 - - - - -] Assigning 4 as local vlan for net-id=1bf4b8e0-9299-485b-80b0-52e18e7b9b42
tcpdump for rarp packets:
[root@overcloud-ovscompute-overcloud-0 nova]# tcpdump -i any rarp -nev
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
19:00:10.788220 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
19:00:11.138216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
19:00:11.588216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
19:00:12.138217 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
19:00:12.788216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
19:00:13.538216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
19:00:14.388320 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
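To make the timing easier to read, here is a minimal sketch that compares the tcpdump timestamps with the agent's "Port ... updated" log line. All timestamps are copied from the excerpts above. The helper names (parse, offsets_ms, gaps_ms) are made up for illustration, and the sketch assumes that both excerpts come from the same host clock and that the log line is a reasonable proxy for the moment the port is activated.

```python
from datetime import datetime

# Timestamp of the agent's "Port ... updated" log line (assumed proxy for
# port activation), copied from openvswitch-agent.log above.
PORT_UPDATED = "19:00:13.568"

# Timestamps of the seven RARP packets seen by tcpdump, copied from above.
RARP_SEEN = [
    "19:00:10.788220",
    "19:00:11.138216",
    "19:00:11.588216",
    "19:00:12.138217",
    "19:00:12.788216",
    "19:00:13.538216",
    "19:00:14.388320",
]


def parse(ts):
    """Parse an HH:MM:SS.fraction timestamp (the date is irrelevant here)."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def offsets_ms(packets, reference):
    """Offset of each packet from the reference time, in whole milliseconds.

    Negative values mean the packet was seen before the reference time.
    """
    ref = parse(reference)
    return [round((parse(p) - ref).total_seconds() * 1000) for p in packets]


def gaps_ms(packets):
    """Gap between consecutive packets, in whole milliseconds."""
    times = [parse(p) for p in packets]
    return [round((b - a).total_seconds() * 1000)
            for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for ts, off in zip(RARP_SEEN, offsets_ms(RARP_SEEN, PORT_UPDATED)):
        side = "after" if off > 0 else "before"
        print(f"{ts}  {off:+6d} ms ({side} the log line)")
    print("gaps between packets (ms):", gaps_ms(RARP_SEEN))
```

With these numbers, the gaps between the captured packets grow by about 100 ms each, the second-to-last packet lands roughly 30 ms before the log line, and only the last captured packet is seen after it.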
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1815989/+subscriptions