yahoo-eng-team team mailing list archive

[Bug 1591386] [NEW] Possible race condition L3HA when VRRP state changes while building

 

Public bug reported:

I suspect a race condition when creating an HA-enabled neutron router
and attaching router interfaces.

All of my router ports are stuck in BUILD state but are passing traffic.
If I pick one port from this router, it shows it is still in BUILD state:

+-----------------------+--------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                  |
+-----------------------+--------------------------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                                   |
| allowed_address_pairs |                                                                                                        |
| binding:host_id       | controller2_neutron_agents_container-cb4bb90e                                                          |
| binding:profile       | {}                                                                                                     |
| binding:vif_details   | {"port_filter": true}                                                                                  |
| binding:vif_type      | bridge                                                                                                 |
| binding:vnic_type     | normal                                                                                                 |
| device_id             | 5b861c43-9a0d-494c-bfe4-27aeb50e94fe                                                                   |
| device_owner          | network:router_interface                                                                               |
| dns_assignment        | {"hostname": "host-10-11-12-1", "ip_address": "10.11.12.1", "fqdn": "host-10-11-12-1.openstacklocal."} |
| dns_name              |                                                                                                        |
| extra_dhcp_opts       |                                                                                                        |
| fixed_ips             | {"subnet_id": "77be837a-ddd4-40df-876f-e31f0d241d85", "ip_address": "10.11.12.1"}                      |
| id                    | 68ab5b64-d22c-4c8a-951e-8a57c1397a31                                                                   |
| mac_address           | fa:16:3e:26:c6:86                                                                                      |
| name                  |                                                                                                        |
| network_id            | 9d69083d-e229-47ea-9dd1-deef2b8e21df                                                                   |
| security_groups       |                                                                                                        |
| status                | BUILD                                                                                                  |
| tenant_id             | 96e14d3700b549fda9367a2672107a55                                                                       |
+-----------------------+--------------------------------------------------------------------------------------------------------+
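
To check every port of the router at once instead of one by one, here is
a minimal sketch. It assumes the port records have already been fetched
(for example as JSON from the API) into dictionaries using the field
names from the table above. The helper name ports_in_build and the second
sample record are illustrative only and not part of the original report.

```python
ROUTER_ID = "5b861c43-9a0d-494c-bfe4-27aeb50e94fe"


def ports_in_build(ports, router_id):
    """Return the ports owned by router_id whose status is still BUILD."""
    return [
        port for port in ports
        if port.get("device_id") == router_id and port.get("status") == "BUILD"
    ]


sample_ports = [
    # Values copied from the port shown in the table above.
    {
        "id": "68ab5b64-d22c-4c8a-951e-8a57c1397a31",
        "device_id": ROUTER_ID,
        "device_owner": "network:router_interface",
        "status": "BUILD",
    },
    # Hypothetical port of an unrelated device, included only to
    # demonstrate the filter.
    {
        "id": "hypothetical-port-id",
        "device_id": "hypothetical-device-id",
        "device_owner": "compute:nova",
        "status": "ACTIVE",
    },
]

for port in ports_in_build(sample_ports, ROUTER_ID):
    print(port["id"], port["device_owner"], port["status"])
```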

Unfortunately I did not catch many details from the neutron logs, only
that the VRRP election happened:

VRRP state changes:
===================

controller1_neutron_agents_container-b3c216d9 | success | rc=0 >>
2016-06-10 08:00:26.728 13586 INFO neutron.agent.l3.ha [-] Router 5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to backup

controller2_neutron_agents_container-cb4bb90e | success | rc=0 >>
2016-06-10 08:00:26.493 13733 INFO neutron.agent.l3.ha [-] Router 5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to backup
2016-06-10 08:00:38.483 13733 INFO neutron.agent.l3.ha [-] Router 5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to master

controller3_neutron_agents_container-2442033f | success | rc=0 >>
2016-06-10 08:00:26.889 16262 INFO neutron.agent.l3.ha [-] Router 5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to backup


The interface statistics updates below also show roughly when the
neutron port was created.
The port is correctly bound to the master VRRP agent.

interface stats update:
=======================

controller1:
2016-06-10 08:01:09.713 14268 INFO neutron.agent.securitygroups_rpc [req-52afd361-8d21-45a3-8974-c93f7f76f0d3 - - - - -] Preparing filters for devices set(['tap68ab5b64-d2'])
2016-06-10 08:01:09.713 14268 INFO neutron.agent.securitygroups_rpc [req-52afd361-8d21-45a3-8974-c93f7f76f0d3 - - - - -] Preparing filters for devices set(['tap68ab5b64-d2'])
2016-06-10 08:01:10.106 14268 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-52afd361-8d21-45a3-8974-c93f7f76f0d3 - - - - -] Port tap68ab5b64-d2 updated. Details: {u'profile': {}, u'network_qos_policy_id': None, u'qos_policy_id': None, u'allowed_address_pairs': [], u'admin_state_up': True, u'network_id': u'9d69083d-e229-47ea-9dd1-deef2b8e21df', u'segmentation_id': 4, u'device_owner': u'network:router_interface', u'physical_network': None, u'mac_address': u'fa:16:3e:26:c6:86', u'device': u'tap68ab5b64-d2', u'port_security_enabled': True, u'port_id': u'68ab5b64-d22c-4c8a-951e-8a57c1397a31', u'fixed_ips': [{u'subnet_id': u'77be837a-ddd4-40df-876f-e31f0d241d85', u'ip_address': u'10.11.12.1'}], u'network_type': u'vxlan', u'security_groups': []}

controller2:
2016-06-10 08:01:08.118 14285 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-ce43ba2a-ca4b-4ef0-b3a3-45b1d2f8405a - - - - -] Port tap68ab5b64-d2 updated. Details: {u'profile': {}, u'network_qos_policy_id': None, u'qos_policy_id': None, u'allowed_address_pairs': [], u'admin_state_up': True, u'network_id': u'9d69083d-e229-47ea-9dd1-deef2b8e21df', u'segmentation_id': 4, u'device_owner': u'network:router_interface', u'physical_network': None, u'mac_address': u'fa:16:3e:26:c6:86', u'device': u'tap68ab5b64-d2', u'port_security_enabled': True, u'port_id': u'68ab5b64-d22c-4c8a-951e-8a57c1397a31', u'fixed_ips': [{u'subnet_id': u'77be837a-ddd4-40df-876f-e31f0d241d85', u'ip_address': u'10.11.12.1'}], u'network_type': u'vxlan', u'security_groups': []}

controller3:
2016-06-10 08:01:08.214 16828 INFO neutron.agent.securitygroups_rpc [req-1a2e5c81-9a53-47b8-a901-b0cb48f17905 - - - - -] Preparing filters for devices set(['tap68ab5b64-d2'])
2016-06-10 08:01:08.550 16828 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-1a2e5c81-9a53-47b8-a901-b0cb48f17905 - - - - -] Port tap68ab5b64-d2 updated. Details: {u'profile': {}, u'network_qos_policy_id': None, u'qos_policy_id': None, u'allowed_address_pairs': [], u'admin_state_up': True, u'network_id': u'9d69083d-e229-47ea-9dd1-deef2b8e21df', u'segmentation_id': 4, u'device_owner': u'network:router_interface', u'physical_network': None, u'mac_address': u'fa:16:3e:26:c6:86', u'device': u'tap68ab5b64-d2', u'port_security_enabled': True, u'port_id': u'68ab5b64-d22c-4c8a-951e-8a57c1397a31', u'fixed_ips': [{u'subnet_id': u'77be837a-ddd4-40df-876f-e31f0d241d85', u'ip_address': u'10.11.12.1'}], u'network_type': u'vxlan', u'security_groups': []}
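
To put the two log excerpts on one timeline, here is a small sketch that
is not part of the original report. It parses the timestamps of the
master transition on controller2 and of the "Port ... updated" messages
on each controller, then prints how long after the election each update
was logged. The helper name parse_ts is an assumption for illustration.
The ordering alone does not prove a race; it only shows that the port
updates were logged roughly 30 seconds after the router became master.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_ts(line):
    """Parse the 'YYYY-MM-DD HH:MM:SS.mmm' prefix of a neutron log line."""
    return datetime.strptime(line[:23], TS_FORMAT)


# Timestamp of "transitioned to master" on controller2, copied from the logs.
master_transition = parse_ts("2016-06-10 08:00:38.483 13733 INFO neutron.agent.l3.ha")

# Timestamps of the "Port tap68ab5b64-d2 updated" lines, copied from the logs.
port_updates = {
    "controller1": parse_ts("2016-06-10 08:01:10.106 14268 INFO"),
    "controller2": parse_ts("2016-06-10 08:01:08.118 14285 INFO"),
    "controller3": parse_ts("2016-06-10 08:01:08.550 16828 INFO"),
}

# Seconds elapsed between the master election and each port update.
delays = {
    host: (ts - master_transition).total_seconds()
    for host, ts in port_updates.items()
}

for host, delay in sorted(delays.items()):
    print(f"{host}: port update logged {delay:.3f}s after master transition")
```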


I also checked the HA tenant network ports and they are all in an ACTIVE
state.

Interestingly, once I restarted the neutron-linuxbridge-agent on the
active agent, it suddenly put the router gateway port into BUILD state,
which it had not been in before.

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1591386

Title:
  Possible race condition L3HA when VRRP state changes while building

Status in neutron:
  New

