yahoo-eng-team team mailing list archive

[Bug 1493132] [NEW] Neutron does not work properly L3 HA VRRP

 

Public bug reported:

I'm using OpenStack Kilo on Red Hat 7.
Neutron package version from RDO: openstack-neutron-2015.1.1-1.el7.noarch

I have two Neutron nodes, each running neutron-l3-agent. I've
configured L3 HA with VRRP:

# neutron l3-agent-list-hosting-router 6a69d00a-3e42-46da-b38a-1bbedc873c92
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| f16f472a-93b1-44bf-9eda-ecf40866795f | prod-neutron01           | True           | :-)   | standby  |
| bb35d988-7009-497c-a791-92e057ae7679 | prod-neutron02           | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

When I reboot the active node, in this case neutron02 (shutdown -r now),
I see this:

# neutron l3-agent-list-hosting-router 6a69d00a-3e42-46da-b38a-1bbedc873c92
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| f16f472a-93b1-44bf-9eda-ecf40866795f | prod-neutron01           | True           | :-)   | active   |
| bb35d988-7009-497c-a791-92e057ae7679 | prod-neutron02           | True           | xxx   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

At this point HA works properly and I still have connectivity to my
virtual machines.

The problem appears two minutes later, when neutron02 is up again:

# neutron l3-agent-list-hosting-router 6a69d00a-3e42-46da-b38a-1bbedc873c92
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| f16f472a-93b1-44bf-9eda-ecf40866795f | prod-neutron01           | True           | :-)   | active   |
| bb35d988-7009-497c-a791-92e057ae7679 | prod-neutron02           | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

And then:

# neutron l3-agent-list-hosting-router 6a69d00a-3e42-46da-b38a-1bbedc873c92
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| f16f472a-93b1-44bf-9eda-ecf40866795f | prod-neutron01           | True           | :-)   | standby  |
| bb35d988-7009-497c-a791-92e057ae7679 | prod-neutron02           | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

neutron02 becomes master again, and I lose connectivity to
my virtual machines.
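
The state transitions above can be tracked by parsing the table that
the CLI prints. The following is a minimal Python sketch, not part of
the original report: the function names parse_cli_table and
active_hosts are hypothetical, and the sample text is copied from the
listing taken while neutron02 was rebooting. It assumes the table
layout shown in this report and does not call the Neutron API.

```python
# Sample copied from the listing taken while neutron02 was rebooting.
SAMPLE = """
+--------------------------------------+----------------+----------------+-------+----------+
| id                                   | host           | admin_state_up | alive | ha_state |
+--------------------------------------+----------------+----------------+-------+----------+
| f16f472a-93b1-44bf-9eda-ecf40866795f | prod-neutron01 | True           | :-)   | active   |
| bb35d988-7009-497c-a791-92e057ae7679 | prod-neutron02 | True           | xxx   | active   |
+--------------------------------------+----------------+----------------+-------+----------+
"""


def parse_cli_table(text):
    """Parse a '+---+' bordered CLI table into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, shell prompts, and blank lines
        # Drop the empty strings produced by the outer pipes.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells  # first '|' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def active_hosts(rows, alive_only=False):
    """Return hosts whose ha_state is 'active', optionally only live agents."""
    return [
        row["host"]
        for row in rows
        if row["ha_state"] == "active"
        and (not alive_only or row["alive"] == ":-)")
    ]


if __name__ == "__main__":
    rows = parse_cli_table(SAMPLE)
    print("reported active:", active_hosts(rows))
    print("alive and active:", active_hosts(rows, alive_only=True))
```

For the sample, both hosts are reported as active, but only
prod-neutron01 is both alive and active. The ha_state of a dead agent
is simply the last value it reported.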

If I look at the state of the floating IP's port on the router:

# neutron port-show 54fef3f2-3d13-4575-9bb2-4491ae4896d7
+-----------------------+-------------------------------------------------------------------------------------+
| Field                 | Value                                                                               |
+-----------------------+-------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                |
| allowed_address_pairs |                                                                                     |
| binding:host_id       | prod-neutron02                                                                      |
| binding:profile       | {}                                                                                  |
| binding:vif_details   | {}                                                                                  |
| binding:vif_type      | binding_failed                                                                      |
| binding:vnic_type     | normal                                                                              |
| device_id             | 6a69d00a-3e42-46da-b38a-1bbedc873c92                                                |
| device_owner          | network:router_gateway                                                              |
| extra_dhcp_opts       |                                                                                     |
| fixed_ips             | {"subnet_id": "415b44b8-a2a5-4a43-a87d-6aee9b558063", "ip_address": "10.100.44.87"} |
| id                    | 54fef3f2-3d13-4575-9bb2-4491ae4896d7                                                |
| mac_address           | fa:16:3e:df:e5:45                                                                   |
| name                  |                                                                                     |
| network_id            | 5bb1eb0f-19ad-4a62-bf6e-19ac6dacff15                                                |
| security_groups       |                                                                                     |
| status                | ACTIVE                                                                              |
| tenant_id             |                                                                                     |
+-----------------------+-------------------------------------------------------------------------------------+

The port's binding:vif_type is 'binding_failed'.
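
The same check can be scripted. This is a rough Python sketch under
the assumption that the output keeps the Field/Value layout shown
above; parse_port_show and binding_failed are hypothetical helper
names, and the sample is an abridged copy of the output in this
report.

```python
# Abridged copy of the 'neutron port-show' output from this report.
SAMPLE = """
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| binding:host_id   | prod-neutron02                       |
| binding:vif_type  | binding_failed                       |
| device_owner      | network:router_gateway               |
| name              |                                      |
| status            | ACTIVE                               |
+-------------------+--------------------------------------+
"""


def parse_port_show(text):
    """Parse a two-column Field/Value table into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, shell prompts, and blank lines
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if len(cells) != 2 or cells[0] == "Field":
            continue  # skip the header row and anything unexpected
        fields[cells[0]] = cells[1]
    return fields


def binding_failed(fields):
    """True when the port reports a failed binding."""
    return fields.get("binding:vif_type") == "binding_failed"


if __name__ == "__main__":
    port = parse_port_show(SAMPLE)
    if binding_failed(port):
        print("binding failed on host", port["binding:host_id"])
```

Note that the port status is ACTIVE even though the binding failed,
so checking the status field alone would miss this condition.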

To work around this problem, I restarted the services on both Neutron
nodes with 'systemctl restart neutron-l3-agent
neutron-openvswitch-agent'. I found this workaround in this ticket:
https://bugs.launchpad.net/neutron/+bug/1488619

SELinux and iptables are disabled on my nodes.

I don't know what the problem is. Greetings!

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: l3-ha

** Summary changed:

- Neutron does not work properly in HA VRRP
+ Neutron does not work properly L3 HA VRRP

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1493132

Title:
  Neutron does not work properly L3 HA VRRP

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1493132/+subscriptions