yahoo-eng-team team mailing list archive

[Bug 1917409] [NEW] neutron-l3-agents won't become active

 

Public bug reported:

We have an Ubuntu Ussuri cloud deployed on Ubuntu 20.04 using the Juju
charms from the 20.08 bundle (planning to upgrade soon).

The problem is that all L3 agents for routers using a particular
external network show their ha_state as standby. I've tried removing
and re-adding, and the state never goes to active.

$ neutron l3-agent-list-hosting-router bradm-router
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+-------------+----------------+-------+----------+
| id                                   | host        | admin_state_up | alive | ha_state |
+--------------------------------------+-------------+----------------+-------+----------+
| 09ae92c9-ae8f-4209-b1a8-d593cc6d6602 | oschv1.maas | True           | :-)   | standby  |
| 4d9fe934-b1f8-4c2b-83ea-04971f827209 | oschv2.maas | True           | :-)   | standby  |
| 70b8b60e-7fbd-4b3a-80a3-90875ca72ce6 | oschv4.maas | True           | :-)   | standby  |
+--------------------------------------+-------------+----------------+-------+----------+

This generates a stack trace:

2021-03-01 02:59:47.344 3675486 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get'
Traceback (most recent call last):

  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 196, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped
    setattr(e, '_RETRY_EXCEEDED', True)

  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value

  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped
    return f(*args, **kwargs)

  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper
    ectxt.value = e.inner_exc

  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value

  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper
    return f(*args, **kwargs)

  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped
    LOG.debug("Retry wrapper got retriable exception: %s", e)

  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value

  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped
    return f(*dup_args, **dup_kwargs)

  File "/usr/lib/python3/dist-packages/neutron/api/rpc/handlers/l3_rpc.py", line 306, in get_agent_gateway_port
    agent_port = self.l3plugin.create_fip_agent_gw_port_if_not_exists(

  File "/usr/lib/python3/dist-packages/neutron/db/l3_dvr_db.py", line 1101, in create_fip_agent_gw_port_if_not_exists
    self._populate_mtu_and_subnets_for_ports(context, [agent_port])

  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in _populate_mtu_and_subnets_for_ports
    network_ids = [p['network_id']

  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in <listcomp>
    network_ids = [p['network_id']

  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1720, in _each_port_having_fixed_ips
    fixed_ips = port.get('fixed_ips', [])
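
Reading the last frames, the error message is consistent with a None
entry reaching _each_port_having_fixed_ips:
create_fip_agent_gw_port_if_not_exists passes [agent_port] on, and
port.get(...) fails if agent_port is None. The following is a minimal
standalone sketch of that failure mode, not Neutron's actual code. The
functions are simplified stand-ins named after the frames in the trace,
and the idea that agent_port is None is an inference from the trace,
not something confirmed on this system.

```python
def each_port_having_fixed_ips(ports):
    """Simplified stand-in: yield only the ports that have fixed IPs."""
    for port in ports or []:
        # Same call as the last frame of the trace; it raises
        # AttributeError when an entry in the list is None.
        fixed_ips = port.get('fixed_ips', [])
        if fixed_ips:
            yield port


def network_ids_for(ports):
    """Simplified stand-in for the list comprehension at l3_db.py:1772."""
    return [p['network_id'] for p in each_port_having_fixed_ips(ports)]


def reproduce():
    # Assumption: the gateway port lookup/creation returned nothing.
    agent_port = None
    try:
        network_ids_for([agent_port])
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == '__main__':
    print(reproduce())
```

If this reading is right, the question to chase is why no floating IP
agent gateway port is returned for the affected external network, not
the .get() call itself.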

This system ran successfully after deployment and was then left running
for a while; when it was revisited, it was in this state. I have been
unable to work out what caused it.

Versions:
Ubuntu 20.04
Juju charms 20.08
OpenStack Ussuri
Environment: Clustered services using containers on converged hypervisors

$ dpkg-query -W neutron-common
neutron-common  2:16.2.0-0ubuntu2

Please let me know if there is any further information that could be
used to see what is happening here.

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1917409
