yahoo-eng-team team mailing list archive

[Bug 1493945] Re: Router scheduling at network node fails under scale

Unable to reproduce. May have been fixed by bugs closed in comment #3.

** Changed in: neutron
       Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1493945

Title:
  Router scheduling at network node fails under scale

Status in neutron:
  Invalid

Bug description:
  After around 100 routers have been scheduled to a neutron network
  node, subsequent scheduling attempts fail with the following log
  signature:

  38343:2015-09-09 06:53:15.305 DEBUG neutron.agent.l3.agent [req-d7ce10e2-b689-4c5b-b4c7-30aa4f1fdbbb admin cdd316b857a947488ca9120aef5f6891] Got routers updated notification :[u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1'] from (pid=19102) routers_updated /opt/stack/neutron/neutron/agent/l3/agent.py:385
  38448:2015-09-09 06:53:16.328 DEBUG neutron.agent.l3.agent [req-63d36e16-5d5d-4575-825b-28722ec28a1e admin cdd316b857a947488ca9120aef5f6891] Got routers updated notification :[u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1'] from (pid=19102) routers_updated /opt/stack/neutron/neutron/agent/l3/agent.py:385
  41013:2015-09-09 06:54:23.815 DEBUG neutron.agent.l3.agent [-] Starting router update for 54ffc2c4-123b-460b-bd2f-01ae5277e3e1, action None, priority 0 from (pid=19102) _process_router_update /opt/stack/neutron/neutron/agent/l3/agent.py:456
  42690:2015-09-09 06:55:23.818 ERROR neutron.agent.l3.agent [-] Failed to fetch router information for '54ffc2c4-123b-460b-bd2f-01ae5277e3e1'
  42710:2015-09-09 06:55:23.821 DEBUG neutron.agent.l3.agent [-] Starting router update for 54ffc2c4-123b-460b-bd2f-01ae5277e3e1, action None, priority 0 from (pid=19102) _process_router_update /opt/stack/neutron/neutron/agent/l3/agent.py:456
  42738:2015-09-09 06:55:30.615 DEBUG oslo_messaging._drivers.amqpdriver [-]  queues: 8, message: {u'_unique_id': u'c3f0a880f9544bf8b938bb6ced4fee6f', u'failure': None, u'result': [{u'status': u'ACTIVE', u'_interfaces': [{u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'device_owner': u'network:router_interface', u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'3d01720b-324d-4f69-8767-43705217aeb0', u'prefixlen': 24, u'ip_address': u'192.168.18.1'}], u'id': u'7ef5df56-e82b-4fb8-8b1c-836ec93338d3', u'security_groups': [], u'binding:vif_details': {}, u'binding:vif_type': u'unbound', u'mac_address': u'fa:16:3e:ee:9f:33', u'status': u'DOWN', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'192.168.18.0/24', u'gateway_ip': u'192.168.18.1', u'id': u'3d01720b-324d-4f69-8767-43705217aeb0', u'subnetpool_id': None}], u'binding:host_id': u'legacy-network-1', u'device_id': u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1', u'name': u'', u'admin_state_up': True, u'network_id': u'7a77e6c2-6e25-4223-9981-987f33e75d18', u'dns_name': u'', u'binding:vnic_type': u'normal', u'tenant_id': u'4cd3d0ecfa6f48bb946932481ef04b4e', u'extra_subnets': []}], u'enable_snat': True, u'ha_vr_id': 0, u'gw_port_host': None, u'gw_port_id': u'2a3dabbc-db24-40c5-880a-3ef738537520', u'admin_state_up': True, u'tenant_id': u'4cd3d0ecfa6f48bb946932481ef04b4e', u'gw_port': {u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'device_owner': u'network:router_gateway', u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'd43deb2a-6bcd-40b2-b559-36a798e932ba', u'prefixlen': 20, u'ip_address': u'172.18.128.101'}], u'id': u'2a3dabbc-db24-40c5-880a-3ef738537520', u'security_groups': [], u'binding:vif_details': {}, u'binding:vif_type': u'unbound', u'mac_address': u'fa:16:3e:b5:fa:de', u'status': u'DOWN', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'172.18.128.0/20', u'gateway_ip': u'172.18.128.1', u'id': u'd43deb2a-6bcd-40b2-b559-36a798e932ba', u'subnetpool_id': None}], u'binding:host_id': u'legacy-network-1', u'device_id': u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1', u'name': u'', u'admin_state_up': True, u'network_id': u'c546009b-207c-44cd-8a4b-3e1e426eb56b', u'dns_name': u'', u'binding:vnic_type': u'normal', u'tenant_id': u'', u'extra_subnets': []}, u'distributed': False, u'_snat_router_interfaces': [], u'routes': [], u'external_gateway_info': {u'network_id': u'c546009b-207c-44cd-8a4b-3e1e426eb56b', u'enable_snat': True, u'external_fixed_ips': [{u'subnet_id': u'd43deb2a-6bcd-40b2-b559-36a798e932ba', u'ip_address': u'172.18.128.101'}]}, u'ha': False, u'id': u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1', u'name': u'router-100'}]} from (pid=19102) put /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:230
  42921:2015-09-09 06:56:23.824 ERROR neutron.agent.l3.agent [-] Failed to fetch router information for '54ffc2c4-123b-460b-bd2f-01ae5277e3e1'

  The failures above occur when oslo_messaging times out while fetching
  router information at line 465 in _process_router_update.  However,
  the neutron server still shows the now-unscheduled router as ACTIVE,
  so the failure is not visible to anyone checking the router status.
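
The timeout can be read directly off the timestamps in the log excerpt above. The sketch below is only an illustration: the event labels are hypothetical names chosen here, and the timestamps are copied by hand from the excerpt. It measures how long the agent waited before logging the failure, and how long after that failure a reply carrying the router data showed up in the amqpdriver log. The excerpt alone does not say whether that reply answers the first request or the retry.

```python
from datetime import datetime

# Timestamps copied from the log excerpt; the labels are illustrative only.
EVENTS = {
    "start": "2015-09-09 06:54:23.815",    # "Starting router update ..."
    "failure": "2015-09-09 06:55:23.818",  # "Failed to fetch router information ..."
    "reply": "2015-09-09 06:55:30.615",    # amqpdriver logs a message with the router data
}


def parse(timestamp):
    """Parse a log timestamp with millisecond precision."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")


def seconds_between(first, second):
    """Return the elapsed seconds from one named event to another."""
    return (parse(EVENTS[second]) - parse(EVENTS[first])).total_seconds()


wait_before_failure = seconds_between("start", "failure")
reply_after_failure = seconds_between("failure", "reply")

print("waited before failure: %.3f s" % wait_before_failure)
print("reply seen after failure: %.3f s" % reply_after_failure)
```

The wait of roughly 60 seconds before the failure is consistent with an RPC response timeout of 60 seconds, which is assumed here to be the oslo.messaging default for rpc_response_timeout in this deployment. The reply arriving a few seconds later suggests the server was slow rather than unresponsive.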

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1493945/+subscriptions