yahoo-eng-team team mailing list archive
Message #38300
[Bug 1493945] [NEW] Router scheduling at network node fails under scale
Public bug reported:
After around 100 routers have been scheduled to a neutron node,
subsequent scheduling attempts fail with the following extracted
signature:
38343:2015-09-09 06:53:15.305 DEBUG neutron.agent.l3.agent [req-d7ce10e2-b689-4c5b-b4c7-30aa4f1fdbbb admin cdd316b857a947488ca9120aef5f6891] Got routers updated notification :[u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1'] from (pid=19102) routers_updated /opt/stack/neutron/neutron/agent/l3/agent.py:385
38448:2015-09-09 06:53:16.328 DEBUG neutron.agent.l3.agent [req-63d36e16-5d5d-4575-825b-28722ec28a1e admin cdd316b857a947488ca9120aef5f6891] Got routers updated notification :[u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1'] from (pid=19102) routers_updated /opt/stack/neutron/neutron/agent/l3/agent.py:385
41013:2015-09-09 06:54:23.815 DEBUG neutron.agent.l3.agent [-] Starting router update for 54ffc2c4-123b-460b-bd2f-01ae5277e3e1, action None, priority 0 from (pid=19102) _process_router_update /opt/stack/neutron/neutron/agent/l3/agent.py:456
42690:2015-09-09 06:55:23.818 ERROR neutron.agent.l3.agent [-] Failed to fetch router information for '54ffc2c4-123b-460b-bd2f-01ae5277e3e1'
42710:2015-09-09 06:55:23.821 DEBUG neutron.agent.l3.agent [-] Starting router update for 54ffc2c4-123b-460b-bd2f-01ae5277e3e1, action None, priority 0 from (pid=19102) _process_router_update /opt/stack/neutron/neutron/agent/l3/agent.py:456
42738:2015-09-09 06:55:30.615 DEBUG oslo_messaging._drivers.amqpdriver [-] queues: 8, message: {u'_unique_id': u'c3f0a880f9544bf8b938bb6ced4fee6f', u'failure': None, u'result': [{u'status': u'ACTIVE', u'_interfaces': [{u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'device_owner': u'network:router_interface', u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'3d01720b-324d-4f69-8767-43705217aeb0', u'prefixlen': 24, u'ip_address': u'192.168.18.1'}], u'id': u'7ef5df56-e82b-4fb8-8b1c-836ec93338d3', u'security_groups': [], u'binding:vif_details': {}, u'binding:vif_type': u'unbound', u'mac_address': u'fa:16:3e:ee:9f:33', u'status': u'DOWN', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'192.168.18.0/24', u'gateway_ip': u'192.168.18.1', u'id': u'3d01720b-324d-4f69-8767-43705217aeb0', u'subnetpool_id': None}], u'binding:host_id': u'legacy-network-1', u'device_id': u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1', u'name': u'', u'admin_state_up': True, u'network_id': u'7a77e6c2-6e25-4223-9981-987f33e75d18', u'dns_name': u'', u'binding:vnic_type': u'normal', u'tenant_id': u'4cd3d0ecfa6f48bb946932481ef04b4e', u'extra_subnets': []}], u'enable_snat': True, u'ha_vr_id': 0, u'gw_port_host': None, u'gw_port_id': u'2a3dabbc-db24-40c5-880a-3ef738537520', u'admin_state_up': True, u'tenant_id': u'4cd3d0ecfa6f48bb946932481ef04b4e', u'gw_port': {u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'device_owner': u'network:router_gateway', u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'd43deb2a-6bcd-40b2-b559-36a798e932ba', u'prefixlen': 20, u'ip_address': u'172.18.128.101'}], u'id': u'2a3dabbc-db24-40c5-880a-3ef738537520', u'security_groups': [], u'binding:vif_details': {}, u'binding:vif_type': u'unbound', u'mac_address': u'fa:16:3e:b5:fa:de', u'status': u'DOWN', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'172.18.128.0/20', u'gateway_ip': u'172.18.128.1', u'id': u'd43deb2a-6bcd-40b2-b559-36a798e932ba', u'subnetpool_id': None}], u'binding:host_id': u'legacy-network-1', u'device_id': u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1', u'name': u'', u'admin_state_up': True, u'network_id': u'c546009b-207c-44cd-8a4b-3e1e426eb56b', u'dns_name': u'', u'binding:vnic_type': u'normal', u'tenant_id': u'', u'extra_subnets': []}, u'distributed': False, u'_snat_router_interfaces': [], u'routes': [], u'external_gateway_info': {u'network_id': u'c546009b-207c-44cd-8a4b-3e1e426eb56b', u'enable_snat': True, u'external_fixed_ips': [{u'subnet_id': u'd43deb2a-6bcd-40b2-b559-36a798e932ba', u'ip_address': u'172.18.128.101'}]}, u'ha': False, u'id': u'54ffc2c4-123b-460b-bd2f-01ae5277e3e1', u'name': u'router-100'}]} from (pid=19102) put /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:230
42921:2015-09-09 06:56:23.824 ERROR neutron.agent.l3.agent [-] Failed to fetch router information for '54ffc2c4-123b-460b-bd2f-01ae5277e3e1'
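The failures above come from oslo_messaging timing out while getting
router information at line 465 in _process_router_update: each error
is logged 60 seconds after the corresponding "Starting router update"
message. However, the neutron server still shows the status of the
router, which is now effectively unscheduled, as ACTIVE, so the
failure is not visible to anyone looking only at the server.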
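The timeout reading can be checked against the log itself. The sketch
below is only an illustration, not part of neutron: the helper name
fetch_durations, the regular expressions and the abbreviated LOG string
are all assumptions made for this example. It pairs each "Starting
router update" line with the next "Failed to fetch router information"
line for the same router and measures the gap. A gap of about 60
seconds would be consistent with oslo.messaging's rpc_response_timeout
option being left at its default of 60 seconds; the report does not
state how that option was configured.

```python
import re
from datetime import datetime

# Abbreviated copies of the L3 agent log lines quoted in this report
# (color codes and trailing source locations omitted).
LOG = """\
2015-09-09 06:54:23.815 DEBUG neutron.agent.l3.agent [-] Starting router update for 54ffc2c4-123b-460b-bd2f-01ae5277e3e1, action None, priority 0
2015-09-09 06:55:23.818 ERROR neutron.agent.l3.agent [-] Failed to fetch router information for '54ffc2c4-123b-460b-bd2f-01ae5277e3e1'
2015-09-09 06:55:23.821 DEBUG neutron.agent.l3.agent [-] Starting router update for 54ffc2c4-123b-460b-bd2f-01ae5277e3e1, action None, priority 0
2015-09-09 06:56:23.824 ERROR neutron.agent.l3.agent [-] Failed to fetch router information for '54ffc2c4-123b-460b-bd2f-01ae5277e3e1'
"""

# Hypothetical patterns written for this example only.
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
START = re.compile(TS + r" .*Starting router update for ([0-9a-f-]+)")
FAIL = re.compile(TS + r" .*Failed to fetch router information for '([0-9a-f-]+)'")


def parse_ts(text):
    """Parse a timestamp in the format used by the agent log."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def fetch_durations(log_text):
    """Return the seconds between each update start and its failure."""
    pending = {}    # router id -> time the current update started
    durations = []
    for line in log_text.splitlines():
        match = START.search(line)
        if match:
            pending[match.group(2)] = parse_ts(match.group(1))
            continue
        match = FAIL.search(line)
        if match and match.group(2) in pending:
            started = pending.pop(match.group(2))
            failed = parse_ts(match.group(1))
            durations.append((failed - started).total_seconds())
    return durations


durations = fetch_durations(LOG)
for seconds in durations:
    print("fetch failed after %.3f s" % seconds)
```

Run against a full agent log, the same pairing would show whether
every failed fetch sits at the same fixed interval, which is what a
client-side RPC timeout would look like, as opposed to an error
returned by the server.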
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1493945
Title:
Router scheduling at network node fails under scale
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1493945/+subscriptions