yahoo-eng-team team mailing list archive
Message #55215
[Bug 1614337] [NEW] L3 agent fails on FIP when DVR and HA both enabled in router
Public bug reported:
I have a VLAN-based Neutron configuration. My tenant networks are
VLANs, and my shared external network (br-ex) is a flat network.
Neutron is configured for DVR+SNAT mode. In testing floating IPs, I've
run into issues with my Neutron router, and I've traced it back to a
single scenario: when the router is both distributed AND HA. To be
clear, I've tested all four possibilities:
"--distributed False --ha False" == works
"--distributed True --ha False" == works
"--distributed False --ha True" == works
"--distributed True --ha True" == fails
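For anyone repeating the test, the four combinations can be enumerated rather than typed by hand. This is only a sketch: it prints `neutron router-create` command lines using the `--distributed` and `--ha` flags quoted above, and the router name `test-router` is a placeholder assumed for illustration, not something from this report.

```python
from itertools import product

ROUTER_NAME = "test-router"  # hypothetical name, not taken from the report


def router_create_commands(name=ROUTER_NAME):
    """Return one `neutron router-create` command per distributed/ha combination."""
    commands = []
    for distributed, ha in product((False, True), repeat=2):
        commands.append(
            "neutron router-create --distributed {} --ha {} {}".format(
                distributed, ha, name
            )
        )
    return commands


if __name__ == "__main__":
    # Print the commands only; running them requires admin credentials.
    for command in router_create_commands():
        print(command)
```

The script deliberately prints the commands instead of executing them, so each router can be created, tested, and deleted one at a time.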
* I can reproduce this consistently: delete the existing router (which
means first clearing its gateway and removing any associated ports),
re-create it in any of the four configurations above, boot some VMs,
associate a FIP with one of them, and attempt to reach the FIP. Only
the distributed+HA router fails. Results are the same whether I create
the router on the CLI or from within Horizon.
* Expected result is that I can associate a floating IP with a running
VM and then ping that floating IP (and ultimately use it for other
kinds of activity, such as SSH access to the VM).
* Actual result is that the floating IP is completely unreachable from other valid IPs within the same L2 space. Additionally, /var/log/neutron/l3-agent.log on the compute node hosting the VM whose associated FIP I can't reach contains this:
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '13356ddb-8e36-4f54-b8b2-6a62a5aecf86'
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 501, in _process_router_update
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 440, in _process_router_if_compatible
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self._process_updated_router(router)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 454, in _process_updated_router
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent ri.process(self)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_edge_ha_router.py", line 92, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent super(DvrEdgeHaRouter, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 488, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent super(DvrLocalRouter, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_router_base.py", line 30, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent super(DvrRouterBase, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 386, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent super(HaRouter, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 385, in call
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self.logger(e)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self.force_reraise()
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 382, in call
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent return func(*args, **kwargs)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 961, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self._process_internal_ports(agent.pd)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 478, in _process_internal_ports
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self.internal_network_added(p)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_edge_ha_router.py", line 58, in internal_network_added
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent dvr_snat_ns.SNAT_INT_DEV_PREFIX)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 280, in _plug_ha_router_port
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent self._disable_ipv6_addressing_on_interface(interface_name)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 239, in _disable_ipv6_addressing_on_interface
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent if self._should_delete_ipv6_lladdr(ipv6_lladdr):
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 221, in _should_delete_ipv6_lladdr
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent if manager.get_process().active:
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'get_process'
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent
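The last frame of the traceback evaluates `manager.get_process().active` while `manager` is `None`. The sketch below reproduces only that failure mode in isolation. `FakeProcess`, `FakeManager`, and both helper functions are hypothetical stand-ins written for illustration; this is not Neutron's code, and the guarded variant is not claimed to be the upstream fix.

```python
class FakeProcess(object):
    """Hypothetical stand-in for a monitored process with an `active` flag."""

    def __init__(self, active):
        self.active = active


class FakeManager(object):
    """Hypothetical stand-in for whatever object `manager` normally is."""

    def __init__(self, active):
        self._process = FakeProcess(active)

    def get_process(self):
        return self._process


def process_is_active_unguarded(manager):
    # Same expression as the failing line in the traceback.
    # Raises AttributeError when manager is None.
    return manager.get_process().active


def process_is_active_guarded(manager):
    # Assumption for illustration: treat a missing manager as "not active".
    if manager is None:
        return False
    return manager.get_process().active


if __name__ == "__main__":
    print(process_is_active_guarded(None))
    try:
        process_is_active_unguarded(None)
    except AttributeError as exc:
        print(exc)
```

Whether "not active" is the right answer when the manager is missing depends on why it is `None` for a distributed+HA router on this node, which the log alone does not show.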
* Version
** CentOS 7.2
** Kernel 3.10.0-327.18.2.el7.x86_64
** Mitaka from RDO rpms, puppet managed
** Neutron RPMS:
openstack-neutron-8.1.2-1.el7.noarch
openstack-neutron-common-8.1.2-1.el7.noarch
openstack-neutron-fwaas-8.0.0-3.el7.noarch
openstack-neutron-ml2-8.1.2-1.el7.noarch
openstack-neutron-openvswitch-8.1.2-1.el7.noarch
python-neutron-8.1.2-1.el7.noarch
python-neutronclient-4.1.1-2.el7.noarch
python-neutron-fwaas-8.0.0-3.el7.noarch
python-neutron-lib-0.0.2-1.el7.noarch
* Environment
** 1 controller (running neutron-server, but no other neutron components)
** 2 dedicated network nodes for neutron agents
** N compute nodes running neutron l3-agent because of dvr_snat mode
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1614337
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1614337/+subscriptions