yahoo-eng-team team mailing list archive
Message #77397
[Bug 1817956] Re: Metadata not reachable when dvr_snat L3 agent is used on compute node
Reviewed: https://review.openstack.org/639979
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6ae228cc2e75504d9a8f35e3480a66707f9d7246
Submitter: Zuul
Branch: master
commit 6ae228cc2e75504d9a8f35e3480a66707f9d7246
Author: Slawek Kaplonski <skaplons@xxxxxxxxxx>
Date: Thu Feb 28 11:35:07 2019 +0100
Spawn metadata proxy on dvr ha standby routers
When the L3 agent runs in dvr_snat mode on a compute node (as it does,
for example, in some of the gate jobs), it may happen that a router is
scheduled in standby mode on a compute node that also hosts an instance
connected to that router.
In such a case the metadata proxy needs to be spawned in the router
namespace even though the router is in standby mode.
Change-Id: Id646ab2c184c7a1d5ac38286a0162dd37d72df6e
Closes-Bug: #1817956
Closes-Bug: #1606741
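As a rough illustration of the idea (this is not the actual neutron code;
HaRouter, ha_state and spawn_metadata_proxy below are hypothetical
stand-ins for the L3 agent internals), the change amounts to no longer
gating the metadata proxy on the router's HA state:

    # Illustrative sketch only; the names are stand-ins, not neutron's API.
    class HaRouter:
        def __init__(self, ha_state):
            self.ha_state = ha_state  # 'active' or 'standby'

        def spawn_metadata_proxy(self):
            print('haproxy metadata proxy started in the router namespace')

        def process(self):
            # Before the fix (conceptually): only the active replica got a
            # proxy, so a standby dvr_snat router on a compute node that also
            # hosts VMs left those VMs without a reachable 169.254.169.254.
            #
            #     if self.ha_state == 'active':
            #         self.spawn_metadata_proxy()
            #
            # After the fix: spawn the proxy regardless of HA state.
            self.spawn_metadata_proxy()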
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1817956
Title:
Metadata not reachable when dvr_snat L3 agent is used on compute node
Status in neutron:
Fix Released
Bug description:
When L3 agents are deployed on compute nodes in dvr_snat agent mode
(as is done, for example, in CI jobs) and DVR HA is used, metadata may
not be reachable from instances.
For example, the neutron-tempest-dvr-ha-multinode-full job has the
following layout (see the l3_agent.ini sketch after this list):
- controller (all in one) with the L3 agent in dvr mode,
- compute-1 with the L3 agent in dvr_snat mode,
- compute-2 with the L3 agent in dvr_snat mode.
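For reference, the agent mode on each node is set with the agent_mode
option in the [DEFAULT] section of l3_agent.ini; a minimal sketch of what
the layout above corresponds to (file path assumed from a standard
devstack/neutron install):

    # /etc/neutron/l3_agent.ini on the controller
    [DEFAULT]
    agent_mode = dvr

    # /etc/neutron/l3_agent.ini on compute-1 and compute-2
    [DEFAULT]
    agent_mode = dvr_snat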
Now, if a VM is scheduled e.g. on host compute-2 and is connected to a
dvr+ha router which is scheduled to be active on compute-1 and standby
on compute-2, then on compute-2 the metadata haproxy will not be spawned
and the VM will not be able to reach the metadata IP.
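One way to confirm this on the standby node (commands assumed to be run
as root on compute-2; <router-uuid> is a placeholder for the affected
router's ID):

    # list the router namespaces present on this node
    ip netns list | grep qrouter
    # check whether a haproxy metadata proxy is running for the router;
    # before the fix it is missing on the standby dvr_snat node
    ps -ef | grep haproxy | grep <router-uuid>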
I found this while trying to migrate the existing legacy
neutron-tempest-dvr-ha-multinode-full job to Zuul v3. It turned out that the
legacy job is in fact a "non-HA" job because the "l3_ha" option is set to
False there, so routers are created as non-HA DVR routers.
When I switched the job to dvr+ha in https://review.openstack.org/#/c/633979/
I spotted the error described above.
Example of failed tests:
http://logs.openstack.org/79/633979/16/check/neutron-tempest-dvr-ha-multinode-full/710fb3d/job-output.txt.gz
- all the VMs that could not be reached over SSH failed to reach the
metadata IP.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1817956/+subscriptions