yahoo-eng-team team mailing list archive
Message #71659
[Bug 1755602] [NEW] Ironic computes may not be discovered when node count is less than compute count
Public bug reported:
In an ironic deployment being built from day zero, there is an ordering
problem, which generates a race condition for operators. Consider this
common example:
At config time, you create and start three nova-compute services
pointing at your ironic deployment. These three will be HA using the
ironic driver's hash ring functionality. At config time, there are no
ironic nodes present yet, which means running discover_hosts will create
no host mappings.
Next, a single ironic node is added, which is owned by one of the
computes per the hash rules. At this point, you can run discover_hosts
and whatever compute owns that node will get a host mapping. Then you
add a second ironic node, which causes all three nova-computes to
rebalance the hash ring. One or more of the ironic nodes will definitely
land on one of the other nova-computes and will suddenly be unreachable
because there is no host mapping until the next time discover_hosts is
run. Since we track the "mapped" bit on compute nodes, and compute nodes
move between hosts with ironic, we won't even notice that the new owner
nova-compute needs a host mapping. In fact, we won't notice until we get
lucky enough to land a never-mapped ironic node on a nova-compute for
the first time and then run discover_hosts after that point.
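To make the sequence above concrete, here is a minimal toy model of it. This is only a sketch, not nova code: the ComputeNode and Cell classes, the rehome() helper, and the host names are all hypothetical, and node placement is chosen by hand rather than computed by a real hash ring. The one assumption that matters is that discovery only examines compute nodes whose mapped bit is unset.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Toy stand-in for a compute node record (one per ironic node)."""
    name: str
    host: str             # nova-compute service that currently owns the node
    mapped: bool = False  # the "mapped" bit tracked on the compute node


@dataclass
class Cell:
    """Toy cell: compute node records plus the set of mapped hosts."""
    nodes: dict = field(default_factory=dict)
    host_mappings: set = field(default_factory=set)

    def add_node(self, name, host):
        self.nodes[name] = ComputeNode(name, host)

    def rehome(self, name, new_host):
        # A hash ring rebalance moves the node; the mapped bit is left alone.
        self.nodes[name].host = new_host

    def discover_hosts(self):
        # Only compute nodes that were never mapped are examined.
        for node in self.nodes.values():
            if not node.mapped:
                self.host_mappings.add(node.host)
                node.mapped = True

    def reachable(self, name):
        # A node is usable only if its current owner has a host mapping.
        return self.nodes[name].host in self.host_mappings


def simulate():
    cell = Cell()
    cell.discover_hosts()                 # no nodes yet: no mappings
    cell.add_node("node-1", "compute-a")
    cell.discover_hosts()                 # compute-a gets a mapping
    cell.add_node("node-2", "compute-b")  # the ring rebalances ...
    cell.rehome("node-1", "compute-c")    # ... and node-1 changes owner
    cell.discover_hosts()                 # only node-2 is unmapped
    return cell
```

In this model, node-1 stays unreachable even after the final discover_hosts run, because its mapped bit is already set and compute-c never receives a host mapping.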
For an automated config management system, this is a lot of complexity
to handle in order to reliably produce a working system. In many cases
where you're using ironic to bootstrap another deployment (e.g. TripleO),
the number of nodes may be small (fewer than the number of computes) for
quite some time.
There are a couple of obvious options I see:
1. Add a --and-services flag to nova-manage, which will also look for
all nova-compute services in the cell and make sure those have mappings.
This is ideal because we could get all services mapped at config time
without even having to have an ironic node in place yet (which is not
possible today). We can't do this efficiently right away because
nova.services does not have a mapped flag, so every run would have to
check all services, and thus the scheduler periodic task should _not_
include services.
2. We could unset compute_node.mapped any time we re-home an ironic node
to a different nova-compute. This would cause our scheduler periodic to
notice the change and create a host mapping if it happens to move to an
unmapped nova-compute. This generates extra work during normal operating
state and also still leaves us with an interval of time where a
previously-usable ironic node becomes unusable until the host discovery
periodic task runs again.
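As a rough illustration of how the two options differ, here is a small sketch using plain dicts and sets. Nothing here is nova's actual API: the discover_hosts() and rehome() functions, the and_services parameter (named after the proposed flag), and the unset_mapped parameter are hypothetical names made up for this example.

```python
def discover_hosts(nodes, services, host_mappings, and_services=False):
    """Toy discovery pass.

    nodes: list of {"host": str, "mapped": bool} records
    services: names of the nova-compute services in the cell
    host_mappings: set of hosts that already have a mapping (updated in place)
    """
    for node in nodes:
        if not node["mapped"]:
            host_mappings.add(node["host"])
            node["mapped"] = True
    if and_services:
        # Option 1: also map every nova-compute service, nodes or not.
        host_mappings.update(services)
    return host_mappings


def rehome(node, new_host, unset_mapped=False):
    """Move a node to a new owner; option 2 also clears the mapped bit."""
    node["host"] = new_host
    if unset_mapped:
        node["mapped"] = False
```

With and_services=True, a single run at config time maps all three services even when the node list is empty. With unset_mapped=True, a re-homed node is picked up by the next discovery pass, but its new owner still has no mapping in the window before that pass runs.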
IMHO, we should do #1. It's a backportable change, and it's actually a
better workflow for config automation tools than what we have today,
even discounting this race. We can do what we did before: implement it
once in a form suitable for backports, and then add a mapped bit in
master to make it more efficient, which would allow it to be included in
the scheduler periodic task.
** Affects: nova
Importance: Medium
Assignee: Dan Smith (danms)
Status: Confirmed
** Tags: cells
** Changed in: nova
Importance: Undecided => Medium
** Changed in: nova
Status: New => Confirmed
** Bug watch added: Red Hat Bugzilla #1554460
https://bugzilla.redhat.com/show_bug.cgi?id=1554460
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1755602
Title:
Ironic computes may not be discovered when node count is less than
compute count
Status in OpenStack Compute (nova):
Confirmed
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1755602/+subscriptions