yahoo-eng-team team mailing list archive
Message #06920
[Bug 1258625] [NEW] Need some kind of 'auto' boolean column in the Service table
Public bug reported:
Bug 1250049 reported a problem with automatically disabling/enabling a
host via the libvirt driver. The right fix would have been to add a new
column to the Service table indicating whether an admin intentionally
disabled the host or nova detected a failure and disabled it
automatically. Instead, a hack was added that prefixes the
'disabled_reason' with "AUTO:" and builds some logic in the driver
around that.

The problem with that approach is that the ComputeFilter in the
scheduler can't perform any kind of retry logic around it if needed
(see bug 1257644).

Right now, if the ComputeFilter encounters a disabled host, it just logs
it at debug level and skips it. If the host was automatically disabled
because of a connection failure, we should at least log that as a
warning in the scheduler (like we do now for hosts that haven't checked
in for a while), or possibly build some retry logic around that to make
it more robust in case the connection failure is just a hiccup that
quickly resolves itself.
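
To make the proposal concrete, here is a rough sketch of how a filter
could tell the two cases apart if the Service table had a boolean
column. This is not nova's actual ComputeFilter or model: the `Service`
dataclass, the `auto_disabled` field and the `host_passes` function are
all hypothetical names made up for illustration.

```python
import logging
from dataclasses import dataclass
from typing import Optional

LOG = logging.getLogger(__name__)


@dataclass
class Service:
    """Stand-in for a Service table row (not nova's real model)."""
    host: str
    disabled: bool = False
    disabled_reason: Optional[str] = None
    # Hypothetical new column: True when nova disabled the host itself,
    # False when an admin did it on purpose.
    auto_disabled: bool = False


def host_passes(service: Service) -> bool:
    """Return True if the host may be considered for scheduling."""
    if not service.disabled:
        return True
    if service.auto_disabled:
        # Automatic disable: worth a warning, and a natural place to
        # hook in retry logic later.
        LOG.warning("Host %s was automatically disabled: %s",
                    service.host, service.disabled_reason)
    else:
        # Admin disable: expected, so debug level is enough.
        LOG.debug("Host %s is disabled by an admin: %s",
                  service.host, service.disabled_reason)
    return False
```

With a real column the filter no longer has to parse the "AUTO:" prefix
out of 'disabled_reason' to find out why the host is disabled.
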
One could maybe argue that some kind of connection retry logic could be
built into the libvirt driver instead; I wouldn't be against that.
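
For the driver-side alternative, a minimal sketch of what connection
retry could look like. Everything here is an assumption for
illustration: `connect_with_retry` and its parameters are hypothetical,
`connect` stands in for whatever opens the hypervisor connection, and
the built-in ConnectionError is used as a placeholder for the error the
real driver would see.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T],
                       attempts: int = 3,
                       delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() up to `attempts` times, doubling the delay each time.

    Re-raises the last ConnectionError if every attempt fails, at which
    point the caller could go ahead and disable the host.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return connect()
        except ConnectionError:
            if attempt >= attempts:
                raise  # not a hiccup; let the caller handle it
            sleep(delay)
            delay *= 2
            attempt += 1
```

The idea is that the host would only be disabled after the retries are
exhausted, so a short hiccup never reaches the scheduler at all.
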
** Affects: nova
Importance: Undecided
Status: New
** Tags: db libvirt scheduler
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1258625
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1258625/+subscriptions