yahoo-eng-team team mailing list archive

[Bug 1972023] [NEW] Failed (but retryable) device detaches are logged as ERROR

 

Public bug reported:

At the moment, if a device detach attempt (using libvirt) times out,
nova logs an ERROR message:

https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2570-L2573

However, this is not a final failure: we retry the detach a few more
times, depending on configuration, and only if every attempt fails do
we report a full failure:

https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2504

In high-load environments where this timeout might be hit, this triggers
"ERROR" messages that might seem problematic to the operator. However,
when a follow-up attempt succeeds, there's no need for attention. The
per-attempt message should be logged as a WARNING, since the operator
only needs to intervene when the ERROR for a full failure to detach the
device is logged.
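
The proposed behavior can be illustrated with a minimal sketch. This is
not nova's actual code: `DetachTimeout`, `detach_with_retry`,
`detach_once` and `max_attempts` are hypothetical names chosen for the
example, and the real driver is structured differently. It only shows
the idea of logging a retryable timeout as WARNING and reserving ERROR
for the case where every attempt has failed.

```python
import logging

LOG = logging.getLogger(__name__)


class DetachTimeout(Exception):
    """Hypothetical exception: a single detach attempt timed out."""


def detach_with_retry(detach_once, max_attempts):
    """Call detach_once() up to max_attempts times.

    detach_once is a hypothetical callable that raises DetachTimeout
    when one attempt times out. Returns True on success, False when
    every attempt timed out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            detach_once()
            return True
        except DetachTimeout:
            if attempt < max_attempts:
                # Retryable: another attempt follows, so the operator
                # does not need to act on this message.
                LOG.warning("Detach attempt %d of %d timed out, retrying",
                            attempt, max_attempts)
    # All attempts exhausted: this is the full failure that needs
    # operator attention.
    LOG.error("Failed to detach device after %d attempts", max_attempts)
    return False
```

With this shape, a detach that succeeds on a later attempt leaves only
WARNING lines in the log, and an ERROR line appears only when the
operator actually has something to investigate.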

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1972023

Title:
  Failed (but retryable) device detaches are logged as ERROR

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1972023/+subscriptions


