yahoo-eng-team team mailing list archive

[Bug 1972023] Re: Failed (but retryable) device detaches are logged as ERROR

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/840985
Committed: https://opendev.org/openstack/nova/commit/7c87c2f5f744a86d4d854e47848b903ab2674795
Submitter: "Zuul (22348)"
Branch:    master

commit 7c87c2f5f744a86d4d854e47848b903ab2674795
Author: Mohammed Naser <mnaser@xxxxxxxxxxxx>
Date:   Fri May 6 16:27:11 2022 -0400

    Switch libvirt event timeout message to warning
    
    At the moment, if libvirt times out in detaching a device, it
    reports this as an ERROR even if the process will be retried
    and eventually succeed.
    
    We should just log a warning since there's nothing to do, and
    if the process fails after all the retries, it will log an ERROR
    anyways.
    
    Closes-Bug: #1972023
    Change-Id: Idda12db5758706a97b7841571b9ecd3dc6e6905e


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1972023

Title:
  Failed (but retryable) device detaches are logged as ERROR

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  At the moment, if a device detach times out (using libvirt), nova
  logs an ERROR message:

  https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2570-L2573

  However, this is not a failure: the detach is retried a few more
  times, depending on configuration, and only if every attempt fails is
  the full failure reported:

  https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2504

  In high-load environments where this timeout might be hit, this
  triggers "ERROR" messages that might look problematic to the operator.
  Since the follow-up attempt succeeds, no attention is needed.  This
  message should be logged as a WARNING: the operator only needs to
  intervene if the final ERROR is logged, meaning the detach has failed
  completely.
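
  The requested behaviour can be sketched roughly as follows. This is a
  minimal illustration, not nova's actual code: detach_with_retry,
  DetachTimeout, DetachFailed and max_attempts are hypothetical names
  chosen for the example.

```python
import logging

LOG = logging.getLogger(__name__)


class DetachTimeout(Exception):
    """Hypothetical: a single detach attempt timed out."""


class DetachFailed(Exception):
    """Hypothetical: every detach attempt timed out."""


def detach_with_retry(detach_once, max_attempts=3):
    """Call detach_once() until it succeeds or the attempts run out.

    A timed-out attempt is logged as a WARNING, since it is either
    retried or followed by the final ERROR. Only exhausting all
    attempts is logged as an ERROR.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            detach_once()
            return attempt
        except DetachTimeout:
            LOG.warning("Device detach attempt %d of %d timed out",
                        attempt, max_attempts)
    LOG.error("Device detach failed after %d attempts", max_attempts)
    raise DetachFailed(f"gave up after {max_attempts} attempts")
```

  With this shape, an operator filtering logs at ERROR level sees one
  entry per detach that failed completely and none for detaches that
  succeeded on a later attempt.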

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1972023/+subscriptions


