yahoo-eng-team team mailing list archive

[Bug 1837955] Re: MaxRetriesExceeded sometime fails with messaging exception

 

Reviewed:  https://review.opendev.org/672855
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b98d4ba6d54f5ca2999a8fe6b6d7dfcc134061df
Submitter: Zuul
Branch:    master

commit b98d4ba6d54f5ca2999a8fe6b6d7dfcc134061df
Author: Erik Olof Gunnar Andersson <eandersson@xxxxxxxxxxxx>
Date:   Thu Jul 25 20:19:40 2019 -0700

    Cleanup when hitting MaxRetriesExceeded from no host_available
    
    Prior to this patch, there was a case where host_available
    was false and an exception was raised without first
    cleaning up the instance. This caused instances to get
    stuck indefinitely in a scheduling state.
    
    This patch fixes this by calling the cleanup function
    and then exiting build_instances with a return statement.
    
    The related functional regression recreate test is updated
    to show this fixes the bug.
    
    Change-Id: I6a2c63a4c33e783100208fd3f45eb52aad49e3d6
    Closes-bug: #1837955
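
The control-flow change the commit message describes can be illustrated with
a minimal sketch. This is not nova's code: the Instance class, the
cleanup_instance helper, and the two build_instances_* functions are
hypothetical stand-ins, assumed here only to contrast "raise before cleanup"
with "clean up, then return".

```python
class MaxRetriesExceeded(Exception):
    """Stand-in for the exception named in the commit; not nova's class."""


class Instance:
    """Hypothetical instance record with only the fields the sketch needs."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.vm_state = "building"
        self.task_state = "scheduling"
        self.fault = None


def cleanup_instance(instance, reason):
    # Hypothetical cleanup helper: record the failure so the instance
    # does not stay in the scheduling state.
    instance.vm_state = "error"
    instance.task_state = None
    instance.fault = reason


def build_instances_before(instance, available_hosts):
    # Pre-patch shape: the exception escapes before any cleanup runs,
    # so the instance keeps task_state == "scheduling".
    if not available_hosts:
        raise MaxRetriesExceeded("Exhausted all hosts available")
    return available_hosts[0]


def build_instances_after(instance, available_hosts):
    # Post-patch shape: clean up first, then leave via a return statement.
    if not available_hosts:
        cleanup_instance(instance, "Exhausted all hosts available")
        return None
    return available_hosts[0]
```

In the first variant the caller sees the exception and the instance is left
untouched; in the second the instance is marked as failed before the function
returns.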


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1837955

Title:
  MaxRetriesExceeded sometime fails with messaging exception

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  We are occasionally seeing MaxRetriesExceeded cause an "Exception
  during message handling" error. This prevents the instance from
  being set to the error state in the database and leaves it stuck
  in the scheduling state.

  Example logs:
  WARNING nova.scheduler.client.report [req-] Unable to submit allocation for instance x (409 {"errors": [{"status": 409, "request_id": "req-", "code": "placement.undefined_code", "detail": "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'DISK_GB' on resource provider 'req-'. The requested amount would exceed the capacity.  ", "title": "Conflict"}]})
  ERROR oslo_messaging.rpc.server [req-] Exception during message handling: MaxRetriesExceeded: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance x.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1837955/+subscriptions

