
yahoo-eng-team team mailing list archive

[Bug 1798688] Re: AllocationUpdateFailed_Remote: Failed to update allocations for consumer. Error: another process changed the consumer after the report client read the consumer state during the claim

 

Reviewed:  https://review.openstack.org/623596
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6369f39244558b147f7b0796269d9a86ce9b12d8
Submitter: Zuul
Branch:    master

commit 6369f39244558b147f7b0796269d9a86ce9b12d8
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Fri Dec 7 17:27:16 2018 -0500

    Remove allocations before setting vm_status to SHELVED_OFFLOADED
    
    Tempest is intermittently failing a test which does the
    following:
    
    1. Create a server.
    2. Shelve offload it.
    3. Unshelve it.
    
    Tempest waits for the server status to be SHELVED_OFFLOADED
    before unshelving the server, which goes through the
    scheduler to pick a compute node and claim resources on it.
    
    When shelve offloading a server, the resource allocations
    for the instance and compute node it was on are cleared, which
    will also delete the internal consumer record in the placement
    service.
    
    The race is that the allocations are removed during shelve
    offload *after* the server status changes to SHELVED_OFFLOADED.
    This leaves a window where unshelve is going through the
    scheduler and gets the existing allocations for the instance,
    which are non-empty and have a consumer generation. The
    claim_resources method in the scheduler then uses that
    consumer generation when PUTing the allocations. That PUT
    fails because in between the GET and PUT of the allocations,
    placement has deleted the internal consumer record. When
    PUTing the new allocations with a non-null consumer generation,
    placement returns a 409 conflict error because for a new
    consumer it expects the "consumer_generation" parameter to be
    None.
    
    This change handles the race by simply making sure the allocations
    are deleted (along with the related consumer record in placement)
    *before* the instance.vm_status is changed.
    
    Change-Id: I2a6ccaff904c1f0759d55feeeef0ec1da32c65df
    Closes-Bug: #1798688
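
The race described in the commit message can be illustrated with a small
in-memory model. This is only a sketch, not nova or placement code:
ToyPlacement, Conflict, claim, demo_old_order and demo_new_order are
hypothetical names made up for this note, and the generation numbering is an
assumption. The one rule taken from the commit message is that a PUT for a
consumer that placement no longer knows about must carry a consumer generation
of None, otherwise it is rejected with a 409.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response (hypothetical, not a nova class)."""


class ToyPlacement:
    """In-memory stand-in for the allocation/consumer bookkeeping."""

    def __init__(self):
        # consumer_id -> {"generation": int, "allocations": dict}
        self._consumers = {}

    def get_allocations(self, consumer_id):
        rec = self._consumers.get(consumer_id)
        if rec is None:
            # Unknown consumer: empty allocations, no generation.
            return {"allocations": {}, "consumer_generation": None}
        return {"allocations": dict(rec["allocations"]),
                "consumer_generation": rec["generation"]}

    def put_allocations(self, consumer_id, allocations, consumer_generation):
        rec = self._consumers.get(consumer_id)
        expected = None if rec is None else rec["generation"]
        if consumer_generation != expected:
            raise Conflict("consumer generation mismatch")
        # Assumption: generations simply count up from 1 in this toy model.
        new_generation = 1 if rec is None else rec["generation"] + 1
        self._consumers[consumer_id] = {"generation": new_generation,
                                        "allocations": dict(allocations)}

    def delete_allocations(self, consumer_id):
        # Removing the allocations also drops the consumer record.
        self._consumers.pop(consumer_id, None)


def claim(placement, consumer_id, between_get_and_put):
    """GET the current state, then PUT new allocations with that generation."""
    current = placement.get_allocations(consumer_id)
    between_get_and_put()  # whatever another process does in the window
    placement.put_allocations(consumer_id, {"VCPU": 1},
                              current["consumer_generation"])


def demo_old_order():
    """Status is visible first; the delete lands between the GET and the PUT."""
    placement = ToyPlacement()
    placement.put_allocations("server-1", {"VCPU": 1}, None)
    try:
        claim(placement, "server-1",
              lambda: placement.delete_allocations("server-1"))
    except Conflict:
        return "409"
    return "ok"


def demo_new_order():
    """Allocations are deleted before the status becomes visible."""
    placement = ToyPlacement()
    placement.put_allocations("server-1", {"VCPU": 1}, None)
    placement.delete_allocations("server-1")
    claim(placement, "server-1", lambda: None)
    return "ok"
```

In this model demo_old_order() hits the conflict because the claim reuses a
generation for a consumer that has since been deleted, while demo_new_order()
succeeds because the GET already sees no consumer and the PUT carries None.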


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1798688

Title:
  AllocationUpdateFailed_Remote: Failed to update allocations for
  consumer. Error: another process changed the consumer after the report
  client read the consumer state during the claim

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  * Job: neutron-tempest-iptables_hybrid
  * Failed test: tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_shelve_shelved_server
  * Example of failure: http://logs.openstack.org/80/610280/8/check/neutron-tempest-iptables_hybrid/caa373a/job-output.txt.gz

  Details: (ServersNegativeTestJSON:tearDown) Server
  7e7cf40f-0ab7-4f22-91ce-6f4e22a54ac2 failed to reach ACTIVE status and
  task state "None" within the required time (196 s). Current status:
  SHELVED_OFFLOADED. Current task state: None.

  * Logstash:
  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22failed%20to%20reach%20ACTIVE%20status%20and%20task%20state%20%5C%5C%5C%22None%5C%5C%5C%22%20within%20the%20required%20time%5C%22

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1798688/+subscriptions

