yahoo-eng-team team mailing list archive

[Bug 1852458] Re: "create" instance action not created when instance is buried in cell0

 

Reviewed:  https://review.opendev.org/694165
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f2608c91175411ec7c2604035adb39306d7e607e
Submitter: Zuul
Branch:    master

commit f2608c91175411ec7c2604035adb39306d7e607e
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Wed Nov 13 15:03:27 2019 -0500

    Create instance action when burying in cell0
    
    Change I8742071b55f018f864f5a382de20075a5b444a79 in Ocata
    moved the creation of the instance record from the API to
    conductor. As a result, the "create" instance action was
    only being created in conductor when the instance is created
    in a non-cell0 database. This is a regression: before that
    change, when a server create failed during scheduling, you
    could still list instance actions for the server and see the
    "create" action, but that was lost once we started burying
    those instances in cell0.
    
    This fixes the bug by creating the "create" action in the cell0
    database when burying an instance there. It goes a step further
    and also creates and finishes an event so the overall action
    message shows up as "Error" with the details about where the
    failure occurred in the event traceback.
    
    A short release note is added since a new action event is
    added here (conductor_schedule_and_build_instances) rather than
    re-using some kind of event that we could generate from the
    compute service, e.g. compute__do_build_and_run_instance.
    
    Change-Id: I1e9431e739adfbcfc1ca34b87e826a516a4b18e2
    Closes-Bug: #1852458


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852458

Title:
  "create" instance action not created when instance is buried in cell0

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged
Status in OpenStack Compute (nova) rocky series:
  Triaged
Status in OpenStack Compute (nova) stein series:
  Triaged
Status in OpenStack Compute (nova) train series:
  Triaged

Bug description:
  Before cell0 was introduced the API would create the "create" instance
  action for each instance in the nova cell database before casting off
  to conductor to do scheduling:

  https://github.com/openstack/nova/blob/mitaka-eol/nova/compute/api.py#L1180

  Note that conductor failed to "complete" the action with a failure
  event:

  https://github.com/openstack/nova/blob/mitaka-eol/nova/conductor/manager.py#L374

  But at least the action was created.

  Since then, with cell0, if scheduling fails the instance is buried in
  the cell0 database but no instance action is created. To illustrate, I
  disabled the single nova-compute service on my devstack host and
  created a server which failed with NoValidHost:

  $ openstack server show build-fail1 -f value -c fault
  {u'message': u'No valid host was found. ', u'code': 500, u'created': u'2019-11-13T15:57:13Z'}

  When listing instance actions I expected to see a "create" action but
  there were none:

  $ nova instance-action-list 008a7d52-dd83-4f52-a720-b3cfcc498259
  +--------+------------+---------+------------+------------+
  | Action | Request_ID | Message | Start_Time | Updated_At |
  +--------+------------+---------+------------+------------+
  +--------+------------+---------+------------+------------+

  This is because the "create" action is only created when the instance
  is scheduled to a specific cell:

  https://github.com/openstack/nova/blob/20.0.0/nova/conductor/manager.py#L1460

  Solution:

  The ComputeTaskManager._bury_in_cell0 method should also create a
  "create" action in cell0 like it does for the instance BDMs and tags.

  This goes back to Ocata: https://review.opendev.org/#/c/319379/
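
  As a rough illustration of the proposed behavior, here is a toy,
  self-contained Python model. It is not nova code: CellDatabase,
  InstanceAction, ActionEvent, NoValidHost and bury_in_cell0 are
  hypothetical stand-ins invented for this sketch, and only the event
  name conductor_schedule_and_build_instances is taken from the commit
  message. It shows the idea of recording a "create" action, plus a
  finished error event, at the moment an instance is buried in cell0.

```python
"""Toy model of the fix. None of these classes are nova's real objects."""
import traceback
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class NoValidHost(Exception):
    """Hypothetical stand-in for a scheduling failure."""


@dataclass
class ActionEvent:
    name: str
    result: Optional[str] = None
    traceback: Optional[str] = None


@dataclass
class InstanceAction:
    action: str
    instance_uuid: str
    events: List[ActionEvent] = field(default_factory=list)

    @property
    def message(self) -> Optional[str]:
        # The action reads as "Error" if any of its events failed.
        failed = any(e.result == "Error" for e in self.events)
        return "Error" if failed else None


@dataclass
class CellDatabase:
    name: str
    instances: List[str] = field(default_factory=list)
    actions: Dict[str, List[InstanceAction]] = field(default_factory=dict)


def bury_in_cell0(cell0: CellDatabase, instance_uuid: str,
                  exc: BaseException) -> InstanceAction:
    """Record the instance in cell0 together with a failed "create" action."""
    cell0.instances.append(instance_uuid)

    # Create the "create" action alongside the buried instance.
    action = InstanceAction("create", instance_uuid)

    # Start and immediately finish an event carrying the failure details.
    event = ActionEvent("conductor_schedule_and_build_instances")
    event.result = "Error"
    event.traceback = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    action.events.append(event)

    cell0.actions.setdefault(instance_uuid, []).append(action)
    return action


if __name__ == "__main__":
    cell0 = CellDatabase("cell0")
    try:
        raise NoValidHost("No valid host was found.")
    except NoValidHost as err:
        act = bury_in_cell0(cell0, "008a7d52-dd83-4f52-a720-b3cfcc498259", err)
    print(act.action, act.message, act.events[0].name)
```

  In this model, listing the actions for the buried instance returns
  one "create" entry instead of an empty table, and the event holds
  the traceback of the scheduling failure.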

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1852458/+subscriptions

