← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1815799] Re: Attempting to delete a baremetal server places server into ERROR state

 

Reviewed:  https://review.opendev.org/659809
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=060e42b7fd185bf3b827b7aa5c9ab6131f6a21d7
Submitter: Zuul
Branch:    master

commit 060e42b7fd185bf3b827b7aa5c9ab6131f6a21d7
Author: Mark Goddard <mark@xxxxxxxxxxxx>
Date:   Fri May 17 14:58:59 2019 +0100

    [ironic] Don't remove instance info twice in destroy
    
    During teardown, at the end of the ironic virt driver's destroy method,
    we call _cleanup_deploy, which since
    https://review.opendev.org/#/c/563722/ removes instance_info from the
    ironic node. Given that we're destroying the node, the instance_info
    will have been cleared from the node anyway, so we don't need to do
    this.
    
    Further, it can cause tear down to fail if automated cleaning is
    enabled in ironic. This happens because ironic prevents updates to nodes
    that are undergoing a state transition, as is the case during cleaning.
    The virt driver will retry this request by default every 2 seconds with
    60 attempts. Typically this is not long enough for cleaning to complete,
    so tear down fails with the following error:
    
      Conflict: Node a00696d5-32ba-475e-9528-59bf11cffea6 can not be updated
      while a state transition is in progress. (HTTP 409)
    
    This change skips the instance info update in _cleanup_deploy in the
    case of tear down.
    
    Change-Id: Iea337f73c231db2cb9d9f639b92475daaede6793
    Closes-Bug: #1815799


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1815799

Title:
  Attempting to delete a baremetal server places server into ERROR state

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  When deleting a baremetal server, we see:

    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a] Traceback (most recent call last):
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2698, in do_terminate_instance
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     self._delete_instance(context, instance, bdms)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 154, in inner
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     rv = f(*args, **kwargs)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2635, in _delete_instance
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     self._shutdown_instance(context, instance, bdms)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2527, in _shutdown_instance
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     requested_networks)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     self.force_reraise()
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     six.reraise(self.type_, self.value, self.tb)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2514, in _shutdown_instance
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     block_device_info)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1245, in destroy
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     self._cleanup_deploy(node, instance, network_info)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 461, in _cleanup_deploy
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     self._remove_instance_info_from_node(node, instance)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 396, in _remove_instance_info_from_node
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     self.ironicclient.call('node.update', node.uuid, patch)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/client_wrapper.py", line 170, in call
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     return self._multi_getattr(client, method)(*args, **kwargs)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 360, in update
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     params=params)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/ironicclient/common/base.py", line 232, in _update
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     resp, body = self.api.json_request(method, url, **kwargs)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 678, in json_request
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     resp = self._http_request(url, method, **kwargs)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 287, in wrapper
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     return func(self, url, method, **kwargs)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 660, in _http_request
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a]     error_json.get('debuginfo'), method, url)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a] Conflict: Node a00696d5-32ba-475e-9528-59bf11cffea6 can not be updated while a state transition is in progress. (HTTP 409)
    2019-02-13 12:58:16.856 7 ERROR nova.compute.manager [instance: dcb4f055-cda4-4d61-ba8f-976645c4e92a] 

  This places the server into an ERROR state:

    $ openstack server list
    +--------------------------------------+------------+--------+----------------------------+---------+------------+
    | ID                                   | Name       | Status | Networks                   | Image   | Flavor     |
    +--------------------------------------+------------+--------+----------------------------+---------+------------+
    | dcb4f055-cda4-4d61-ba8f-976645c4e92a | larstest2  | ERROR  |                            |         | baremetal  |
    +--------------------------------------+------------+--------+----------------------------+---------+------------+

  A subsequent "openstack server delete" will successfully delete the
  server.

  On the ironic side, we see the following message logged several times
  before Nova reports the conflict:

    2019-02-13 12:58:15.138 21 DEBUG wsme.api [req-b6f0a1a3-e2cc-
  4b13-a979-3676757b0972 7a9b1ac45e084e7cbeb9cb740ffe8d08
  41ea8af8d00e46438c7be3b182bbb53f - default default] Client-side error:
  Node a00696d5-32ba-475e-9528-59bf11cffea6 can not be updated while a
  state transition is in progress. format_exception /usr/lib/python2.7
  /site-packages/wsme/api.py:222

  This is with nova @ ad842aa and ironic @ 4404292.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1815799/+subscriptions


References