[Bug 1893436] [NEW] Error updating resources on new hypervisor with reserve_disk_resource_for_image_cache enabled

 

Public bug reported:

Description
===========

It is impossible to launch instances on freshly deployed hypervisors
with the following configuration:

[workarounds]
reserve_disk_resource_for_image_cache = True

The nova-compute logs show failures to update resources because the
/var/lib/nova/instances/_base directory is missing.

A workaround is to:

* disable this option
* launch an instance
* enable the option again

I assume this is because _base is lazily created on the first instance
launch.
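
The failure mode can be reproduced outside Nova. The following is a minimal sketch, not Nova's code: on_same_device and reproduce are hypothetical names, and temporary directories stand in for /var/lib/nova/instances and its _base subdirectory. It only shows that calling os.stat on a cache directory that has not been created yet raises the same FileNotFoundError seen in the logs.

```python
import os
import tempfile


def on_same_device(instances_dir, cache_dir):
    """Compare the device IDs of two paths using os.stat, as in the traceback."""
    return os.stat(instances_dir).st_dev == os.stat(cache_dir).st_dev


def reproduce():
    """Return the exception raised when the cache directory does not exist."""
    with tempfile.TemporaryDirectory() as instances_dir:
        # Stand-in for the _base directory; deliberately never created.
        cache_dir = os.path.join(instances_dir, "_base")
        try:
            on_same_device(instances_dir, cache_dir)
        except FileNotFoundError as exc:
            return exc
    return None


if __name__ == "__main__":
    print(repr(reproduce()))
```

Once the directory exists, the same comparison succeeds, which is consistent with the workaround of launching one instance before enabling the option.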

Steps to reproduce
==================

* Deploy a new Nova hypervisor with [workarounds]/reserve_disk_resource_for_image_cache = True
* Start the nova-compute service
* Try launching instances on the hypervisor

Expected result
===============

I would expect instances to launch successfully.

Actual result
=============

* Instances fail to launch on new hypervisors
* An error and stack trace are logged every minute

Environment
===========

OpenStack Train / Kolla binary images using the following RPMs:

openstack-nova-compute-20.3.0-1.el8.noarch
openstack-nova-common-20.3.0-1.el8.noarch
python3-nova-20.3.0-1.el8.noarch
python3-novaclient-15.1.1-1.el8.noarch

Logs & Configs
==============

2020-08-28 14:58:39.649 6 ERROR nova.compute.manager [req-34bb1ad0-0ffb-4215-983c-487349a37f58 - - - - -] Error updating resources for node node-foo.: FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/nova/instances/_base'
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager Traceback (most recent call last):
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8740, in _update_available_resource_for_node
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     startup=startup)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 887, in update_available_resource
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     self._update_available_resource(context, resources, startup=startup)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     return f(*args, **kwargs)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 972, in _update_available_resource
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     self._update(context, cn, startup=startup)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1237, in _update
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     self._update_to_placement(context, compute_node, startup)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/retrying.py", line 68, in wrapped_f
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     return Retrying(*dargs, **dkw).call(f, *args, **kw)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/retrying.py", line 223, in call
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     return attempt.get(self._wrap_exception)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/retrying.py", line 261, in get
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     six.reraise(self.value[0], self.value[1], self.value[2])
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     raise value
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/retrying.py", line 217, in call
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1157, in _update_to_placement
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     self.driver.update_provider_tree(prov_tree, nodename)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 7305, in update_provider_tree
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     self._get_disk_size_reserved_for_image_cache()),
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 9672, in _get_disk_size_reserved_for_image_cache
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     self.image_cache_manager.get_disk_usage() / 1024.0 / 1024.0)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/imagecache.py", line 358, in get_disk_usage
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     if not self.cache_dir_is_on_same_dev_as_instances_dir:
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/imagecache.py", line 380, in cache_dir_is_on_same_dev_as_instances_dir
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager     os.stat(self.cache_dir).st_dev)
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/nova/instances/_base'
2020-08-28 14:58:39.649 6 ERROR nova.compute.manager
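
The traceback ends in an unguarded os.stat(self.cache_dir) call. One possible mitigation, sketched here under assumptions and not taken from Nova or from any upstream fix, is to treat a missing cache directory as zero cache usage. cache_disk_usage_bytes is a hypothetical standalone helper; returning 0 when the cache is on a different device is also an assumption about the intended behaviour.

```python
import os


def cache_disk_usage_bytes(instances_dir, cache_dir):
    """Return bytes used under cache_dir, or 0 when nothing should be reserved.

    Hypothetical helper for illustration; not Nova's implementation.
    """
    try:
        cache_dev = os.stat(cache_dir).st_dev
    except FileNotFoundError:
        # The cache directory has not been created yet, so nothing is cached.
        return 0
    if os.stat(instances_dir).st_dev != cache_dev:
        # Assumption: a cache on another device does not consume instance disk.
        return 0
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except FileNotFoundError:
                # The file was removed while walking; ignore it.
                pass
    return total
```

With a guard like this, the periodic resource update would report no reserved cache space on a fresh hypervisor instead of failing.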

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1893436

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1893436/+subscriptions

