yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #79057
[Bug 1834349] Re: Error updating resources for node ubuntu-bionic-ovh-bhs1-0008373888.: AttributeError: 'NoneType' object has no attribute 'flavorid' - race with resize confirm
** Changed in: nova
Importance: Undecided => Medium
** Also affects: nova/stein
Importance: Undecided
Status: New
** Changed in: nova/stein
Status: New => Confirmed
** Changed in: nova/stein
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1834349
Title:
Error updating resources for node ubuntu-bionic-ovh-bhs1-0008373888.:
AttributeError: 'NoneType' object has no attribute 'flavorid' - race
with resize confirm
Status in OpenStack Compute (nova):
In Progress
Status in OpenStack Compute (nova) stein series:
Confirmed
Bug description:
Seeing this in CI jobs:
http://logs.openstack.org/45/666845/1/check/neutron-tempest-dvr-ha-
multinode-
full/5b09053/controller/logs/screen-n-cpu.txt#_Jun_26_13_51_48_973568
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager [None req-903b3e73-3ce3-4f5c-9a30-811383077679 None None] Error updating resources for node ubuntu-bionic-ovh-bhs1-0008373888.: AttributeError: 'NoneType' object has no attribute 'flavorid'
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager Traceback (most recent call last):
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/manager.py", line 8101, in _update_available_resource_for_node
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager startup=startup)
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 735, in update_available_resource
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager self._update_available_resource(context, resources, startup=startup)
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/oslo_concurrency/lockutils.py", line 328, in inner
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager return f(*args, **kwargs)
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 783, in _update_available_resource
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager self._update_usage_from_migrations(context, migrations, nodename)
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1192, in _update_usage_from_migrations
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager nodename)
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1118, in _update_usage_from_migration
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager migration.uuid, itype.flavorid, instance=instance)
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager AttributeError: 'NoneType' object has no attribute 'flavorid'
Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: ERROR nova.compute.manager
I'm not sure if this is a race, but it's this code:
https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/compute/resource_tracker.py#L1118
which was introduced with this change (of mine) in Stein:
https://github.com/openstack/nova/commit/20a46ece5701c9798a5e0df12c944237cb1ece3e
According to logstash this is only showing up in multinode jobs:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22AttributeError%3A%20'NoneType'%20object%20has%20no%20attribute%20'flavorid'%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d
Given this code to get the flavor:
https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/compute/resource_tracker.py#L1439
def _get_instance_type(self, instance, prefix, migration):
"""Get the instance type from instance."""
stashed_flavors = migration.migration_type in ('resize',)
if stashed_flavors:
return getattr(instance, '%sflavor' % prefix)
else:
# NOTE(ndipanov): Certain migration types (all but resize)
# do not change flavors so there is no need to stash
# them. In that case - just get the instance flavor.
return instance.flavor
And this in the logs to load the instance.old_flavor:
Jun 26 13:51:48.911052 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: INFO nova.compute.resource_tracker [None req-903b3e73-3ce3-4f5c-9a30-811383077679 None None] [instance: 4342d535-d00f-4ebc-930a-8e634f343eef] Updating resource usage from migration 87280d26-3511-47cb-9a69-6d510fd92718
Jun 26 13:51:48.911397 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: DEBUG nova.objects.instance [None req-903b3e73-3ce3-4f5c-9a30-811383077679 None None] Lazy-loading 'old_flavor' on Instance uuid 4342d535-d00f-4ebc-930a-8e634f343eef {{(pid=27912) obj_load_attr /opt/stack/nova/nova/objects/instance.py:1053}}
This is coming from a resize operation. I also see this:
Jun 26 13:51:48.844575 ubuntu-bionic-ovh-bhs1-0008373888 nova-
compute[27912]: DEBUG nova.compute.resource_tracker [None req-
903b3e73-3ce3-4f5c-9a30-811383077679 None None] Migration for instance
4342d535-d00f-4ebc-930a-8e634f343eef refers to another host's
instance! {{(pid=27912) _pair_instances_to_migrations
/opt/stack/nova/nova/compute/resource_tracker.py:751}}
Which was added because of https://review.opendev.org/#/c/382195/ so
we might just be racing with the confirm/revert during the
update_available_resource task and the resize was confirmed/reverted.
Further up in the logs I also see this, confirming the instance is
being resized:
Jun 26 13:51:48.559335 ubuntu-bionic-ovh-bhs1-0008373888 nova-
compute[27912]: INFO nova.virt.libvirt.driver [None req-903b3e73-3ce3
-4f5c-9a30-811383077679 None None] [instance: 4342d535-d00f-4ebc-930a-
8e634f343eef] Periodic task is updating the host stats; it is trying
to get disk info for instance-0000006c, but the backing disk was
removed by a concurrent operation (task_state=None) and
(vm_state=resized)
When confirming a resize (which tempest will do as soon as the server
status goes to VERIFY_RESIZE in most tests), the instance.old_flavor
is nulled out here:
https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/compute/manager.py#L4031
And then the move claim is dropped on the source compute:
https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/compute/manager.py#L4057
and that's where we remove it from tracked_migrations in the RT:
https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/compute/resource_tracker.py#L464
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1834349/+subscriptions
References