yahoo-eng-team team mailing list archive
Message #70926
[Bug 1747622] Re: Aggregate info in nova_scheduler lose some host when add some host in aggregate continuousely
What do you mean by "adding hosts continuously"? Do you mean issuing those commands all at once?
It looks to me like https://github.com/openstack/nova/blob/0258cecaca88d4a305e99c5a17e2230361ef1235/nova/compute/api.py#L5050-L5062 could be racy if multiple API workers simultaneously fetch the aggregate information and try to update it.
We could make that more resilient by adding a distributed locking
mechanism, but since the aggregates API is admin-only (and adding a host
is not done often, compared with an end-user API call for example), I
leave open the question of whether the benefits would outweigh the
added complexity.
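
To illustrate the kind of race suspected above, here is a minimal sketch of a lost update. It is not nova code: AggregateStore, racy_add and locked_add are hypothetical names, and the in-process threading.Lock only stands in for whatever distributed lock or atomic update a real multi-worker deployment would need. The interleaving is forced by hand so the outcome is deterministic.

```python
import threading


class AggregateStore:
    """Hypothetical stand-in for a stored aggregate; not a nova class."""

    def __init__(self, hosts=()):
        self.hosts = list(hosts)
        self.lock = threading.Lock()

    def read_hosts(self):
        # Return a copy, the way a worker would get its own snapshot.
        return list(self.hosts)

    def write_hosts(self, hosts):
        self.hosts = list(hosts)


def racy_add(store, snapshot, host):
    # Append to a snapshot taken earlier (possibly stale) and write it back.
    store.write_hosts(snapshot + [host])


def locked_add(store, host):
    # Read, modify and write under one lock, so no update is lost.
    with store.lock:
        store.write_hosts(store.read_hosts() + [host])


# Racy case: both workers read before either one writes.
store = AggregateStore(["Computer0102"])
snap_a = store.read_hosts()
snap_b = store.read_hosts()
racy_add(store, snap_a, "Computer0103")
racy_add(store, snap_b, "Computer0116")  # overwrites the first update
racy_result = store.read_hosts()

# Locked case: the same two additions, run from two threads.
safe = AggregateStore(["Computer0102"])
threads = [
    threading.Thread(target=locked_add, args=(safe, h))
    for h in ("Computer0103", "Computer0116")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
safe_result = sorted(safe.read_hosts())

print(racy_result)
print(safe_result)
```

In the racy case the second write is built from a snapshot that predates the first write, so Computer0103 disappears, which matches the symptom reported in this bug. Whether the stale view comes from the code linked above or from elsewhere is not established here.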
** Tags added: sched
** Tags removed: sched
** Tags added: availability-zones openstack-version.pike
** Changed in: nova
Status: New => Opinion
** Changed in: nova
Importance: Undecided => Low
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1747622
Title:
Aggregate info in nova_scheduler lose some host when add some host in
aggregate continuousely
Status in OpenStack Compute (nova):
Opinion
Bug description:
Description
===========
If several hosts are added to an availability zone in quick succession, nova-scheduler's aggs_by_id and host_aggregates_map may lose some host aggregate data. Instances created in this availability zone afterwards are then never scheduled to the lost hosts.
Steps to reproduce
==================
1. Create an availability zone.
nova aggregate-create test3 test3
2. Add hosts to this availability zone in quick succession.
nova aggregate-add-host 51 Computer0102
nova aggregate-add-host 51 Computer0103
nova aggregate-add-host 51 Computer0116
3. Create instances in this availability zone.
Expected result
===============
Instances can be scheduled to Computer0102, Computer0103 and Computer0116.
Actual result
=============
Instances are never scheduled to Computer0103.
Environment
===========
1. Exact version of OpenStack you are running:
Pike
2. Which hypervisor did you use?
Libvirt + KVM
Logs & Configs
==============
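I added some logging to nova-scheduler's host_manager and found that the aggregate loses host information in nova-api when hosts are added in quick succession. In the output below, the last update lists only Computer0102 and Computer0116; Computer0103 is missing.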
[root@Controller01 ~]# cat /var/log/nova/nova-scheduler.log | grep hanrong |grep update_aggregates
2018-02-06 11:02:43.412 38000 INFO nova.scheduler.host_manager [req-69cb0a45-96f9-4693-91e8-46aeaec4ff54 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=[],id=51,metadata={},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
2018-02-06 11:02:52.187 38000 INFO nova.scheduler.host_manager [req-b0582d0d-59fd-4a58-85ab-ab13116bec40 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
2018-02-06 11:02:52.239 38000 INFO nova.scheduler.host_manager [req-eae376aa-f725-4b87-8740-df58c0bb25de 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102','Computer0103'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
2018-02-06 11:02:52.247 38000 INFO nova.scheduler.host_manager [req-22a5740f-6560-4603-8904-509b39335a76 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102','Computer0116'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
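
The effect of these four updates on the scheduler's view can be sketched with a simplified model. This is an assumption-laden illustration, not the actual nova host_manager code: SchedulerAggregateCache and update_aggregate are hypothetical names, and the model only assumes that each update replaces the cached aggregate and drops hosts that are absent from the incoming snapshot. The host lists are copied from the log lines above.

```python
from collections import defaultdict


class SchedulerAggregateCache:
    """Simplified, hypothetical model of the scheduler-side aggregate cache."""

    def __init__(self):
        self.aggs_by_id = {}
        self.host_aggregates_map = defaultdict(set)

    def update_aggregate(self, agg_id, hosts):
        # The incoming snapshot replaces whatever was cached before.
        self.aggs_by_id[agg_id] = list(hosts)
        for host in hosts:
            self.host_aggregates_map[host].add(agg_id)
        # Hosts missing from the snapshot are unmapped from the aggregate.
        for host, agg_ids in self.host_aggregates_map.items():
            if agg_id in agg_ids and host not in hosts:
                agg_ids.remove(agg_id)


# Host lists of aggregate 51, in the order they appear in the log.
snapshots = [
    [],
    ["Computer0102"],
    ["Computer0102", "Computer0103"],
    ["Computer0102", "Computer0116"],
]

cache = SchedulerAggregateCache()
for hosts in snapshots:
    cache.update_aggregate(51, hosts)

hosts_in_aggregate = sorted(
    host for host, agg_ids in cache.host_aggregates_map.items() if 51 in agg_ids
)
print(hosts_in_aggregate)
```

Under this model the last snapshot wins, so Computer0103 ends up unmapped from aggregate 51 even though it was added successfully a few milliseconds earlier.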
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1747622/+subscriptions