yahoo-eng-team team mailing list archive

[Bug 1747622] Re: Aggregate info in nova_scheduler lose some host when add some host in aggregate continuousely

 

What do you mean by "adding hosts continuously"? Do you mean issuing those commands all at once?
It looks to me like https://github.com/openstack/nova/blob/0258cecaca88d4a305e99c5a17e2230361ef1235/nova/compute/api.py#L5050-L5062 could be racy if multiple API workers simultaneously fetch the aggregate information and try to update it.
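
To illustrate the kind of race I have in mind, here is a rough sketch. It is a simplified model and not Nova's actual code: `db_hosts`, `scheduler_cache`, `fetch` and `add_host` are all hypothetical names, and the interleaving is forced by hand instead of using real workers.

```python
# Simplified model of a lost update; not Nova code, all names are hypothetical.
db_hosts = {51: []}      # stands in for the aggregate's host rows in the database
scheduler_cache = {}     # stands in for the scheduler's view of each aggregate


def fetch(agg_id):
    """Return a private snapshot of the hosts, as one API worker would see them."""
    return list(db_hosts[agg_id])


def add_host(agg_id, snapshot, host):
    """Persist the new host, then publish this worker's own snapshot."""
    db_hosts[agg_id].append(host)        # the database row itself is kept
    snapshot = snapshot + [host]         # the worker only knows its own addition
    scheduler_cache[agg_id] = snapshot   # the last publisher wins
    return snapshot


add_host(51, fetch(51), "Computer0102")

# Two workers fetch the aggregate before either of them writes.
snap_a = fetch(51)
snap_b = fetch(51)
add_host(51, snap_a, "Computer0103")
add_host(51, snap_b, "Computer0116")

print(db_hosts[51])         # ['Computer0102', 'Computer0103', 'Computer0116']
print(scheduler_cache[51])  # ['Computer0102', 'Computer0116']
```

In this model the database ends up with all three hosts, while the published view is missing Computer0103, which is the same shape as the last log entry in the report.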

We could make that more resilient by adding a distributed locking
mechanism, but since the aggregates API is admin-only (and adding a host
is not done often, compared to an end-user API call for example), I
leave open the question of whether the benefits would outweigh the
added complexity.
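
As a rough sketch of what serializing the update would look like, assuming a simplified model and not Nova's code: `threading.Lock` below is only a stand-in for a distributed lock, since real API workers are separate processes, possibly on separate hosts. All other names (`db_hosts`, `scheduler_cache`, `add_host_locked`) are hypothetical.

```python
import threading

# Simplified model; not Nova code, all names are hypothetical.
db_hosts = {51: ["Computer0102"]}
scheduler_cache = {51: ["Computer0102"]}
agg_lock = threading.Lock()   # stand-in for a distributed lock


def add_host_locked(agg_id, host):
    """Fetch, update and publish as one critical section."""
    with agg_lock:
        snapshot = list(db_hosts[agg_id])            # fetched under the lock
        db_hosts[agg_id].append(host)
        scheduler_cache[agg_id] = snapshot + [host]  # cannot be stale here


threads = [
    threading.Thread(target=add_host_locked, args=(51, host))
    for host in ("Computer0103", "Computer0116")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(scheduler_cache[51]))
```

Whichever thread runs first, the published view matches the database once both have finished. The cost is that every add-host call for the aggregate now waits on the lock, which is the complexity being weighed above.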

** Tags added: sched

** Tags removed: sched
** Tags added: availability-zones openstack-version.pike

** Changed in: nova
       Status: New => Opinion

** Changed in: nova
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1747622

Title:
  Aggregate info in nova_scheduler lose some host when add some host in
  aggregate continuousely

Status in OpenStack Compute (nova):
  Opinion

Bug description:
  Description
  ===========
  If several hosts are added to an availability zone in quick succession, nova_scheduler's aggs_by_id and host_aggregates_map may lose some host aggregate data. Instances created in this availability zone afterwards are then never scheduled to the lost hosts.

  Steps to reproduce
  ==================
  1. Create an availability zone.
  nova aggregate-create test3 test3

  2. Add hosts to this availability zone in quick succession.
  nova aggregate-add-host 51 Computer0102
  nova aggregate-add-host 51 Computer0103
  nova aggregate-add-host 51 Computer0116

  3. Create instances in this availability zone.

  Expected result
  ===============
  Instances can select Computer0102, Computer0103 and Computer0116.

  Actual result
  =============
  Instances are never scheduled to Computer0103.

  Environment
  ===========
  1. Exact version of OpenStack you are running:
  Pike

  2. Which hypervisor did you use?
  Libvirt + KVM

  Logs & Configs
  ==============
  I added some logging to nova-scheduler's host_manager and found that aggregate information is lost in nova-api when hosts are added in quick succession.

  [root@Controller01 ~]# cat /var/log/nova/nova-scheduler.log | grep hanrong |grep update_aggregates
  2018-02-06 11:02:43.412 38000 INFO nova.scheduler.host_manager [req-69cb0a45-96f9-4693-91e8-46aeaec4ff54 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=[],id=51,metadata={},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
  2018-02-06 11:02:52.187 38000 INFO nova.scheduler.host_manager [req-b0582d0d-59fd-4a58-85ab-ab13116bec40 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
  2018-02-06 11:02:52.239 38000 INFO nova.scheduler.host_manager [req-eae376aa-f725-4b87-8740-df58c0bb25de 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102','Computer0103'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
  2018-02-06 11:02:52.247 38000 INFO nova.scheduler.host_manager [req-22a5740f-6560-4603-8904-509b39335a76 9974cc9acecb40f3827c3b27e803e87c 04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: [Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102','Computer0116'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
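
The effect of the four update_aggregates entries above on the scheduler's maps can be sketched as follows. This is a simplified model and not the actual host_manager code: `aggs_by_id` and `host_aggregates_map` are the names from the description, while `update_aggregate` and the replace-per-aggregate behaviour are assumptions made for illustration.

```python
from collections import defaultdict

# Simplified model of the scheduler-side maps; not the real host_manager code.
aggs_by_id = {}
host_aggregates_map = defaultdict(set)


def update_aggregate(agg_id, hosts):
    """Replace the cached aggregate and refresh the host-to-aggregate mapping."""
    aggs_by_id[agg_id] = list(hosts)
    for host in hosts:
        host_aggregates_map[host].add(agg_id)
    # Drop the aggregate from hosts that the new view no longer lists.
    for host, agg_ids in host_aggregates_map.items():
        if agg_id in agg_ids and host not in hosts:
            agg_ids.remove(agg_id)


# Host lists taken from the four logged updates, in order.
logged_updates = [
    [],
    ["Computer0102"],
    ["Computer0102", "Computer0103"],
    ["Computer0102", "Computer0116"],
]
for hosts in logged_updates:
    update_aggregate(51, hosts)

for host in ("Computer0102", "Computer0103", "Computer0116"):
    print(host, sorted(host_aggregates_map[host]))
# Computer0102 [51]
# Computer0103 []
# Computer0116 [51]
```

Under these assumptions Computer0103 ends up mapped to no aggregate after the last update, which would explain why instances in test3 are never placed on it.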

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1747622/+subscriptions

