
yahoo-eng-team team mailing list archive

[Bug 1437199] Re: zookeeper driver used with O(n^2) complexity by the scheduler or the api

 

Attilla, please don't file spec-level issues as bugs; this is really
a spec-level rearchitecture.

** Changed in: nova
       Status: New => Invalid

** Changed in: nova
   Importance: Undecided => Wishlist

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1437199

Title:
  zookeeper driver used with O(n^2) complexity  by the scheduler or the
  api

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  (Loop1) https://github.com/openstack/nova/blob/af2d6c9576b1ac5f3b3768870bb15d9b5cf1610b/nova/scheduler/driver.py#L55
  (Loop2) https://github.com/openstack/nova/blob/af2d6c9576b1ac5f3b3768870bb15d9b5cf1610b/nova/servicegroup/drivers/zk.py#L177

  Iterating over the hosts through the ComputeFilter has the same issue;
  using ComputeFilter in a loop also has other performance issues.

  The API loop1 is here:
  https://github.com/openstack/nova/blob/e5d0531d8ed4efcd612c0597557e5651c16294b5/nova/api/openstack/compute/contrib/services.py#L81

  The zk driver issue can be mitigated by reorganizing the code so that
  the test (`filtering`) is done in is_up instead of in get_all.

  However, a better solution would be to have the scheduler use get_all,
  or to redesign the servicegroup management.
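
  To make the complexity argument concrete, here is a minimal sketch of
  the two call patterns. It does not use the real nova or ZooKeeper code:
  FakeZkDriver, hosts_up_quadratic and hosts_up_linear are hypothetical
  names, and the assumption is only that each is_up call ends up listing
  the whole member group, as the report describes.

```python
class FakeZkDriver:
    """Hypothetical stand-in for a ZooKeeper-backed servicegroup driver."""

    def __init__(self, live_hosts):
        self._live_hosts = list(live_hosts)
        # Counts how many member entries were examined in total.
        self.members_scanned = 0

    def get_all(self):
        # Simulates listing every member of the group; the cost grows
        # with the number of live members.
        self.members_scanned += len(self._live_hosts)
        return list(self._live_hosts)

    def is_up(self, host):
        # Pattern described in the report: every single check lists
        # all members again.
        return host in self.get_all()


def hosts_up_quadratic(driver, hosts):
    # One is_up call per host, each one listing the whole group.
    return [h for h in hosts if driver.is_up(h)]


def hosts_up_linear(driver, hosts):
    # List the group once, then do constant-time set lookups.
    live = set(driver.get_all())
    return [h for h in hosts if h in live]
```

  With 100 hosts of which 50 are live, the first variant examines 5000
  member entries and the second examines 50, while both return the same
  list of live hosts.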

  A better design would be to use the DB even with the zk and mc drivers,
  but to update it ONLY when a service actually comes up or dies. In this
  case the sg drivers MAY need dedicated service processes.

  NOTE: The servicegroup driver concept was introduced to avoid doing
  10_000 DB updates/sec at 100_000 hosts (one update per host every 10 sec).
  If your servers are bad and every server has a 1:1000 chance of dying on
  a given day, this would lead to only about 0.001 UPDATE/sec (100/day) at
  100_000 hosts.
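
  The arithmetic behind these two rates can be checked with a short
  sketch. The function names are hypothetical, the 10 second interval is
  inferred from the 10_000 updates/sec figure, and only failure events
  are counted, as in the note.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def heartbeat_updates_per_sec(hosts, report_interval_sec):
    # Every host writes one heartbeat row per reporting interval.
    return hosts / report_interval_sec


def state_change_updates_per_sec(hosts, one_in_n_fail_per_day):
    # Only hosts that actually die cause a write; on average
    # hosts / one_in_n_fail_per_day of them die per day.
    failures_per_day = hosts / one_in_n_fail_per_day
    return failures_per_day / SECONDS_PER_DAY


periodic = heartbeat_updates_per_sec(100_000, 10)
on_change = state_change_updates_per_sec(100_000, 1000)
print(periodic, on_change)
```

  The periodic rate comes out at 10_000 updates/sec, and the
  update-on-change rate at 100/86400, roughly 0.0012 updates/sec.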

  NOTE: If the up/down state is knowable just from the DB, the scheduler
  could eliminate the dead hosts in the first DB query, without using
  ComputeFilter as it is used now. (The plugins SHOULD be able to extend
  the base hosts query.)
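
  As a rough illustration of filtering dead hosts in the first query,
  here is a sketch using an in-memory SQLite database. The services table
  with host and up columns is a made-up schema for this example, not
  nova's actual one, and mark_state stands in for whatever the sg driver
  would do when a service comes up or dies.

```python
import sqlite3


def make_demo_db():
    # Hypothetical schema: one row per service host with an up/down flag.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE services (host TEXT PRIMARY KEY, up INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO services (host, up) VALUES (?, ?)",
        [("h1", 1), ("h2", 1), ("h3", 1)],
    )
    return conn


def mark_state(conn, host, up):
    # Written only when a service actually changes state,
    # not on every heartbeat.
    conn.execute("UPDATE services SET up = ? WHERE host = ?", (int(up), host))


def live_hosts(conn):
    # Dead hosts are excluded by the query itself, so no per-host
    # liveness check is needed afterwards.
    rows = conn.execute(
        "SELECT host FROM services WHERE up = 1 ORDER BY host"
    ).fetchall()
    return [row[0] for row in rows]
```

  After mark_state(conn, "h2", False), live_hosts(conn) returns only h1
  and h3.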

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1437199/+subscriptions

