yahoo-eng-team team mailing list archive
Message #74458
[Bug 1787977] Re: Inefficient multi-cell instance list
Reviewed: https://review.openstack.org/593131
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c3a77f80b1863e114109af9c32ea01b205c1a735
Submitter: Zuul
Branch: master
commit c3a77f80b1863e114109af9c32ea01b205c1a735
Author: Dan Smith <dansmith@xxxxxxxxxx>
Date: Fri Aug 17 07:56:05 2018 -0700
Make instance_list perform per-cell batching
This makes the instance_list module support batching across cells
with a couple of different strategies, and with room to add more
in the future.
Before this change, an instance list with limit 1000 to a
deployment with 10 cells would generate a query to each cell
database with the same limit. Thus, that API request could end
up processing up to 10,000 instance records despite only
returning 1000 to the user (because of the limit).
This uses the batch functionality in the base code added in
Iaa4759822e70b39bd735104d03d4deec988d35a1
by providing a couple of strategies by which the batch size
per cell can be determined. These should provide a lot of gain
in the short term, and we can extend them with other strategies
as we identify some with additional benefits.
Closes-Bug: #1787977
Change-Id: Ie3a5f5dc49f8d9a4b96f1e97f8a6ea0b5738b768
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1787977
Title:
Inefficient multi-cell instance list
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) queens series:
New
Status in OpenStack Compute (nova) rocky series:
New
Bug description:
This is based on some performance and scale testing done by Huawei,
reported in this dev ML thread:
http://lists.openstack.org/pipermail/openstack-dev/2018-August/133363.html
In that scenario, they have 10 cells with 10000 instances in each
cell. They then run through a few GET /servers/detail scenarios with
multiple cells and varying limits.
The thread discussion pointed out that they were wasting time pulling
1000 records (the default [api]/max_limit) from each of the 10 cells
and then throwing away 9000 of those results. The DB query time per
cell was small, but the SQLAlchemy/ORM/Python processing of the extra
records was chewing up the time.
Dan Smith has a series of changes here:
https://review.openstack.org/#/q/topic:batched-inst-list+(status:open+OR+status:merged)
These allow us to batch the DB queries per cell. Distributing the
limit across the 10 cells, e.g. 1000 / 10 = 100 batch size per cell,
cuts the time spent roughly in half (from around 11 sec to around
6 sec).
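To illustrate the batching idea described above, here is a minimal
sketch in Python. It is not nova's instance_list code: the function
names, the in-memory lists standing in for cell databases, and the
stats counter are all hypothetical, and list slicing stands in for a
limit/marker query. It only shows why fetching 1000 / 10 = 100 rows
per cell at a time loads far fewer records than fetching the full
limit from every cell.

```python
import heapq
import itertools
import math


def distributed_batch_size(limit, num_cells):
    """Split the requested limit evenly across cells (1000 / 10 = 100)."""
    return max(1, math.ceil(limit / num_cells))


def batched_cell_records(cell_rows, batch_size, stats):
    """Yield one cell's sorted rows, loading them batch_size at a time.

    cell_rows is a sorted list standing in for a cell database (an
    assumption for this sketch). stats["loaded"] counts rows loaded.
    """
    offset = 0
    while True:
        batch = cell_rows[offset:offset + batch_size]
        if not batch:
            return
        stats["loaded"] += len(batch)
        yield from batch
        offset += len(batch)


def list_instances(cells, limit, batch_size):
    """Merge the sorted per-cell streams and stop after limit rows."""
    stats = {"loaded": 0}
    streams = [batched_cell_records(rows, batch_size, stats)
               for rows in cells]
    merged = heapq.merge(*streams)
    rows = list(itertools.islice(merged, limit))
    return rows, stats["loaded"]


# 10 cells with 10000 instances each, interleaved so that every cell
# contributes equally to the first 1000 results.
cells = [list(range(c, 100000, 10)) for c in range(10)]

# Old behaviour: every cell is asked for the full limit up front.
unbatched_rows, unbatched_loaded = list_instances(
    cells, limit=1000, batch_size=1000)

# Batched behaviour: each cell is asked for limit / num_cells at a time.
batched_rows, batched_loaded = list_instances(
    cells, limit=1000, batch_size=distributed_batch_size(1000, 10))

print(unbatched_loaded, batched_loaded)
```

Both calls return the same 1000 rows, but the unbatched call loads
10000 records while the batched call loads only a fraction of that.
When the data is skewed toward one cell, the batched version simply
fetches more batches from that cell, at the cost of extra round trips.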
This is clearly a performance issue for which we have a fix, and we
arguably should backport the fix.
Note this is less of an issue for deployments that leverage the
[api]/instance_list_per_project_cells option (like CERN):
https://docs.openstack.org/nova/latest/configuration/config.html#api.instance_list_per_project_cells
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1787977/+subscriptions