yahoo-eng-team team mailing list archive

[Bug 1750890] Re: Neutron db performance at scale

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Ihar Hrachyshka <1750890@xxxxxxxxxxxxxxxxxx>
Date: Thu, 15 Mar 2018 18:40:46 -0000
Reply-to: Bug 1750890 <1750890@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Yes. Since we landed a number of patches in Ocata+ that should fix
performance in several scenarios, including the switch to subquery type
for model attributes, I advise you to check whether the scalability
issues still affect a fresh setup. If so, please provide more details
(charts, measurements), and we will take a closer look.

Marking the bug as fixed for the time being. (An alternative would be
'Incomplete', but we had related fixes merged in Ocata+). As for older
releases, we no longer support Newton, so we can't provide backports
ourselves.

** Changed in: neutron
Status: New => Fix Released

** Changed in: neutron
Importance: Undecided => High

** Tags added: db

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1750890

Title:
Neutron db performance at scale

Status in neutron:
Fix Released

Bug description:
OpenStack Neutron (like the rest of OpenStack) relies on SQLAlchemy and
its ORM for database support. From our observations, Neutron does not
use the ORM models directly; instead it inserts an additional model
layer above SQLAlchemy and builds these models manually from a number
of underlying DB models. We ran into significant performance issues at
large scale due to the increased number of queries. <Scale numbers to
be added here in the future.>

For ports, the problem starts here:
https://github.com/openstack/neutron/blob/master/neutron/db/db_base_plugin_common.py#L202-L219
The base dict is built from a single DB query row, and then the
processing of all extensions (which is the default behaviour) leads to
a sequential series of additional queries per row to augment the dict.
In our opinion this is a performance problem: it is the classic n+1
query anti-pattern and fundamentally does not scale (an alternative
would be a “joined” query across the active extensions). The following
illustrates the type of workaround that results from this approach:
https://github.com/openstack/neutron/blob/master/neutron/db/_utils.py#L95-L107
Instead of using native SQL to filter fields from the result, the whole
result set has to be iterated to filter out fields, which again is an
anti-pattern when processing DB objects.
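
To make the n+1 pattern and the "joined" alternative concrete, here is a
minimal sketch. It is not Neutron code: it uses Python's stdlib sqlite3
module instead of SQLAlchemy, and the tables (ports, port_ext), the mtu
column, and all function names are hypothetical, chosen only for
illustration. The second variant also selects the requested fields in SQL
rather than filtering the result set in Python.

```python
import sqlite3

def make_db(n_ports):
    """Build an in-memory DB with a hypothetical ports table and one extension table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ports (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE port_ext (port_id INTEGER PRIMARY KEY, mtu INTEGER);
    """)
    conn.executemany("INSERT INTO ports VALUES (?, ?)",
                     [(i, f"port-{i}") for i in range(n_ports)])
    conn.executemany("INSERT INTO port_ext VALUES (?, ?)",
                     [(i, 1500) for i in range(n_ports)])
    conn.commit()
    return conn

def fetch_n_plus_one(conn):
    """Anti-pattern: one query for the base rows, then one extra query per row."""
    ports = []
    base = conn.execute("SELECT id, name FROM ports ORDER BY id").fetchall()
    for port_id, name in base:
        ext = conn.execute("SELECT mtu FROM port_ext WHERE port_id = ?",
                           (port_id,)).fetchone()
        ports.append({"id": port_id, "name": name,
                      "mtu": ext[0] if ext else None})
    return ports

# Whitelist mapping API field names to SQL columns (hypothetical names).
COLUMNS = {"id": "p.id", "name": "p.name", "mtu": "e.mtu"}

def fetch_joined(conn, fields=("id", "name", "mtu")):
    """Alternative: a single joined query that returns only the requested fields."""
    cols = ", ".join(f"{COLUMNS[f]} AS {f}" for f in fields)  # KeyError if unknown
    sql = (f"SELECT {cols} FROM ports p "
           "LEFT JOIN port_ext e ON e.port_id = p.id ORDER BY p.id")
    return [dict(zip(fields, row)) for row in conn.execute(sql)]

def count_selects(conn, fn):
    """Run fn(conn) and count the SELECT statements it issues."""
    seen = []
    def trace(sql):
        if sql.lstrip().upper().startswith("SELECT"):
            seen.append(sql)
    conn.set_trace_callback(trace)
    try:
        result = fn(conn)
    finally:
        conn.set_trace_callback(None)
    return result, len(seen)

if __name__ == "__main__":
    conn = make_db(100)
    for fn in (fetch_n_plus_one, fetch_joined):
        rows, queries = count_selects(conn, fn)
        print(fn.__name__, "rows:", len(rows), "queries:", queries)
```

In this sketch the number of queries issued by the first variant grows
with the number of ports, while the joined variant stays at a single
query regardless of table size. With several extensions the per-row cost
of the first variant would multiply accordingly.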

With respect to LBaaS support, we removed the intermediate model layer
with this (and a couple of previous) commit(s):
https://github.com/sapcc/neutron-lbaas/commit/f71867fbf6c8a27df43aaff6046948dce60f3081
This is just an interim change, but after implementing it we saw LBaaS
API requests go from 1-5 minutes or more, degrading with the number of
objects, to a consistent sub-second response time.

Version:
This is/should be present in all versions, but our testing has been done in Mitaka and above.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1750890/+subscriptions

References

[Bug 1750890] [NEW] Neutron db performance at scale
From: Leon Zachery, 2018-02-21