yahoo-eng-team team mailing list archive
Message #90906
[Bug 1918145] Re: Slownesses on neutron API with many RBAC rules
I think one of the first steps we could take is to remove the ORDER BY,
as it creates the temporary filesort that you mentioned in #9.
I may be missing something, but does an ORDER BY on a UUID bring any value?
A second step would be to understand why the possible key object_id is
not used.
There is another point: we can see that we filter by action, but I do
not think we have an index on it. Maybe we could investigate that as well.
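To make the index question concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table is a simplified, hypothetical stand-in for networkrbacs (not neutron's real schema), the index name ix_networkrbacs_action is made up, and SQLite's planner is not MariaDB's. It only illustrates how the plan for a filter on action can change once that column is indexed; the real check would be an EXPLAIN against the MariaDB database.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the networkrbacs table.
conn.execute(
    "CREATE TABLE networkrbacs ("
    " id TEXT PRIMARY KEY, object_id TEXT, target_tenant TEXT, action TEXT)"
)
conn.executemany(
    "INSERT INTO networkrbacs VALUES (?, ?, ?, ?)",
    [(f"rbac-{i}", f"net-{i % 50}", f"project-{i % 100}", "access_as_shared")
     for i in range(1000)],
)

sql = "SELECT object_id FROM networkrbacs WHERE action = ?"
args = ("access_as_shared",)

# Plan without any index on action: the whole table is scanned.
plan_before = query_plan(conn, sql, args)

# Plan after adding a (hypothetical) index on action.
conn.execute("CREATE INDEX ix_networkrbacs_action ON networkrbacs (action)")
plan_after = query_plan(conn, sql, args)

print("before:", plan_before)
print("after: ", plan_after)
```

Whether such an index helps in practice depends on how selective action is; if nearly every row has the same action, the gain may be small.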
** Changed in: neutron
Status: Fix Released => Confirmed
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1918145
Title:
Slownesses on neutron API with many RBAC rules
Status in neutron:
Confirmed
Bug description:
* Summary: Slownesses on neutron API with many RBAC rules
* High level description: Sharing several networks or security groups
with projects drastically increases API response time on some routes
(/networks or /server/detail).
For quite some time we have been observing that response times are
increasing (slowly but surely) on /networks calls. We have increased
the number of Neutron workers, but in vain.
Lately, we have observed that it is getting worse (response time from 5 to 370 seconds). We ruled out possible bottlenecks one by one (our service endpoint performance, neutron API configuration, etc.).
However, we found that some calls to the DB take a lot of time. They seem to be stuck in the MariaDB database (10.3.10), so we captured some slow queries.
An example for /server/detail:
------------------------------
http://paste.openstack.org/show/803334/
We can see that more than 2 million rows are examined, and
around 1657 are returned.
An example for /networks:
-------------------------
http://paste.openstack.org/show/803337/
Rows_sent: 517 Rows_examined: 223519
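As a quick worked example of what these counters imply, the short sketch below parses the line above and computes how many rows are examined per row returned. The variable names are illustrative only.

```python
import re

# Counters copied from the slow query log excerpt above.
slow_log_line = "Rows_sent: 517 Rows_examined: 223519"

match = re.search(r"Rows_sent:\s*(\d+)\s+Rows_examined:\s*(\d+)", slow_log_line)
rows_sent, rows_examined = (int(g) for g in match.groups())

# Rows read by the server for each row actually returned to the client.
ratio = rows_examined / rows_sent
print(f"{rows_examined} examined / {rows_sent} sent = {ratio:.1f} per returned row")
```

That is roughly 432 rows examined for each row sent back.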
* Pre-conditions:
Database table sizes:
- networkrbacs 16928 rows
- securitygrouprbacs 1691 rows
- keystone.project 1713 rows
Control plane nodes are shared with some other services:
- RMQ
- mariadb
- OpenStack APIs
- DHCP agents
It seems the code behind those queries is based on
https://github.com/openstack/neutron-lib/blob/698e4c8daa7d43018a71122ec5b0cd5b17b55141/neutron_lib/db/model_query.py#L120
* Step-by-step reproduction steps:
- Create a lot of projects (at least 1000)
- Create a SG in admin account
- Create fake networks (vlan, vxlan) with associated
- Share the SG and all networks with all projects
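The sharing step could be scripted roughly as follows. This is only a sketch under assumptions: conn is assumed to be an openstacksdk connection (for example from openstack.connect(cloud="mycloud"), where "mycloud" is a placeholder), the function name share_with_all_projects is hypothetical, and the projects, networks and security group are assumed to exist already.

```python
def share_with_all_projects(conn, project_ids, network_ids, security_group_id):
    """Create one RBAC policy per (shared object, target project) pair.

    `conn` is assumed to expose openstacksdk's network proxy
    (conn.network.create_rbac_policy). Returns the number of policies created.
    """
    created = 0
    for project_id in project_ids:
        # Every network plus the single security group is shared with
        # each target project.
        shared = [("network", net_id) for net_id in network_ids]
        shared.append(("security_group", security_group_id))
        for object_type, object_id in shared:
            conn.network.create_rbac_policy(
                object_type=object_type,
                object_id=object_id,
                action="access_as_shared",
                target_project_id=project_id,
            )
            created += 1
    return created
```

With P projects and N networks this creates P * (N + 1) RBAC entries, so the RBAC tables grow quickly as projects are added.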
* Expected output: lower response time, less than 5 seconds
(approximately).
* Actual output: May lead to gateway timeout.
* Version:
** OpenStack Stein release for all components (neutron 14.2.0).
** CentOS 7.4 with kolla containers
** kolla-ansible for stein release
* Environment: We operate all services in OpenStack except for Cinder.
* Perceived severity: Medium
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1918145/+subscriptions