yahoo-eng-team team mailing list archive

[Bug 1650512] [NEW] Slow net-list command, over 15K returned records by SQL

 

Public bug reported:

OpenStack Release: Mitaka

We have:

230 networks
102 entries in network RBAC

When I list instances in Horizon (which takes ages), neutron runs this SQL query: http://paste.openstack.org/show/592603/ and the query returns over 15k records.
For almost all networks there are two records, but for four networks there are more:

     4 200d2ee9-fcac-4224-9005-5bcff43944c9   - 1 entry in network RBAC
     4 a36912c4-7202-4074-b10a-3f6af7514498   - no entry in network RBAC
   144 ba9d80b7-8593-4214-be1a-731ea7c92e56   - 12 entries in network RBAC
 14792 7493c1b5-6954-4f71-8145-6c95694a9ba6   - 86 entries in network RBAC

So it is clear that the number of rows grows with the square of the number of network RBAC entries:
(network RBAC entries)^2
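
The figures above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes each of the remaining 226 networks contributes exactly two rows (the report only says "almost all"), and the variable names are invented for this example.

```python
# Row counts and RBAC entry counts for the four outlier networks, as reported.
outliers = {
    "200d2ee9-fcac-4224-9005-5bcff43944c9": (4, 1),
    "a36912c4-7202-4074-b10a-3f6af7514498": (4, 0),
    "ba9d80b7-8593-4214-be1a-731ea7c92e56": (144, 12),
    "7493c1b5-6954-4f71-8145-6c95694a9ba6": (14792, 86),
}

TOTAL_NETWORKS = 230
ROWS_PER_ORDINARY_NETWORK = 2  # assumption: "almost all networks" have two rows

# Ratio of returned rows to (RBAC entries)^2, for networks that have RBAC entries.
ratios = {
    net: rows / rbac ** 2
    for net, (rows, rbac) in outliers.items()
    if rbac > 0
}

ordinary = TOTAL_NETWORKS - len(outliers)
total_rows = ordinary * ROWS_PER_ORDINARY_NETWORK + sum(
    rows for rows, _ in outliers.values()
)

for net, ratio in ratios.items():
    print(f"{net}: rows / rbac^2 = {ratio:g}")
print("estimated total rows:", total_rows)
```

For the two largest networks the ratio comes out as a small integer (1 for 12 entries, 2 for 86 entries), which is consistent with the row count being the square of the RBAC entry count times a small per-network factor that the report does not identify. The estimated total of 15396 rows agrees with the "over 15k" figure.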

When the neutron-server process receives those 15k records, it consumes
100% of one CPU core (2.6GHz) for 1-2 seconds. As a result, the
network-list command takes over 2 seconds.

I found two LEFT JOINs in the SQL query:

LEFT OUTER JOIN networkrbacs AS networkrbacs_1 ON subnets_1.network_id = networkrbacs_1.object_id
LEFT OUTER JOIN networkrbacs AS networkrbacs_2 ON networks.id = networkrbacs_2.object_id

I think this is the reason for the ^2 correlation. The conditions in
both LEFT JOINs mean the same thing.
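
The multiplication effect can be reproduced in isolation. The following is a minimal sketch using Python's sqlite3 module and a drastically simplified schema: only the table names, the aliases and the two networkrbacs join conditions come from the query quoted above. The join from networks to subnets, the column sets and the helper name count_joined_rows are assumptions made for this example, not the real neutron schema or query.

```python
import sqlite3


def count_joined_rows(rbac_entries: int, subnets: int = 1) -> int:
    """Count rows returned for one network when networkrbacs is joined twice."""
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema: just enough columns for the joins.
    conn.executescript(
        """
        CREATE TABLE networks (id TEXT PRIMARY KEY);
        CREATE TABLE subnets (id TEXT PRIMARY KEY, network_id TEXT);
        CREATE TABLE networkrbacs (id INTEGER PRIMARY KEY, object_id TEXT);
        """
    )
    conn.execute("INSERT INTO networks (id) VALUES ('net-1')")
    conn.executemany(
        "INSERT INTO subnets (id, network_id) VALUES (?, 'net-1')",
        [(f"subnet-{i}",) for i in range(subnets)],
    )
    conn.executemany(
        "INSERT INTO networkrbacs (object_id) VALUES (?)",
        [("net-1",)] * rbac_entries,
    )
    # The two networkrbacs joins mirror the conditions quoted in the report;
    # the subnets join is an assumption about the rest of the query.
    (rows,) = conn.execute(
        """
        SELECT COUNT(*)
        FROM networks
        LEFT OUTER JOIN subnets AS subnets_1
            ON networks.id = subnets_1.network_id
        LEFT OUTER JOIN networkrbacs AS networkrbacs_1
            ON subnets_1.network_id = networkrbacs_1.object_id
        LEFT OUTER JOIN networkrbacs AS networkrbacs_2
            ON networks.id = networkrbacs_2.object_id
        """
    ).fetchone()
    conn.close()
    return rows


print(count_joined_rows(12))             # 12 * 12 = 144
print(count_joined_rows(86, subnets=2))  # 2 * 86 * 86 = 14792
```

Because both joins match every RBAC row of the same network, each row from the first join is paired with every row from the second, giving subnets * N * N rows for N RBAC entries. The two results happen to coincide with the counts reported for the networks with 12 and 86 entries, although the report does not say how many subnets those networks have.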

I'm not sure I read the code correctly, but I see rbac_entries in both the Subnet and Network models here:
https://git.openstack.org/cgit/openstack/neutron/tree/neutron/db/models_v2.py

Please find a way to remove the (network RBAC entries)^2 correlation, because it is dangerous and affects many Horizon operations, certainly these:
- networks list
- instances list
- "Launch instance" form creation

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1650512

Title:
  Slow net-list command, over 15K returned records by SQL

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1650512/+subscriptions