yahoo-eng-team team mailing list archive

[Bug 1599256] [NEW] instance_get_all_by_filters can perform unnecessary joins

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Andrew Garner <1599256@xxxxxxxxxxxxxxxxxx>
Date: Tue, 05 Jul 2016 18:20:52 -0000
Reply-to: Bug 1599256 <1599256@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

When listing server details, instance_get_all_by_filters() can be
invoked with duplicates in the columns_to_join list.  This can result in
both a join and a separate query against the same "manual_join"
tables.

This appears to have been introduced with the behavior in this commit:

https://github.com/openstack/nova/commit/2e68b2298e94a15d1282c0fb46804b9efa6c8b3a

With that change, the list of expected_attr built here:

https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/views/servers.py#L55
https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/views/servers.py#L110

is further extended with the expected_attr list in the compute_api here:

https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2092

resulting in a columns_to_join list resembling:

['flavor', 'info_cache', 'metadata', 'metadata', 'system_metadata',
'info_cache', 'security_groups']

In nova.db.sqlalchemy.api:_manual_join_columns(), only the first
'metadata' entry gets removed, so the query both carries a SQLAlchemy
joinedload() hint (joining against instance_metadata) and separately
queries instance_metadata via _instances_fill_metadata().  The
'metadata' join in particular can be rather inefficient.  In some cases
this results in about 10x the data being pulled from the database
compared to just the "manual join".  The problem is particularly
amplified for projects with a large number of associated instances.
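
To illustrate the mechanism, here is a minimal standalone sketch.  It is
not nova's code: MANUAL_JOIN_COLUMNS, split_columns_first_only() and
split_columns_deduplicated() are hypothetical names, and the set of
"manual join" columns is assumed for the example.  It only shows that
list.remove() drops the first matching entry, so a duplicated 'metadata'
stays in the list used for joins, and that de-duplicating first is one
possible way to avoid that.

```python
import copy

# Assumed for illustration: columns loaded by separate queries
# ("manual joins") instead of SQL joins.
MANUAL_JOIN_COLUMNS = ('metadata', 'system_metadata')


def split_columns_first_only(columns_to_join):
    """Split columns the way the report describes: remove one entry each."""
    remaining = copy.copy(columns_to_join)
    manual = []
    for column in MANUAL_JOIN_COLUMNS:
        if column in remaining:
            # list.remove() deletes only the first occurrence, so a
            # duplicate entry survives in `remaining`.
            remaining.remove(column)
            manual.append(column)
    return manual, remaining


def split_columns_deduplicated(columns_to_join):
    """One possible fix: drop duplicates (keeping order) before splitting."""
    unique = list(dict.fromkeys(columns_to_join))
    manual = [c for c in unique if c in MANUAL_JOIN_COLUMNS]
    remaining = [c for c in unique if c not in MANUAL_JOIN_COLUMNS]
    return manual, remaining


columns = ['flavor', 'info_cache', 'metadata', 'metadata',
           'system_metadata', 'info_cache', 'security_groups']

buggy_manual, buggy_joined = split_columns_first_only(columns)
fixed_manual, fixed_joined = split_columns_deduplicated(columns)

# With the duplicate present, 'metadata' is both manually queried and
# still left in the list that would be turned into join hints.
print('metadata' in buggy_manual, 'metadata' in buggy_joined)
print('metadata' in fixed_manual, 'metadata' in fixed_joined)
```

With the list from this report, the first function leaves 'metadata' in
both results, while the de-duplicating variant leaves it only in the
manual list.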

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1599256


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1599256/+subscriptions

Follow ups

[Bug 1599256] Re: instance_get_all_by_filters can perform unnecessary joins
From: OpenStack Infra, 2016-10-03