yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #84316
[Bug 1799298] Re: Metadata API cross joining instance_metadata and instance_system_metadata
Reviewed: https://review.opendev.org/758928
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e728fe668a6de886455f2dbaf655c8a151462c8c
Submitter: Zuul
Branch: master
commit e728fe668a6de886455f2dbaf655c8a151462c8c
Author: melanie witt <melwittt@xxxxxxxxx>
Date: Tue Oct 20 21:46:13 2020 +0000
Use subqueryload() instead of joinedload() for (system_)metadata
Currently, when we "get" a single instance from the database and we
load metadata and system_metadata, we do so using a joinedload() which
does JOINs with the respective tables. Because of the one-to-many
relationship between an instance and (system_)metadata records, doing
the database query this way can result in a large number of additional
rows being returned unnecessarily and cause a large data transfer.
This is similar to the problem addressed by change
I0610fb16ccce2ee95c318589c8abcc30613a3fe9 which added separate queries
for (system_)metadata when we "get" multiple instances. We don't,
however, reuse the same code for this change because
_instances_fill_metadata converts the instance database object to a
dict, and some callers of _instance_get_by_uuid need to be able to
access an instance database object attached to the session (example:
instance_update_and_get_original).
By using subqueryload() [1], we can perform the additional queries for
(system_)metadata to solve the problem with a similar approach.
Closes-Bug: #1799298
[1] https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#subquery-eager-loading
Change-Id: I5c071f70f669966e9807b38e99077c1cae5b4606
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1799298
Title:
Metadata API cross joining instance_metadata and
instance_system_metadata
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) ocata series:
Triaged
Status in OpenStack Compute (nova) pike series:
Triaged
Status in OpenStack Compute (nova) queens series:
Triaged
Status in OpenStack Compute (nova) rocky series:
Triaged
Status in OpenStack Compute (nova) stein series:
New
Status in OpenStack Compute (nova) train series:
New
Status in OpenStack Compute (nova) ussuri series:
New
Status in OpenStack Compute (nova) victoria series:
New
Status in OpenStack Security Advisory:
Incomplete
Bug description:
Description
===========
While troubleshooting a production issue we identified that the Nova
metadata API is fetching a lot more raw data from the database than
seems necessary. The problem appears to be caused by the SQL query
used to fetch instance data, which joins the "instance" table with,
among others, two metadata tables: "instance_metadata" and
"instance_system_metadata". Below is a simplified version of this
query which was captured by adding extra logging (the full query is
listed at the end of this bug report):
SELECT ...
FROM (SELECT ...
FROM `instances`
WHERE `instances` . `deleted` = ?
AND `instances` . `uuid` = ?
LIMIT ?) AS `anon_1`
LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` . `instance_uuid`
LEFT OUTER JOIN (`security_group_instance_association` AS `security_group_instance_association_1`
INNER JOIN `security_groups` AS `security_groups_1`
ON `security_groups_1` . `id` = `security_group_instance_association_1` . `security_group_id`
AND `security_group_instance_association_1` . `deleted` = ?
AND `security_groups_1` . `deleted` = ? )
ON `security_group_instance_association_1` . `instance_uuid` = `anon_1` . `instances_uuid`
AND `anon_1` . `instances_deleted` = ?
LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
ON `security_group_rules_1` . `parent_group_id` = `security_groups_1` . `id`
AND `security_group_rules_1` . `deleted` = ?
LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
ON `instance_info_caches_1` . `instance_uuid` = `anon_1` . `instances_uuid`
LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
ON `instance_metadata_1` . `instance_uuid` = `anon_1` . `instances_uuid`
AND `instance_metadata_1` . `deleted` = ?
The instance table has a 1-to-many relationship to both
"instance_metadata" and "instance_system_metadata" tables, so the
query is effectively producing a cross join of both metadata tables.
Steps to reproduce
==================
To illustrate the impact of this query, add 2 properties to a running
instance and verify that it has 2 records in "instance_metadata", as
well as other records in "instance_system_metadata" such as base image
properties:
> select instance_uuid,`key`,value from instance_metadata where instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+--------------------------------------+-----------+--------+
| instance_uuid | key | value |
+--------------------------------------+-----------+--------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value |
+--------------------------------------+-----------+--------+
2 rows in set (0.61 sec)
> select instance_uuid,`key`,valusystem_metadata where instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+------------------------+--------------------------------------+
| key | value |
+------------------------+--------------------------------------+
| image_disk_format | qcow2 |
| image_min_ram | 0 |
| image_min_disk | 20 |
| image_base_image_ref | 39cd564f-6a29-43e2-815b-62097968486a |
| image_container_format | bare |
+------------------------+--------------------------------------+
5 rows in set (0.00 sec)
For this particular instance, the generated query used by the metadata
API will fetch 10 records from the database:
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
| anon_1_instances_uuid | instance_metadata_1_key | instance_metadata_1_value | instance_system_metadata_1_key | instance_system_metadata_1_value |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_disk_format | qcow2 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_disk_format | qcow2 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_min_ram | 0 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_min_ram | 0 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_min_disk | 20 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_min_disk | 20 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_base_image_ref | 39cd564f-6a29-43e2-815b-62097968486a |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_base_image_ref | 39cd564f-6a29-43e2-815b-62097968486a |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_container_format | bare |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_container_format | bare |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
10 rows in set (0.00 sec)
Of course this is only a problem when instances have a lot of metadata
records. An instance with 50 rows in "instance_metadata" and 50 rows
in "instance_system_metadata" will fetch 50 x 50 = 2,500 rows from the
database. It's not difficult to see how this can escalate quickly.
This can be a particularly significant problem in a HA scenario with
multiple API nodes pulling data from multiple database nodes.
This issue is affecting clusters running OpenStack Mitaka. I verified
that this is not an issue on clusters running Icehouse because, in
Icehouse, instance data is pulled as needed, executing separate
queries for each table. However, I as far as I could see, this issue
could be affecting every release since Mitaka.
Full SQL
========
The generated SQL query below was captured by adding extra logging to
_build_instance_get in
https://github.com/openstack/nova/blob/mitaka-
eol/nova/db/sqlalchemy/api.py#L2005
SELECT `anon_1` . `instances_created_at` AS `anon_1_instances_created_at`,
`anon_1` . `instances_updated_at` AS `anon_1_instances_updated_at`,
`anon_1` . `instances_deleted_at` AS `anon_1_instances_deleted_at`,
`anon_1` . `instances_deleted` AS `anon_1_instances_deleted`,
`anon_1` . `instances_id` AS `anon_1_instances_id`,
`anon_1` . `instances_user_id` AS `anon_1_instances_user_id`,
`anon_1` . `instances_project_id` AS `anon_1_instances_project_id`,
`anon_1` . `instances_image_ref` AS `anon_1_instances_image_ref`,
`anon_1` . `instances_kernel_id` AS `anon_1_instances_kernel_id`,
`anon_1` . `instances_ramdisk_id` AS `anon_1_instances_ramdisk_id`,
`anon_1` . `instances_hostname` AS `anon_1_instances_hostname`,
`anon_1` . `instances_launch_index` AS `anon_1_instances_launch_index`,
`anon_1` . `instances_key_name` AS `anon_1_instances_key_name`,
`anon_1` . `instances_key_data` AS `anon_1_instances_key_data`,
`anon_1` . `instances_power_state` AS `anon_1_instances_power_state`,
`anon_1` . `instances_vm_state` AS `anon_1_instances_vm_state`,
`anon_1` . `instances_task_state` AS `anon_1_instances_task_state`,
`anon_1` . `instances_memory_mb` AS `anon_1_instances_memory_mb`,
`anon_1` . `instances_vcpus` AS `anon_1_instances_vcpus`,
`anon_1` . `instances_root_gb` AS `anon_1_instances_root_gb`,
`anon_1` . `instances_ephemeral_gb` AS `anon_1_instances_ephemeral_gb`,
`anon_1` . `instances_ephemeral_key_uuid` AS `anon_1_instances_ephemeral_key_uuid`,
`anon_1` . `instances_host` AS `anon_1_instances_host`,
`anon_1` . `instances_node` AS `anon_1_instances_node`,
`anon_1` . `instances_instance_type_id` AS `anon_1_instances_instance_type_id`,
`anon_1` . `instances_user_data` AS `anon_1_instances_user_data`,
`anon_1` . `instances_reservation_id` AS `anon_1_instances_reservation_id`,
`anon_1` . `instances_launched_at` AS `anon_1_instances_launched_at`,
`anon_1` . `instances_terminated_at` AS `anon_1_instances_terminated_at`,
`anon_1` . `instances_availability_zone` AS `anon_1_instances_availability_zone`,
`anon_1` . `instances_display_name` AS `anon_1_instances_display_name`,
`anon_1` . `instances_display_description` AS `anon_1_instances_display_description`,
`anon_1` . `instances_launched_on` AS `anon_1_instances_launched_on`,
`anon_1` . `instances_locked` AS `anon_1_instances_locked`,
`anon_1` . `instances_locked_by` AS `anon_1_instances_locked_by`,
`anon_1` . `instances_os_type` AS `anon_1_instances_os_type`,
`anon_1` . `instances_architecture` AS `anon_1_instances_architecture`,
`anon_1` . `instances_vm_mode` AS `anon_1_instances_vm_mode`,
`anon_1` . `instances_uuid` AS `anon_1_instances_uuid`,
`anon_1` . `instances_root_device_name` AS `anon_1_instances_root_device_name`,
`anon_1` . `instances_default_ephemeral_device` AS `anon_1_instances_default_ephemeral_device`,
`anon_1` . `instances_default_swap_device` AS `anon_1_instances_default_swap_device`,
`anon_1` . `instances_config_drive` AS `anon_1_instances_config_drive`,
`anon_1` . `instances_access_ip_v4` AS `anon_1_instances_access_ip_v4`,
`anon_1` . `instances_access_ip_v6` AS `anon_1_instances_access_ip_v6`,
`anon_1` . `instances_auto_disk_config` AS `anon_1_instances_auto_disk_config`,
`anon_1` . `instances_progress` AS `anon_1_instances_progress`,
`anon_1` . `instances_shutdown_terminate` AS `anon_1_instances_shutdown_terminate`,
`anon_1` . `instances_disable_terminate` AS `anon_1_instances_disable_terminate`,
`anon_1` . `instances_cell_name` AS `anon_1_instances_cell_name`,
`anon_1` . `instances_internal_id` AS `anon_1_instances_internal_id`,
`anon_1` . `instances_cleaned` AS `anon_1_instances_cleaned`,
`instance_system_metadata_1` . `created_at` AS `instance_system_metadata_1_created_at`,
`instance_system_metadata_1` . `updated_at` AS `instance_system_metadata_1_updated_at`,
`instance_system_metadata_1` . `deleted_at` AS `instance_system_metadata_1_deleted_at`,
`instance_system_metadata_1` . `deleted` AS `instance_system_metadata_1_deleted`,
`instance_system_metadata_1` . `id` AS `instance_system_metadata_1_id`,
`instance_system_metadata_1` . `key` AS `instance_system_metadata_1_key`,
`instance_system_metadata_1` . `value` AS `instance_system_metadata_1_value`,
`instance_system_metadata_1` . `instance_uuid` AS `instance_system_metadata_1_instance_uuid`,
`security_groups_1` . `created_at` AS `security_groups_1_created_at`,
`security_groups_1` . `updated_at` AS `security_groups_1_updated_at`,
`security_groups_1` . `deleted_at` AS `security_groups_1_deleted_at`,
`security_groups_1` . `deleted` AS `security_groups_1_deleted`,
`security_groups_1` . `id` AS `security_groups_1_id`,
`security_groups_1` . `name` AS `security_groups_1_name`,
`security_groups_1` . `description` AS `security_groups_1_description`,
`security_groups_1` . `user_id` AS `security_groups_1_user_id`,
`security_groups_1` . `project_id` AS `security_groups_1_project_id`,
`security_group_rules_1` . `created_at` AS `security_group_rules_1_created_at`,
`security_group_rules_1` . `updated_at` AS `security_group_rules_1_updated_at`,
`security_group_rules_1` . `deleted_at` AS `security_group_rules_1_deleted_at`,
`security_group_rules_1` . `deleted` AS `security_group_rules_1_deleted`,
`security_group_rules_1` . `id` AS `security_group_rules_1_id`,
`security_group_rules_1` . `parent_group_id` AS `security_group_rules_1_parent_group_id`,
`security_group_rules_1` . `protocol` AS `security_group_rules_1_protocol`,
`security_group_rules_1` . `from_port` AS `security_group_rules_1_from_port`,
`security_group_rules_1` . `to_port` AS `security_group_rules_1_to_port`,
`security_group_rules_1` . `cidr` AS `security_group_rules_1_cidr`,
`security_group_rules_1` . `group_id` AS `security_group_rules_1_group_id`,
`instance_info_caches_1` . `created_at` AS `instance_info_caches_1_created_at`,
`instance_info_caches_1` . `updated_at` AS `instance_info_caches_1_updated_at`,
`instance_info_caches_1` . `deleted_at` AS `instance_info_caches_1_deleted_at`,
`instance_info_caches_1` . `deleted` AS `instance_info_caches_1_deleted`,
`instance_info_caches_1` . `id` AS `instance_info_caches_1_id`,
`instance_info_caches_1` . `network_info` AS `instance_info_caches_1_network_info`,
`instance_info_caches_1` . `instance_uuid` AS `instance_info_caches_1_instance_uuid`,
`instance_extra_1` . `flavor` AS `instance_extra_1_flavor`,
`instance_extra_1` . `created_at` AS `instance_extra_1_created_at`,
`instance_extra_1` . `updated_at` AS `instance_extra_1_updated_at`,
`instance_extra_1` . `deleted_at` AS `instance_extra_1_deleted_at`,
`instance_extra_1` . `deleted` AS `instance_extra_1_deleted`,
`instance_extra_1` . `id` AS `instance_extra_1_id`,
`instance_extra_1` . `instance_uuid` AS `instance_extra_1_instance_uuid`,
`instance_metadata_1` . `created_at` AS `instance_metadata_1_created_at`,
`instance_metadata_1` . `updated_at` AS `instance_metadata_1_updated_at`,
`instance_metadata_1` . `deleted_at` AS `instance_metadata_1_deleted_at`,
`instance_metadata_1` . `deleted` AS `instance_metadata_1_deleted`,
`instance_metadata_1` . `id` AS `instance_metadata_1_id`,
`instance_metadata_1` . `key` AS `instance_metadata_1_key`,
`instance_metadata_1` . `value` AS `instance_metadata_1_value`,
`instance_metadata_1` . `instance_uuid` AS `instance_metadata_1_instance_uuid`
FROM (SELECT `instances` . `created_at` AS `instances_created_at`,
`instances` . `updated_at` AS `instances_updated_at`,
`instances` . `deleted_at` AS `instances_deleted_at`,
`instances` . `deleted` AS `instances_deleted`,
`instances` . `id` AS `instances_id`,
`instances` . `user_id` AS `instances_user_id`,
`instances` . `project_id` AS `instances_project_id`,
`instances` . `image_ref` AS `instances_image_ref`,
`instances` . `kernel_id` AS `instances_kernel_id`,
`instances` . `ramdisk_id` AS `instances_ramdisk_id`,
`instances` . `hostname` AS `instances_hostname`,
`instances` . `launch_index` AS `instances_launch_index`,
`instances` . `key_name` AS `instances_key_name`,
`instances` . `key_data` AS `instances_key_data`,
`instances` . `power_state` AS `instances_power_state`,
`instances` . `vm_state` AS `instances_vm_state`,
`instances` . `task_state` AS `instances_task_state`,
`instances` . `memory_mb` AS `instances_memory_mb`,
`instances` . `vcpus` AS `instances_vcpus`,
`instances` . `root_gb` AS `instances_root_gb`,
`instances` . `ephemeral_gb` AS `instances_ephemeral_gb`,
`instances` . `ephemeral_key_uuid` AS `instances_ephemeral_key_uuid`,
`instances` . `host` AS `instances_host`,
`instances` . `node` AS `instances_node`,
`instances` . `instance_type_id` AS `instances_instance_type_id`,
`instances` . `user_data` AS `instances_user_data`,
`instances` . `reservation_id` AS `instances_reservation_id`,
`instances` . `launched_at` AS `instances_launched_at`,
`instances` . `terminated_at` AS `instances_terminated_at`,
`instances` . `availability_zone` AS `instances_availability_zone`,
`instances` . `display_name` AS `instances_display_name`,
`instances` . `display_description` AS `instances_display_description`,
`instances` . `launched_on` AS `instances_launched_on`,
`instances` . `locked` AS `instances_locked`,
`instances` . `locked_by` AS `instances_locked_by`,
`instances` . `os_type` AS `instances_os_type`,
`instances` . `architecture` AS `instances_architecture`,
`instances` . `vm_mode` AS `instances_vm_mode`,
`instances` . `uuid` AS `instances_uuid`,
`instances` . `root_device_name` AS `instances_root_device_name`,
`instances` . `default_ephemeral_device` AS `instances_default_ephemeral_device`,
`instances` . `default_swap_device` AS `instances_default_swap_device`,
`instances` . `config_drive` AS `instances_config_drive`,
`instances` . `access_ip_v4` AS `instances_access_ip_v4`,
`instances` . `access_ip_v6` AS `instances_access_ip_v6`,
`instances` . `auto_disk_config` AS `instances_auto_disk_config`,
`instances` . `progress` AS `instances_progress`,
`instances` . `shutdown_terminate` AS `instances_shutdown_terminate`,
`instances` . `disable_terminate` AS `instances_disable_terminate`,
`instances` . `cell_name` AS `instances_cell_name`,
`instances` . `internal_id` AS `instances_internal_id`,
`instances` . `cleaned` AS `instances_cleaned`
FROM `instances`
WHERE `instances` . `deleted` = ?
AND `instances` . `uuid` = ?
LIMIT ?) AS `anon_1`
LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` . `instance_uuid`
LEFT OUTER JOIN (`security_group_instance_association` AS `security_group_instance_association_1`
INNER JOIN `security_groups` AS `security_groups_1`
ON `security_groups_1` . `id` = `security_group_instance_association_1` . `security_group_id`
AND `security_group_instance_association_1` . `deleted` = ?
AND `security_groups_1` . `deleted` = ? )
ON `security_group_instance_association_1` . `instance_uuid` = `anon_1` . `instances_uuid`
AND `anon_1` . `instances_deleted` = ?
LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
ON `security_group_rules_1` . `parent_group_id` = `security_groups_1` . `id`
AND `security_group_rules_1` . `deleted` = ?
LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
ON `instance_info_caches_1` . `instance_uuid` = `anon_1` . `instances_uuid`
LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
ON `instance_metadata_1` . `instance_uuid` = `anon_1` . `instances_uuid`
AND `instance_metadata_1` . `deleted` = ?
Environment
===========
1. Exact version of OpenStack you are running.
OpenStack Mitaka
openstack-nova-api.noarch 1:13.1.4-1.el7
Code suggests this could be affecting every release since Mitaka.
2. Which hypervisor did you use?
qemu-kvm.x86_64 10:1.5.3-141.el7_4.6
qemu-kvm-common.x86_64 10:1.5.3-141.el7_4.6
Reference
=========
Discussion on the OpenStack dev mailing list:
http://lists.openstack.org/pipermail/openstack-
dev/2018-October/135945.html
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1799298/+subscriptions
References