yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #47567
[Bug 1553184] Re: Slow Query on Instances Table
Reviewed: https://review.openstack.org/288765
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9bc935d7d059c5ad9ff563b373691c3134e8f3ac
Submitter: Jenkins
Branch: master
commit 9bc935d7d059c5ad9ff563b373691c3134e8f3ac
Author: Matt Riedemann <mriedem@xxxxxxxxxx>
Date: Fri Mar 4 17:07:41 2016 -0500
Add index on instances table across deleted/created_at columns
The process_sort_params method, used by instance_get_all_by_filters_sort,
defaults to ordering by the created_at and id columns.
Adding an index on the deleted and created_at columns significantly
reduces the time it takes to run the query on the instances table.
From the bug report:
Instead of 993,126 rows, it only grabbed 1490.
And drops from 10.61 seconds to 0.28 seconds on the data set in
the bug report.
The index was provided by Joseph bajin <josephbajin@xxxxxxxxx>
in the bug report.
Change-Id: I25ec8a0b8124d77926d938dc911464cc8b9d2945
Closes-Bug: #1553184
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1553184
Title:
Slow Query on Instances Table
Status in OpenStack Compute (nova):
Fix Released
Bug description:
We are currently running Juno (2014.2.4).
We have a few tenants that have a lot of instances that are created and deleted so they have a lot of entries in the instances table.
I see the following bug that was not brought into Juno but was
delivered in Kilo that was suppose to help with this same type of
issue, but it didn't seem like that worked at all.
https://bugs.launchpad.net/nova/+bug/1378395
After reviewing the query a bit more, I found that another Index could
be added that dramatically dropped the time the query took to run.
This the query in mention:
SELECT anon_1.instances_deleted_at AS anon_1_instances_deleted_at, anon_1.instances_deleted AS anon_1_instances_deleted, anon_1.instances_created_at AS anon_1_instances_created_at, anon_1.instances_updated_at AS anon_1_instances_updated_at, anon_1.instances_id AS anon_1_instances_id, anon_1.instances_user_id AS anon_1_instances_user_id, anon_1.instances_project_id AS anon_1_instances_project_id, anon_1.instances_image_ref AS anon_1_instances_image_ref, anon_1.instances_kernel_id AS anon_1_instances_kernel_id, anon_1.instances_ramdisk_id AS anon_1_instances_ramdisk_id, anon_1.instances_hostname AS anon_1_instances_hostname, anon_1.instances_launch_index AS anon_1_instances_launch_index, anon_1.instances_key_name AS anon_1_instances_key_name, anon_1.instances_key_data AS anon_1_instances_key_data, anon_1.instances_power_state AS anon_1_instances_power_state, anon_1.instances_vm_state AS anon_1_instances_vm_state, anon_1.instances_task_state AS anon_1_instances_task_state, anon_1.instances_memory_mb AS anon_1_instances_memory_mb, anon_1.instances_vcpus AS anon_1_instances_vcpus, anon_1.instances_root_gb AS anon_1_instances_root_gb, anon_1.instances_ephemeral_gb AS anon_1_instances_ephemeral_gb, anon_1.instances_ephemeral_key_uuid AS anon_1_instances_ephemeral_key_uuid, anon_1.instances_host AS anon_1_instances_host, anon_1.instances_node AS anon_1_instances_node, anon_1.instances_instance_type_id AS anon_1_instances_instance_type_id, anon_1.instances_user_data AS anon_1_instances_user_data, anon_1.instances_reservation_id AS anon_1_instances_reservation_id, anon_1.instances_scheduled_at AS anon_1_instances_scheduled_at, anon_1.instances_launched_at AS anon_1_instances_launched_at, anon_1.instances_terminated_at AS anon_1_instances_terminated_at, anon_1.instances_availability_zone AS anon_1_instances_availability_zone, anon_1.instances_display_name AS anon_1_instances_display_name, anon_1.instances_display_description AS anon_1_instances_display_description, anon_1.instances_launched_on AS anon_1_instances_launched_on, anon_1.instances_locked AS anon_1_instances_locked, anon_1.instances_locked_by AS anon_1_instances_locked_by, anon_1.instances_os_type AS anon_1_instances_os_type, anon_1.instances_architecture AS anon_1_instances_architecture, anon_1.instances_vm_mode AS anon_1_instances_vm_mode, anon_1.instances_uuid AS anon_1_instances_uuid, anon_1.instances_root_device_name AS anon_1_instances_root_device_name, anon_1.instances_default_ephemeral_device AS anon_1_instances_default_ephemeral_device, anon_1.instances_default_swap_device AS anon_1_instances_default_swap_device, anon_1.instances_config_drive AS anon_1_instances_config_drive, anon_1.instances_access_ip_v4 AS anon_1_instances_access_ip_v4, anon_1.instances_access_ip_v6 AS anon_1_instances_access_ip_v6, anon_1.instances_auto_disk_config AS anon_1_instances_auto_disk_config, anon_1.instances_progress AS anon_1_instances_progress, anon_1.instances_shutdown_terminate AS anon_1_instances_shutdown_terminate, anon_1.instances_disable_terminate AS anon_1_instances_disable_terminate, anon_1.instances_cell_name AS anon_1_instances_cell_name, anon_1.instances_internal_id AS anon_1_instances_internal_id, anon_1.instances_cleaned AS anon_1_instances_cleaned, instance_info_caches_1.deleted_at AS instance_info_caches_1_deleted_at, instance_info_caches_1.deleted AS instance_info_caches_1_deleted, instance_info_caches_1.created_at AS instance_info_caches_1_created_at, instance_info_caches_1.updated_at AS instance_info_caches_1_updated_at, instance_info_caches_1.id AS instance_info_caches_1_id, instance_info_caches_1.network_info AS instance_info_caches_1_network_info, instance_info_caches_1.instance_uuid AS instance_info_caches_1_instance_uuid, security_groups_1.deleted_at AS security_groups_1_deleted_at, security_groups_1.deleted AS security_groups_1_deleted, security_groups_1.created_at AS security_groups_1_created_at, security_groups_1.updated_at AS security_groups_1_updated_at, security_groups_1.id AS security_groups_1_id, security_groups_1.name AS security_groups_1_name, security_groups_1.description AS security_groups_1_description, security_groups_1.user_id AS security_groups_1_user_id, security_groups_1.project_id AS security_groups_1_project_id
FROM (SELECT instances.deleted_at AS instances_deleted_at, instances.deleted AS instances_deleted, instances.created_at AS instances_created_at, instances.updated_at AS instances_updated_at, instances.id AS instances_id, instances.user_id AS instances_user_id, instances.project_id AS instances_project_id, instances.image_ref AS instances_image_ref, instances.kernel_id AS instances_kernel_id, instances.ramdisk_id AS instances_ramdisk_id, instances.hostname AS instances_hostname, instances.launch_index AS instances_launch_index, instances.key_name AS instances_key_name, instances.key_data AS instances_key_data, instances.power_state AS instances_power_state, instances.vm_state AS instances_vm_state, instances.task_state AS instances_task_state, instances.memory_mb AS instances_memory_mb, instances.vcpus AS instances_vcpus, instances.root_gb AS instances_root_gb, instances.ephemeral_gb AS instances_ephemeral_gb, instances.ephemeral_key_uuid AS instances_ephemeral_key_uuid, instances.host AS instances_host, instances.node AS instances_node, instances.instance_type_id AS instances_instance_type_id, instances.user_data AS instances_user_data, instances.reservation_id AS instances_reservation_id, instances.scheduled_at AS instances_scheduled_at, instances.launched_at AS instances_launched_at, instances.terminated_at AS instances_terminated_at, instances.availability_zone AS instances_availability_zone, instances.display_name AS instances_display_name, instances.display_description AS instances_display_description, instances.launched_on AS instances_launched_on, instances.locked AS instances_locked, instances.locked_by AS instances_locked_by, instances.os_type AS instances_os_type, instances.architecture AS instances_architecture, instances.vm_mode AS instances_vm_mode, instances.uuid AS instances_uuid, instances.root_device_name AS instances_root_device_name, instances.default_ephemeral_device AS instances_default_ephemeral_device, instances.default_swap_device AS instances_default_swap_device, instances.config_drive AS instances_config_drive, instances.access_ip_v4 AS instances_access_ip_v4, instances.access_ip_v6 AS instances_access_ip_v6, instances.auto_disk_config AS instances_auto_disk_config, instances.progress AS instances_progress, instances.shutdown_terminate AS instances_shutdown_terminate, instances.disable_terminate AS instances_disable_terminate, instances.cell_name AS instances_cell_name, instances.internal_id AS instances_internal_id, instances.cleaned AS instances_cleaned
FROM instances
WHERE instances.deleted = 0 AND (instances.vm_state != 'soft-delete' OR instances.vm_state IS NULL) ORDER BY instances.created_at DESC, instances.created_at DESC, instances.created_at DESC, instances.id DESC
LIMIT 1000) AS anon_1 LEFT OUTER JOIN instance_info_caches AS instance_info_caches_1 ON instance_info_caches_1.instance_uuid = anon_1.instances_uuid LEFT OUTER JOIN (security_group_instance_association AS security_group_instance_association_1 INNER JOIN security_groups AS security_groups_1 ON security_groups_1.id = security_group_instance_association_1.security_group_id AND security_group_instance_association_1.deleted = 0 AND security_groups_1.deleted = 0) ON security_group_instance_association_1.instance_uuid = anon_1.instances_uuid AND anon_1.instances_deleted = 0 ORDER BY anon_1.instances_created_at DESC, anon_1.instances_created_at DESC, anon_1.instances_created_at DESC, anon_1.instances_id DESC
This is what the explain plan looks like pre-indexes:
Explain Before Indexes:
+----+-------------+---------------------------------------+--------+-------------------------------------------------------------------------+-------------------------------------------------------+---------+--------------------------------------------------------------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------------------------+--------+-------------------------------------------------------------------------+-------------------------------------------------------+---------+--------------------------------------------------------------+--------+-----------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 994 | Using filesort |
| 1 | PRIMARY | instance_info_caches_1 | eq_ref | uniq_instance_info_caches0instance_uuid | uniq_instance_info_caches0instance_uuid | 110 | anon_1.instances_uuid | 1 | |
| 1 | PRIMARY | security_group_instance_association_1 | ref | security_group_id,security_group_instance_association_instance_uuid_idx | security_group_instance_association_instance_uuid_idx | 111 | anon_1.instances_uuid | 1 | |
| 1 | PRIMARY | security_groups_1 | eq_ref | PRIMARY | PRIMARY | 4 | nova.security_group_instance_association_1.security_group_id | 1 | |
| 2 | DERIVED | instances | ALL | NULL | NULL | NULL | NULL | 993123 | Using where; Using filesort |
+----+-------------+---------------------------------------+--------+-------------------------------------------------------------------------+-------------------------------------------------------+---------+--------------------------------------------------------------+--------+-----------------------------+
5 rows in set (10.61 sec)
You can see that it takes 10.61 second to return this.
After reviewing the full query, I saw this derived query that was causing most of the issues. This was the query:
SELECT instances.deleted_at AS instances_deleted_at, instances.deleted AS instances_deleted, instances.created_at AS instances_created_at, instances.updated_at AS instances_updated_at, instances.id AS instances_id, instances.user_id AS instances_user_id, instances.project_id AS instances_project_id, instances.image_ref AS instances_image_ref, instances.kernel_id AS instances_kernel_id, instances.ramdisk_id AS instances_ramdisk_id, instances.hostname AS instances_hostname, instances.launch_index AS instances_launch_index, instances.key_name AS instances_key_name, instances.key_data AS instances_key_data, instances.power_state AS instances_power_state, instances.vm_state AS instances_vm_state, instances.task_state AS instances_task_state, instances.memory_mb AS instances_memory_mb, instances.vcpus AS instances_vcpus, instances.root_gb AS instances_root_gb, instances.ephemeral_gb AS instances_ephemeral_gb, instances.ephemeral_key_uuid AS instances_ephemeral_key_uuid, instances.host AS instances_host, instances.node AS instances_node, instances.instance_type_id AS instances_instance_type_id, instances.user_data AS instances_user_data, instances.reservation_id AS instances_reservation_id, instances.scheduled_at AS instances_scheduled_at, instances.launched_at AS instances_launched_at, instances.terminated_at AS instances_terminated_at, instances.availability_zone AS instances_availability_zone, instances.display_name AS instances_display_name, instances.display_description AS instances_display_description, instances.launched_on AS instances_launched_on, instances.locked AS instances_locked, instances.locked_by AS instances_locked_by, instances.os_type AS instances_os_type, instances.architecture AS instances_architecture, instances.vm_mode AS instances_vm_mode, instances.uuid AS instances_uuid, instances.root_device_name AS instances_root_device_name, instances.default_ephemeral_device AS instances_default_ephemeral_device, instances.default_swap_device AS instances_default_swap_device, instances.config_drive AS instances_config_drive, instances.access_ip_v4 AS instances_access_ip_v4, instances.access_ip_v6 AS instances_access_ip_v6, instances.auto_disk_config AS instances_auto_disk_config, instances.progress AS instances_progress, instances.shutdown_terminate AS instances_shutdown_terminate, instances.disable_terminate AS instances_disable_terminate, instances.cell_name AS instances_cell_name, instances.internal_id AS instances_internal_id, instances.cleaned AS instances_cleaned
FROM instances
WHERE instances.deleted = 0 AND (instances.vm_state != 'soft-delete' OR instances.vm_state IS NULL) ORDER BY instances.created_at DESC, instances.created_at DESC, instances.created_at DESC, instances.id DESC
LIMIT 1000)
This is what the explain plan looked like with this query:
+----+-------------+-----------+------+---------------+------+---------+------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+------+---------+------+--------+-----------------------------+
| 1 | SIMPLE | instances | ALL | NULL | NULL | NULL | NULL | 993126 | Using where; Using filesort |
+----+-------------+-----------+------+---------------+------+---------+------+--------+-----------------------------+
1 row in set (0.00 sec)
That's a lot of data to sort through.
I went ahead and added this index which is found in the above mentioned bug:
CREATE INDEX `instances_project_id_deleted_idx` ON `instances` (`project_id`, `deleted`);
and that didn't seem to solve the problem.
As I kept digging into that derived query some more, I found that it didn't have any index it could use because of the ORDER BY statement.
Adding this index to the table looks to have resolved the issue:
CREATE INDEX `instances_deleted_created_at_idx` ON `instances` (`deleted`, `created_at`);
That made the explain plan for the derived query look like this:
+----+-------------+-----------+------+----------------------------------+----------------------------------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+----------------------------------+----------------------------------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | instances | ref | instances_deleted_created_at_idx | instances_deleted_created_at_idx | 5 | const | 1490 | Using where; Using filesort |
+----+-------------+-----------+------+----------------------------------+----------------------------------+---------+-------+------+-----------------------------+
Instead of 993,126 rows, it only grabbed 1490.
This made the entire large query explain plan look like this:
+----+-------------+---------------------------------------+--------+-------------------------------------------------------------------------+-------------------------------------------------------+---------+--------------------------------------------------------------+---------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------------------------+--------+-------------------------------------------------------------------------+-------------------------------------------------------+---------+--------------------------------------------------------------+---------+-----------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1000 | Using filesort |
| 1 | PRIMARY | instance_info_caches_1 | eq_ref | uniq_instance_info_caches0instance_uuid | uniq_instance_info_caches0instance_uuid | 110 | anon_1.instances_uuid | 1 | |
| 1 | PRIMARY | security_group_instance_association_1 | ref | security_group_id,security_group_instance_association_instance_uuid_idx | security_group_instance_association_instance_uuid_idx | 111 | anon_1.instances_uuid | 1 | |
| 1 | PRIMARY | security_groups_1 | eq_ref | PRIMARY | PRIMARY | 4 | nova.security_group_instance_association_1.security_group_id | 1 | |
| 2 | DERIVED | instances | ALL | instances_deleted_created_at_idx | instances_deleted_created_at_idx | 5 | | 1328990 | Using where; Using filesort |
+----+-------------+---------------------------------------+--------+-------------------------------------------------------------------------+-------------------------------------------------------+---------+--------------------------------------------------------------+---------+-----------------------------+
5 rows in set (0.28 sec)
Bringing down the query time significantly and now I have no more slow
queries at all.
tl;dr Very large derived query that was giving me 10+ second return
times in nova was reduced down to under 1 second with adding an index
to the instances table.
If someone could direct me to how to get this added, I'm happy to do
all the work, I just don't know what the process is for an index.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1553184/+subscriptions
References