yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #93134
[Bug 2044703] [NEW] Instance evacuation is stuck and completes after a long time
Public bug reported:
Stack:
3 node cluster on CentOS 8 Stream, Libvirt+KVM
Openstack Xena.
Nova packages version - 24.1.1-1
SAN based Shared storage connected through ISCSI
MariaDB Galera Cluster configured in Active-Backup mode on haproxy.
We are observing an issue with instance evacuation when a host with active database is powered off.
Evacuation task is triggered but it is taking more than 15 minutes. At the same time, we see placement and other Openstack services report 504 gateway timeout errors when talking with keystone. We saw that the major part of time is spent in nova scheduler from the following log :
Request filter 'map_az_to_placement_aggregate' took 972.3 seconds
wrapper /usr/lib/python3.6/site-
packages/nova/scheduler/request_filter.py:47
Sometimes rebuild task is stuck forever until we restart nova services and force VM to go to an error state.
Please let me know if any other configurations or logs are required to understand this issue.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2044703
Title:
Instance evacuation is stuck and completes after a long time
Status in OpenStack Compute (nova):
New
Bug description:
Stack:
3 node cluster on CentOS 8 Stream, Libvirt+KVM
Openstack Xena.
Nova packages version - 24.1.1-1
SAN based Shared storage connected through ISCSI
MariaDB Galera Cluster configured in Active-Backup mode on haproxy.
We are observing an issue with instance evacuation when a host with active database is powered off.
Evacuation task is triggered but it is taking more than 15 minutes. At the same time, we see placement and other Openstack services report 504 gateway timeout errors when talking with keystone. We saw that the major part of time is spent in nova scheduler from the following log :
Request filter 'map_az_to_placement_aggregate' took 972.3 seconds
wrapper /usr/lib/python3.6/site-
packages/nova/scheduler/request_filter.py:47
Sometimes rebuild task is stuck forever until we restart nova services and force VM to go to an error state.
Please let me know if any other configurations or logs are required to understand this issue.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2044703/+subscriptions