yahoo-eng-team team mailing list archive
Message #88287
[Bug 1961068] [NEW] nova-ceph-multistore job fails with mysqld got oom-killed
Public bug reported:
Searching through recent job runs shows that the nova-ceph-multistore job
fails from time to time with a DB crash caused by an out-of-memory error.
The tempest errors include the following message:
tempest.lib.exceptions.ServerFault: Got server fault
Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'oslo_db.exception.DBConnectionError'>
In the mysqld error log (controller/logs/mysql/error_log.txt), the crash
recovery is visible:
2022-02-15T19:26:40.245179Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
2022-02-15T19:26:40.268204Z 0 [System] [MY-010232] [Server] XA crash recovery finished.
Around the same time, syslog (controller/logs/syslog.txt) shows the
out-of-memory kill:
Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mysql.service,task=mysqld,pid=67959,uid=116
Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: Out of memory: Killed process 67959 (mysqld) total-vm:5127600kB, anon-rss:756064kB, file-rss:0kB, shmem-rss:0kB, UID:116 pgtables:2388kB oom_score_adj:0
Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom_reaper: reaped process 67959 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
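To check other job runs for the same symptom, the syslog can be scanned for OOM-kill records. The following is a minimal sketch, not part of the original report: the function name parse_oom_kills and the regular expression are illustrative, and they assume the kernel message keeps the "Out of memory: Killed process ..." layout shown in the excerpt above.

```python
import re

# Matches the kernel "Out of memory: Killed process ..." record as it appears
# in the syslog excerpt above. The exact layout can differ between kernel
# versions, so treat this pattern as an assumption.
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kills(lines):
    """Return one dict per OOM-kill record found in the given syslog lines."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append(
                {
                    "pid": int(match.group("pid")),
                    "name": match.group("name"),
                    "total_vm_kb": int(match.group("total_vm")),
                    "anon_rss_kb": int(match.group("anon_rss")),
                }
            )
    return kills


if __name__ == "__main__":
    # Path taken from the report; adjust to wherever the job logs were fetched.
    with open("controller/logs/syslog.txt", errors="replace") as syslog:
        for kill in parse_oom_kills(syslog):
            print(kill)
```

For the record quoted above, this would report mysqld (pid 67959) with an anon-rss of 756064 kB, roughly 738 MiB, at the time it was killed.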
The error occurs only in the nova-ceph-multistore job (see recent occurrences via logsearch: https://paste.opendev.org/show/bQNKfoaMafUyNFCyQ0kN/ ). It mostly happens on the current master branch (yoga), but an example was also found on wallaby: https://zuul.opendev.org/t/openstack/build/d8a6a9c1496346dda6986db00c06a616
** Affects: nova
Importance: High
Status: Confirmed
** Tags: gate-failure
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1961068
Title:
nova-ceph-multistore job fails with mysqld got oom-killed
Status in OpenStack Compute (nova):
Confirmed
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1961068/+subscriptions