
yahoo-eng-team team mailing list archive

[Bug 1943793] [NEW] nova-api logs are spammed with oslo.messaging errors

 

Public bug reported:

After updating from Ussuri to Victoria, the nova-api logs are spammed
with the following ERROR and INFO messages:

nova-api.log:
2021-09-16 05:34:23.402 19 ERROR oslo.messaging._drivers.impl_rabbit [-] [ac436422-607b-4ca5-941a-b70dbfe6be3d] AMQP server on xxx.xxx.xxx.xxx:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection

2021-09-16 05:35:24.696 22 INFO oslo.messaging._drivers.impl_rabbit [-]
A recoverable connection/channel error occurred, trying to reconnect:
[Errno 104] Connection reset by peer

rabbit@xxxxxxxxxxxxx:
2021-09-16 05:09:26.821 [error] <0.31945.6> closing AMQP connection <0.31945.6> (xxx.xxx.xxx.xxx:38108 -> xxx.xxx.xxx.xxx:5672 - mod_wsgi:22:835db11a-efbd-4005-ad89-3d7a891dfa27):
missed heartbeats from client, timeout: 60s

After restarting nova-api, the errors are gone for about 20 to 30
minutes. After that they become steadily more frequent, growing from
1-3 errors per 5 minutes to 50-60 errors per 5 minutes after one day.
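
For anyone who wants to quantify that growth on their own deployment, here is a minimal sketch that counts impl_rabbit ERROR lines per 5-minute window. It assumes the log line layout shown in the excerpts above (timestamp, PID, level, logger name); the function name errors_per_window is made up for illustration, and window_minutes is expected to divide 60 evenly.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# 2021-09-16 05:34:23.402 19 ERROR oslo.messaging._drivers.impl_rabbit [-] ...
LINE_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \d+ ERROR "
    r"oslo\.messaging\._drivers\.impl_rabbit "
)


def errors_per_window(lines, window_minutes=5):
    """Count impl_rabbit ERROR lines per time window (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # INFO lines, tracebacks and other loggers are ignored
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        # Round the timestamp down to the start of its window.
        bucket = ts.replace(minute=ts.minute - ts.minute % window_minutes,
                            second=0)
        counts[bucket] += 1
    return dict(sorted(counts.items()))
```

Typical use would be to pass an open log file, for example errors_per_window(open("nova-api.log")), and compare the counts shortly after a restart with the counts a day later.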

We tried the following settings, with no effect:
rpc_response_timeout = 180

and
heartbeat_rate = 4
heartbeat_timeout_threshold = 120
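
The report does not say which nova.conf sections these options were set in. oslo.messaging reads rpc_response_timeout from [DEFAULT] and the two heartbeat options from [oslo_messaging_rabbit], so a value placed in another section is silently ignored. The following is a rough sketch for checking that, assuming the config text can be parsed as plain INI; find_options and EXPECTED are illustrative names, not part of nova or oslo.config.

```python
import configparser

# Section each option is read from by oslo.messaging.
EXPECTED = {
    "rpc_response_timeout": "DEFAULT",
    "heartbeat_rate": "oslo_messaging_rabbit",
    "heartbeat_timeout_threshold": "oslo_messaging_rabbit",
}


def find_options(conf_text):
    """Return {option: [sections in which it is set]} (hypothetical helper)."""
    # Use a dummy default_section so that [DEFAULT] is treated as an
    # ordinary section and its values are not inherited by the others.
    parser = configparser.ConfigParser(
        default_section="__unused__",
        interpolation=None,
        strict=False,
        allow_no_value=True,
    )
    parser.read_string(conf_text)
    return {
        option: [s for s in parser.sections() if parser.has_option(s, option)]
        for option in EXPECTED
    }


def misplaced(conf_text):
    """List options that are set, but not in the section they are read from."""
    return [
        option
        for option, sections in find_options(conf_text).items()
        if sections and EXPECTED[option] not in sections
    ]
```

An empty list from misplaced() only means the options sit in the expected sections; it says nothing about whether the values themselves are suitable.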

Before upgrading to Victoria we had no errors.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1943793

Title:
  nova-api logs are spammed with oslo.messaging errors

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1943793/+subscriptions


