Message #86660
[Bug 1936574] Re: nova-compute SSL connections make rabbitmq pods OOM
** Also affects: rabbitmq
Importance: Undecided
Status: New
** Also affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1936574
Title:
nova-compute SSL connections make rabbitmq pods OOM
Status in OpenStack Compute (nova):
New
Status in oslo.messaging:
New
Status in RabbitMQ:
New
Bug description:
We have a Rocky OpenStack deployment with 3 controllers and 500
compute nodes. At 15:58, nova-compute detected that its RabbitMQ
connection was broken, then reconnected.
2021-07-05 15:58:28.633 8 ERROR oslo.messaging._drivers.impl_rabbit [req-a09d4a8b-c24b-4b30-b433-64fe4f6bace5 - - - - -] [8ed1f425-ad67-4b98-874c-e4516aaf3134] AMQP server on 145.247.103.16:5671 is unreachable: . Trying again in 1 seconds.: timeout
2021-07-05 15:58:29.656 8 INFO oslo.messaging._drivers.impl_rabbit [req-a09d4a8b-c24b-4b30-b433-64fe4f6bace5 - - - - -] [8ed1f425-ad67-4b98-874c-e4516aaf3134] Reconnected to AMQP server on 145.247.103.16:5671 via [amqp] client with port 28205.
RabbitMQ then reported that a huge number of connections had been closed by clients.
=WARNING REPORT==== 5-Jul-2021::15:57:59 ===
closing AMQP connection <0.6345.754> (20.16.36.44:2451 -> 145.247.103.14:5671 - nova-compute:8:b4ce7b09-b9b5-4db1-983b-a071dc031c64, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
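To put a number on "a huge number of connections", the RabbitMQ log can be scanned for these close events. The following is only a minimal sketch: it assumes the log keeps the two-line layout shown in the excerpt above (a "closing AMQP connection" line followed by the reason line), and the function name count_closed_by_client is hypothetical.

```python
import re
from collections import Counter

# Matches the "closing AMQP connection" header line as it appears in the
# excerpt above; other RabbitMQ versions may format this line differently.
CLOSE_RE = re.compile(
    r"closing AMQP connection <[^>]+> "
    r"\((?P<client_ip>[\d.]+):(?P<client_port>\d+) -> "
    r"(?P<server_ip>[\d.]+):(?P<server_port>\d+)"
)

REASON = "client unexpectedly closed TCP connection"


def count_closed_by_client(lines):
    """Count unexpected client-side closes per client IP (hypothetical helper)."""
    counts = Counter()
    pending_ip = None  # client IP from the most recent header line, if any
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            pending_ip = match.group("client_ip")
            continue
        # Only count when the reason line directly follows a header line.
        if pending_ip and REASON in line:
            counts[pending_ip] += 1
        pending_ip = None
    return counts


sample = [
    "=WARNING REPORT==== 5-Jul-2021::15:57:59 ===",
    "closing AMQP connection <0.6345.754> (20.16.36.44:2451 -> "
    "145.247.103.14:5671 - nova-compute:8:b4ce7b09-b9b5-4db1-983b-a071dc031c64, "
    "vhost: '/', user: 'openstack'):",
    "client unexpectedly closed TCP connection",
]
print(count_closed_by_client(sample))
```

Grouping by client IP shows whether the closes come from all compute nodes at once or from a few hosts.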
After 10 minutes, the cluster was blocked at the 0.4 memory watermark.
=INFO REPORT==== 5-Jul-2021::16:19:29 ===
vm_memory_high_watermark set. Memory used:111358541824 allowed:107949065830
**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
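The figures in this alarm can be checked with simple arithmetic. The sketch below assumes the 0.4 watermark is a relative one (a fraction of the memory RabbitMQ detects on the node), which is how the report describes it; the variable names are illustrative only.

```python
# Values copied from the vm_memory_high_watermark log line above.
used_bytes = 111358541824
allowed_bytes = 107949065830

# Assumption: the watermark is relative, i.e. allowed = 0.4 * total memory.
WATERMARK_FRACTION = 0.4

GIB = 1024 ** 3

# Total memory implied by the allowed value under that assumption.
implied_total_bytes = allowed_bytes / WATERMARK_FRACTION

# How far memory use was above the limit when the alarm fired.
excess_bytes = used_bytes - allowed_bytes

# Memory use as a fraction of the implied total.
used_fraction = used_bytes / implied_total_bytes

print(f"implied total memory: {implied_total_bytes / GIB:.1f} GiB")
print(f"used above the limit: {excess_bytes / GIB:.2f} GiB")
print(f"used fraction of total: {used_fraction:.3f}")
```

Under that assumption the node has roughly 251 GiB of memory, and usage was already about 3 GiB over the limit when the alarm was logged.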
However, after the publishers were blocked, the RabbitMQ pod kept
leaking memory. In the end the node ran out of memory (OOM) and the
system forced the pod to restart.
amqp release: 2.5.2
oslo-messaging release: 8.1.4
openstack: Rocky
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1936574/+subscriptions