
yahoo-eng-team team mailing list archive

[Bug 1936574] Re: nova-compute SSL connections make rabbitmq pods OOM

 

** Also affects: rabbitmq
   Importance: Undecided
       Status: New

** Also affects: nova
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1936574

Title:
  nova-compute SSL connections make rabbitmq pods OOM

Status in OpenStack Compute (nova):
  New
Status in oslo.messaging:
  New
Status in RabbitMQ:
  New

Bug description:
  We have a Rocky OpenStack deployment with 3 controllers and 500
  compute nodes. At 15:58, nova-compute detected that its RabbitMQ
  connection was broken and then reconnected.

  2021-07-05 15:58:28.633 8 ERROR oslo.messaging._drivers.impl_rabbit [req-a09d4a8b-c24b-4b30-b433-64fe4f6bace5 - - - - -] [8ed1f425-ad67-4b98-874c-e4516aaf3134] AMQP server on 145.247.103.16:5671 is unreachable: . Trying again in 1 seconds.: timeout
  2021-07-05 15:58:29.656 8 INFO oslo.messaging._drivers.impl_rabbit [req-a09d4a8b-c24b-4b30-b433-64fe4f6bace5 - - - - -] [8ed1f425-ad67-4b98-874c-e4516aaf3134] Reconnected to AMQP server on 145.247.103.16:5671 via [amqp] client with port 28205.

  RabbitMQ then reported that a large number of connections had been
  closed by clients.

  =WARNING REPORT==== 5-Jul-2021::15:57:59 ===
  closing AMQP connection <0.6345.754> (20.16.36.44:2451 -> 145.247.103.14:5671 - nova-compute:8:b4ce7b09-b9b5-4db1-983b-a071dc031c64, vhost: '/', user: 'openstack'):
  client unexpectedly closed TCP connection
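
  To quantify how many connections were dropped and from which compute
  hosts, entries like the one above can be tallied per client address.
  The following is a minimal sketch, assuming the RabbitMQ log keeps the
  two-line layout shown in this report; the function name and the
  regular expression are illustrative, not part of any RabbitMQ tooling.

```python
import re
from collections import Counter

# Matches a "closing AMQP connection" entry whose following line says the
# client closed the TCP connection. Layout assumed from the excerpt above.
CLOSED_BY_CLIENT = re.compile(
    r"closing AMQP connection <[^>]+> "
    r"\((?P<client>[\d.]+):\d+ -> [\d.]+:\d+[^\n]*\n"
    r"\s*client unexpectedly closed TCP connection"
)


def count_client_closes(log_text):
    """Return a Counter of client IP -> number of client-side closes."""
    return Counter(m.group("client") for m in CLOSED_BY_CLIENT.finditer(log_text))


sample = """=WARNING REPORT==== 5-Jul-2021::15:57:59 ===
closing AMQP connection <0.6345.754> (20.16.36.44:2451 -> 145.247.103.14:5671 - nova-compute:8:b4ce7b09-b9b5-4db1-983b-a071dc031c64, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
"""

counts = count_client_closes(sample)
# Show the clients with the most dropped connections first.
print(counts.most_common(10))
```

  Run against the full broker log, the per-host totals would show
  whether the closes are spread across all 500 computes or concentrated
  on a few.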

  After 10 minutes, the cluster was blocked by the memory alarm
  (vm_memory_high_watermark of 0.4).

  =INFO REPORT==== 5-Jul-2021::16:19:29 ===
  vm_memory_high_watermark set. Memory used:111358541824 allowed:107949065830 

  **********************************************************
  *** Publishers will be blocked until this alarm clears ***
  **********************************************************  
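
  The figures in the alarm line can be checked with a little arithmetic.
  This is a sketch only, assuming the 0.4 watermark is a fraction of the
  total memory RabbitMQ detects, so that allowed = 0.4 * total; the
  variable names are illustrative. Under that assumption the node has
  roughly 251 GiB of memory and the broker was already a few GiB over
  its limit when the alarm fired.

```python
# Values copied from the vm_memory_high_watermark log line above.
WATERMARK = 0.4              # relative watermark stated in the report
used = 111358541824          # "Memory used", in bytes
allowed = 107949065830       # "allowed", in bytes

GIB = 1024 ** 3

# Assumption: allowed = WATERMARK * total detected memory.
total = allowed / WATERMARK
overshoot = used - allowed

print(f"implied total memory : {total / GIB:.1f} GiB")
print(f"allowed by watermark : {allowed / GIB:.1f} GiB")
print(f"used when alarm fired: {used / GIB:.1f} GiB")
print(f"over the limit by    : {overshoot / GIB:.2f} GiB")
print(f"used / total         : {used / total:.3f}")
```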

  However, after the publishers were blocked, the RabbitMQ pod
  continued to leak memory. In the end the node ran out of memory (OOM)
  and the system forced the pod to restart.


  amqp release: 2.5.2
  oslo-messaging release: 8.1.4
  openstack: Rocky

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1936574/+subscriptions