yahoo-eng-team team mailing list archive

[Bug 1809823] Re: Neutron_api (unhealthy) after few days

 

Closed due to inactivity. Please feel free to reopen if needed.

** Changed in: neutron
       Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1809823

Title:
  Neutron_api (unhealthy) after few days

Status in neutron:
  Won't Fix
Status in oslo.service:
  Confirmed

Bug description:
  Description
  ===========
  On the undercloud (we are fairly sure we have also seen it on the overcloud; I'll update once confirmed).
  Without any action on our part, the neutron_api service enters the "unhealthy" state and stops functioning.
  The log shows:
  2018-12-26 00:00:35.774 7 INFO oslo_service.service [-] Caught SIGHUP, stopping children
  2018-12-26 00:00:36.077 40997 ERROR oslo_service.service [-] Error starting thread.: RuntimeError: A fixed interval looping call can only run one function at a time

  OpenStack commands that need Neutron fail (e.g. openstack server
  list).

  Restarting the neutron_api Docker container resolves the problem.

  
  Steps to reproduce
  ==================
  Deploy. 
  Wait 4 days. 

  Expected result
  ===============
  The service should remain healthy.

  Actual result
  =============
  The service becomes unhealthy.

  Environment
  ===========
  Rocky, container-based.

  
  Logs & Configs
  ==============

  Logs : http://paste.openstack.org/show/738658/

  
  More info: 
  ==========
  A search turned up this related bug:
  https://bugs.launchpad.net/oslo.service/+bug/1547029
  followed by this paste:
  http://paste.openstack.org/show/487420/

  It seems that adding "eventlet.sleep(0)" at the <<<HERE>>> marker below
  might resolve the issue:

      def run_service(service, done):
          """Service start wrapper.

          :param service: service to run
          :param done: event to wait on until a shutdown is triggered
          :returns: None

          """
          try:
              # <<<<< HERE >>>>> (proposed: eventlet.sleep(0))
              service.start()
          except Exception:
              LOG.exception('Error starting thread.')
              raise SystemExit(1)
          else:
              done.wait()
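
  For concreteness, the wrapper with the proposed call inserted at the
  marker could look like the sketch below. This is an untested idea, not a
  confirmed fix. The idea, still unconfirmed, is that yielding once lets
  other green threads run before the service is started. The fallback to
  time.sleep and the _sleep alias are assumptions added only so the sketch
  runs where eventlet is not installed.

```python
import logging
import time

try:
    # Under eventlet, sleep(0) yields control to other green threads.
    from eventlet import sleep as _sleep
except ImportError:
    # Assumption for this sketch only: fall back to time.sleep so the
    # example still runs without eventlet installed.
    _sleep = time.sleep

LOG = logging.getLogger(__name__)


def run_service(service, done):
    """Service start wrapper with the proposed cooperative yield.

    :param service: service to run
    :param done: event to wait on until a shutdown is triggered
    :returns: None
    """
    try:
        # Proposed change (unconfirmed): yield once before starting, so
        # that any pending green threads get a chance to run first.
        _sleep(0)
        service.start()
    except Exception:
        LOG.exception('Error starting thread.')
        raise SystemExit(1)
    else:
        done.wait()
```

  Apart from the added _sleep(0) line, the body is the same as the wrapper
  quoted above.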

  
  The problem is that I have not found an easy way to reproduce the issue in order to confirm this.

  Any suggestions?
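
  Since the log shows the error right after "Caught SIGHUP", one avenue
  that might be worth trying is to send several SIGHUP signals in quick
  succession to the parent service process instead of waiting days.
  Whether this actually triggers the error is unverified. The helper name
  send_sighup_burst and its parameters are hypothetical, and the target
  PID has to be looked up separately.

```python
import os
import signal
import time


def send_sighup_burst(pid, count=5, interval=0.05):
    """Send `count` SIGHUP signals to `pid`, `interval` seconds apart.

    Hypothetical reproduction helper; returns the number of signals sent.
    """
    sent = 0
    for _ in range(count):
        os.kill(pid, signal.SIGHUP)  # raises ProcessLookupError if pid is gone
        sent += 1
        time.sleep(interval)
    return sent
```

  Run inside the container against the service's parent PID, then check
  the log for the "Error starting thread." line. Varying count and
  interval may matter if the failure depends on timing.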

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1809823/+subscriptions