yahoo-eng-team team mailing list archive

[Bug 1276694] Re: Openstack services should support SIGHUP signal

 

** Also affects: sahara
   Importance: Undecided
       Status: New

** Changed in: sahara
       Status: New => Confirmed

** Changed in: sahara
     Assignee: (unassigned) => Andrew Lazarev (alazarev)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1276694

Title:
  Openstack services should support SIGHUP signal

Status in OpenStack Image Registry and Delivery Service (Glance):
  Confirmed
Status in Orchestration API (Heat):
  Fix Released
Status in OpenStack Identity (Keystone):
  Confirmed
Status in OpenStack Compute (Nova):
  Confirmed
Status in The Oslo library incubator:
  Invalid
Status in OpenStack Data Processing (Sahara, ex. Savanna):
  Confirmed

Bug description:
  1) To manage unlinked but still open log file descriptors (lsof +L1)
  without restarting the services, every OpenStack service should accept
  the SIGHUP signal.

  That would allow, e.g., logrotate jobs to gracefully HUP services after
  their log files have been rotated. For now the only option is to force
  a service restart, which is a poor choice from the point of view of
  continuous service availability.

  Note: according to http://en.wikipedia.org/wiki/Unix_signal
  SIGHUP
     ... Many daemons will reload their configuration files and reopen their logfiles instead of exiting when receiving this signal.
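
  As a minimal sketch of the behaviour quoted above, assuming a plain
  Python daemon that uses the standard logging module, a SIGHUP handler
  could reopen the log file so that a rotated or unlinked descriptor is
  released. This is not the Oslo implementation; install_sighup_reopen
  and its arguments are hypothetical names chosen for illustration.

```python
import logging
import signal


def install_sighup_reopen(logger, log_path):
    """Log to log_path and reopen that path whenever SIGHUP is received."""
    state = {"handler": logging.FileHandler(log_path)}
    logger.addHandler(state["handler"])

    def _reopen(signum, frame):
        # Close the old descriptor (it may point at a rotated or unlinked
        # file) and open the configured path again.
        old = state["handler"]
        logger.removeHandler(old)
        old.close()
        state["handler"] = logging.FileHandler(log_path)
        logger.addHandler(state["handler"])

    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGHUP, _reopen)
    return state
```

  With such a handler in place, a logrotate job could rename the log
  file and then send kill -HUP to the process, which keeps running under
  the same PID. The standard library's
  logging.handlers.WatchedFileHandler is an alternative that notices the
  rotation without any signal.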

  Currently Murano and Glance are out of sync with Oslo SIGHUP support.

  The following issues also exist for some services of OpenStack projects that already have synced SIGHUP support:
  2)
  heat-api-cfn, heat-api, heat-api-cloudwatch, keystone: it looks like the synced code is never executed, so SIGHUP is not supported for them. Here is a simple test scenario:
  2.1) modify <python-path>/site-packages/<foo-service-name>/openstack/common/service.py
  def _sighup_supported():
  +    LOG.warning("SIGHUP is supported: {0}".format(hasattr(signal, 'SIGHUP')))
      return hasattr(signal, 'SIGHUP')
  2.2) Restart the service foo-service-name and check the logs for "SIGHUP is supported". If the service really supports it, that message will be present in the logs.
  2.3) Issue kill -HUP <foo-service-pid> and check the logs for "SIGHUP is supported" and "Caught SIGHUP". If the service really supports it, both messages will be present in the logs. In addition, the service should keep running and the PID of its main thread should not change.
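
  The PID part of step 2.3 could be checked with a small script. The
  following is only a sketch under stated assumptions: read_pid,
  is_running and survives_sighup are hypothetical helper names, the pid
  file path is whatever the service actually uses, and the log messages
  still have to be checked separately.

```python
import os
import signal
import time


def read_pid(pidfile):
    """Return the PID stored in a pid file."""
    with open(pidfile) as f:
        return int(f.read().strip())


def is_running(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks the PID
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def survives_sighup(pid, grace=1.0):
    """Send SIGHUP to pid and report whether the same PID is alive afterwards.

    Note: an unreaped (zombie) child of the caller still counts as existing.
    """
    os.kill(pid, signal.SIGHUP)
    time.sleep(grace)
    return is_running(pid)
```

  For example, survives_sighup(read_pid("/var/run/heat/openstack-heat-engine.pid"))
  would be expected to return True for a service that handles HUP
  correctly, and False for one that dies as in case 2.d.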

  e.g.
  2.a) heat-engine supports HUPing:
  #service openstack-heat-engine restart
  <132>Apr 11 14:03:48 node-3 heat-heat.openstack.common.service WARNING: SIGHUP is supported: True

  2.b) But heat-api doesn't know how to HUP:
  #service openstack-heat-api restart
  <134>Apr 11 14:06:22 node-3 heat-heat.api INFO: Starting Heat ReST API on 0.0.0.0:8004
  <134>Apr 11 14:06:22 node-3 heat-eventlet.wsgi.server INFO: Starting single process server

  2.c) HUPing heat-engine is OK
  #pid=$(cat /var/run/heat/openstack-heat-engine.pid); kill -HUP $pid && echo $pid
  16512
  <134>Apr 11 14:12:15 node-3 heat-heat.openstack.common.service INFO: Caught SIGHUP, exiting
  <132>Apr 11 14:12:15 node-3 heat-heat.openstack.common.service WARNING: SIGHUP is supported: True
  <134>Apr 11 14:12:15 node-3 heat-heat.openstack.common.rpc.common INFO: Connected to AMQP server on ...
  #service openstack-heat-engine status
  openstack-heat-engine (pid  16512) is running...

  2.d) HUPed heat-api is dead now ;(
  #kill -HUP $(cat /var/run/heat/openstack-heat-api.pid)
  (no new logs)
  # service openstack-heat-api status
  openstack-heat-api dead but pid file exists

  3)
  nova-cert, nova-novncproxy, nova-objectstore, nova-consoleauth, nova-scheduler: unlike case 2, after kill -HUP <foo-service-pid> is issued, a "Caught SIGHUP" message appears in the logs, BUT the associated service dies anyway. Instead, the service should keep running and the PID of its main thread should not change (as in case 2.c).
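
  To illustrate the expected behaviour (catch SIGHUP, reload, keep the
  same PID), here is a hypothetical main loop. It is a sketch only and
  does not reproduce the Oslo service code; the class name
  ReloadableService and its methods are invented for this example.

```python
import signal
import time


class ReloadableService:
    """Hypothetical service loop that treats SIGHUP as 'reload', not 'exit'."""

    def __init__(self):
        self.reload_count = 0
        self._reload_requested = False
        self._stop_requested = False

    def install_signal_handlers(self):
        # Must be called from the main thread.
        signal.signal(signal.SIGHUP, self._on_sighup)
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sighup(self, signum, frame):
        # Only record the request; the main loop does the actual work.
        self._reload_requested = True

    def _on_sigterm(self, signum, frame):
        self._stop_requested = True

    def reload(self):
        # Placeholder: re-read configuration and reopen log files here.
        self.reload_count += 1

    def run(self, poll_interval=0.05):
        # The process, and therefore its PID, stays the same across reloads.
        while not self._stop_requested:
            if self._reload_requested:
                self._reload_requested = False
                self.reload()
            time.sleep(poll_interval)
```

  In this sketch only SIGTERM ends the loop, so a kill -HUP leaves the
  process running under its original PID.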

  So it looks like a lot still needs to be done to ensure POSIX
  standards compliance in OpenStack :-)

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1276694/+subscriptions