yahoo-eng-team team mailing list archive

[Bug 1715374] Re: Reloading compute with SIGHUP prevents instances from booting

 

Reviewed:  https://review.openstack.org/596275
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=d37c74d6382690a05ba3ca10edd7f2acb0fbbb2e
Submitter: Zuul
Branch:    master

commit d37c74d6382690a05ba3ca10edd7f2acb0fbbb2e
Author: Bogdan Dobrelya <bdobreli@xxxxxxxxxx>
Date:   Fri Aug 24 12:15:51 2018 +0200

    Fix postrotate to notify holders of rotated logs
    
    lsof +L1 locates unlinked and open files and does not work for
    logrotate, neither with copytruncate nor without that option.
    
    Instead, find the held *.X files (X being a number) and notify the
    processes owning them to take appropriate action and reopen new log
    files, so that they stop writing to the rotated files.
    
    The actions to be taken by such processes are:
    
    * For httpd processes, use USR1 to gracefully reload
    * For neutron-server, restart the container as it does not handle
      the HUP signal well (LP bug #1276694, LP bug #1780139).
    * For nova-compute, restart the container as it does not handle
      the HUP signal well (LP bug #1276694, LP bug #1715374).
    * For other processes, use HUP to reload
    
    This also fixes the filter to match log files ending with *err,
    such as the rabbitmq startup errors log.
    
    Closes-Bug: #1780139
    Closes-Bug: #1785659
    Closes-Bug: #1715374
    
    Change-Id: I5110426aa26e5fce7ebb4d80d8a2082cbf80519c
    Signed-off-by: Bogdan Dobrelya <bdobreli@xxxxxxxxxx>
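
The per-process actions listed in the commit message can be sketched as a small lookup. This is only an illustration in Python, not the actual fix, which is a logrotate postrotate script in puppet-tripleo. The names `ROTATED_LOG`, `is_rotated_log` and `action_for`, and the returned action strings, are hypothetical and chosen for this sketch.

```python
import re

# Hypothetical pattern for the "*.X (X - number)" files named in the
# commit message; the real postrotate script may match them differently.
ROTATED_LOG = re.compile(r"\.\d+$")


def is_rotated_log(path: str) -> bool:
    """Return True if the path looks like a rotated log file (*.X)."""
    return ROTATED_LOG.search(path) is not None


def action_for(process_name: str) -> str:
    """Map a process holding a rotated log to the action described above."""
    if process_name == "httpd":
        return "signal USR1"        # graceful reload
    if process_name in ("neutron-server", "nova-compute"):
        return "restart container"  # these do not handle HUP well
    return "signal HUP"             # default: ask the process to reload
```

The special cases are kept as an explicit allow-list so that any process not named in the commit message falls through to the default HUP behaviour.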


** Changed in: tripleo
       Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1715374

Title:
  Reloading compute with SIGHUP prevents instances from booting

Status in OpenStack Compute (nova):
  Confirmed
Status in tripleo:
  Fix Released

Bug description:
  Booting a new instance on a compute node where nova-compute has
  received SIGHUP (SIGHUP is used as a trigger for reloading mutable
  options) always fails.

      ========== nova/compute/manager.py ==============
      def cancel_all_events(self):
          if self._events is None:
              LOG.debug('Unexpected attempt to cancel events during shutdown.')
              return
          our_events = self._events
          # NOTE(danms): Block new events
          self._events = None                    <--- Set self._events to "None" 
      ...
      =================================================

    Once self._events has been set to None, every later call to
    prepare_for_instance_event() raises a NovaException. This is what
    makes network allocation fail.

      ========== nova/compute/manager.py ==============
      def prepare_for_instance_event(self, instance, event_name):
          ...
          if self._events is None:
              # NOTE(danms): We really should have a more specific error
              # here, but this is what we use for our default error case
              raise exception.NovaException('In shutdown, no new events '
                                            'can be scheduled')
      =================================================
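
The interaction between the two quoted methods can be reproduced with a stripped-down model. This is a minimal sketch, not nova's code: `InstanceEvents` and the local `NovaException` are stand-ins, the dictionary bookkeeping is invented for illustration, and it assumes (as the report implies) that cancel_all_events() runs when SIGHUP is handled and that nothing restores self._events afterwards.

```python
class NovaException(Exception):
    """Stand-in for nova.exception.NovaException."""


class InstanceEvents:
    """Simplified model of the event bookkeeping quoted above."""

    def __init__(self):
        self._events = {}

    def cancel_all_events(self):
        if self._events is None:
            return
        # Block new events; in this model nothing ever resets the value.
        self._events = None

    def prepare_for_instance_event(self, instance, event_name):
        if self._events is None:
            raise NovaException('In shutdown, no new events '
                                'can be scheduled')
        # Illustrative bookkeeping only; the real method differs.
        self._events.setdefault(instance, {})[event_name] = object()
        return self._events[instance][event_name]


events = InstanceEvents()
# Works before the reload (instance and event names are example values).
events.prepare_for_instance_event("instance-1", "network-vif-plugged")

events.cancel_all_events()  # assumed to happen while handling SIGHUP

try:
    events.prepare_for_instance_event("instance-2", "network-vif-plugged")
except NovaException as exc:
    print("boot would fail here:", exc)
```

In this model the process keeps running after the cancel, so every later boot request hits the shutdown guard even though no shutdown is in progress.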

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1715374/+subscriptions

