yahoo-eng-team team mailing list archive

[Bug 1358258] [NEW] Should check if haproxy process alive when collecting state

 

Public bug reported:

The LB agent collects pool statistics in a periodic task,
"collect_stats". When a haproxy process crashes, driver.get_stats
still returns a dictionary like:

{'members': {}}

This dictionary is non-empty, so the "if stats" check passes and
self.plugin_rpc.update_pool_stats is executed without raising an
exception. As a result, needs_resync is never set and the crashed
haproxy process won't be restarted.

def collect_stats(self, context):
    for pool_id, driver_name in self.instance_mapping.items():
        driver = self.device_drivers[driver_name]
        try:
            stats = driver.get_stats(pool_id)
            if stats:
                self.plugin_rpc.update_pool_stats(pool_id, stats)
        except Exception:
            LOG.exception(_('Error updating statistics on pool %s'), pool_id)
            self.needs_resync = True

I think we need to check whether the haproxy process is alive in
"collect_stats", and set needs_resync to True when a haproxy process
has crashed, so that the resync restarts it.
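
A minimal sketch of what that check could look like, not a tested
patch. It assumes the driver exposes a liveness check, called
exists(pool_id) here; FakeDriver, FakePluginRpc and AgentSketch are
hypothetical stand-ins so the example runs on its own, and the real
agent and driver classes may differ.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeDriver:
    """Hypothetical stand-in for a device driver (not the real one)."""

    def __init__(self, alive_pools):
        self.alive_pools = set(alive_pools)

    def exists(self, pool_id):
        # Assumed liveness check: True if the pool's haproxy is running.
        return pool_id in self.alive_pools

    def get_stats(self, pool_id):
        # As in the report: a truthy dict even when the process is gone.
        return {'members': {}}


class FakePluginRpc:
    """Hypothetical stand-in that records the stats updates it receives."""

    def __init__(self):
        self.updates = []

    def update_pool_stats(self, pool_id, stats):
        self.updates.append((pool_id, stats))


class AgentSketch:
    """Only the parts of the agent that collect_stats touches."""

    def __init__(self, device_drivers, instance_mapping, plugin_rpc):
        self.device_drivers = device_drivers
        self.instance_mapping = instance_mapping
        self.plugin_rpc = plugin_rpc
        self.needs_resync = False

    def collect_stats(self, context):
        for pool_id, driver_name in self.instance_mapping.items():
            driver = self.device_drivers[driver_name]
            try:
                if not driver.exists(pool_id):
                    # Process is gone: request a resync, skip the update.
                    LOG.warning('haproxy for pool %s is not running',
                                pool_id)
                    self.needs_resync = True
                    continue
                stats = driver.get_stats(pool_id)
                if stats:
                    self.plugin_rpc.update_pool_stats(pool_id, stats)
            except Exception:
                LOG.exception('Error updating statistics on pool %s',
                              pool_id)
                self.needs_resync = True
```

With one live and one crashed pool, only the live pool's stats are
reported and needs_resync ends up True, which is the signal the
existing exception path already uses to request a resync.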

** Affects: neutron
     Importance: Undecided
     Assignee: Zhiyuan Cai (luckyvega-g)
         Status: New

** Changed in: neutron
     Assignee: (unassigned) => Zhiyuan Cai (luckyvega-g)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1358258

Title:
  Should check if haproxy process alive when collecting state

Status in OpenStack Neutron (virtual network service):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1358258/+subscriptions

