
yahoo-eng-team team mailing list archive

[Bug 1379476] [NEW] Timeouts to keystone VIP, sporadic issues with keystone, may be caused by haproxy/corosync

 

Public bug reported:

The following situation was brought to my attention:
--------

ConnectionError: HTTPSConnectionPool(host='10.29.41.136', port=5000):
Max retries exceeded with url: /v2.0/tokens (Caused by <class
'httplib.BadStatusLine'>: '')

We had corosync running with all 3 nodes in it; then, after a little while, 2 of them suddenly died.
We are now running a one-node keystone cluster.

In the syslog we can see that keystone is being signaled to terminate at
the points where failed connections appear in the apache logs. Even
though 2 nodes in the cluster are physically down, it may help to remove
them from the cluster configuration on the last surviving node. By
turning the cluster into a one-node cluster, we make sure that corosync
no longer worries about those other nodes. The hope is that this will
prevent the keystone service from being taken down unexpectedly.

--------
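As an aside, the BadStatusLine in the report above is what httplib raises when the server (here, haproxy or keystone behind the VIP) closes the connection without sending a status line. A client can paper over short VIP flaps by retrying; a minimal sketch, using only the standard library. The `connect` factory is an invented parameter for illustration — in real use it would be something like `lambda: http.client.HTTPSConnection("10.29.41.136", 5000, timeout=10)`.

```python
import http.client
import time

def request_with_retry(connect, method, path, body=None, headers=None,
                       retries=5, backoff=0.5):
    """Issue a request, retrying on the failures seen behind a flapping VIP:
    an empty status line (BadStatusLine) or a refused/reset connection."""
    last_error = None
    for attempt in range(retries):
        try:
            conn = connect()  # fresh connection each attempt
            conn.request(method, path, body=body, headers=headers or {})
            return conn.getresponse()
        except (http.client.BadStatusLine, ConnectionError, OSError) as exc:
            last_error = exc
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise last_error
```

This only masks the symptom on the client side, of course; it does not address why corosync is taking the keystone service down.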

What is the deployment recommendation for configuring keystone together
with pacemaker + corosync?
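For context on the workaround described above: a lone surviving node also loses quorum unless the vote expectations are adjusted. A hypothetical corosync.conf (corosync 2.x votequorum) fragment for the one-node case — the node address and id are placeholders, not values from the report:

```
quorum {
    provider: corosync_votequorum
    expected_votes: 1
}

nodelist {
    node {
        ring0_addr: 10.0.0.11   # placeholder: address of the surviving node
        nodeid: 1
    }
}
```

Whether this is the recommended shape for a keystone deployment is exactly the question being asked here.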

Right now users may be running dhclient on top of a bridge interface
used as the cluster interconnect, for example. Is this a supported
configuration? Are there any known upstream problems with using
DHCP-configured interfaces as cluster interconnects?
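For comparison, a static configuration for the interconnect would avoid the risk of dhclient renewing to a different address while corosync is bound to the old one. A hypothetical /etc/network/interfaces fragment — the interface names and addresses are invented for illustration:

```
auto br0
iface br0 inet static
    address 192.168.100.11     # fixed interconnect address
    netmask 255.255.255.0
    bridge_ports eth1          # physical interconnect NIC
```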

** Affects: keystone
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1379476
