yahoo-eng-team team mailing list archive

[Bug 1332058] Re: keystone behavior when one memcache backend is down

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Bogdan Dobrelya <bdobrelia@xxxxxxxxxxxx>
Date: Tue, 05 Aug 2014 09:52:36 -0000
Reply-to: Bug 1332058 <1332058@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Related bug in MOS https://bugs.launchpad.net/fuel/+bug/1340657

** Also affects: mos
   Importance: Undecided
       Status: New

** Changed in: mos
    Milestone: None => 5.1

** Changed in: mos
   Importance: Undecided => High

** Changed in: mos
     Assignee: (unassigned) => MOS Keystone (mos-keystone)

** Changed in: mos
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1332058

Title:
  keystone behavior when one memcache backend is down

Status in OpenStack Identity (Keystone):
  Confirmed
Status in Mirantis OpenStack:
  Confirmed

Bug description:
  Hi,

  Our implementation uses dogpile.cache.memcached as a backend for
  tokens. Recently I found interesting behavior when one of the
  memcache regions went down: there is a 3-6 second delay when I try to
  get a token. With 2 backends the delay is 6-12 seconds. This is
  very easy to test.

  Test connection using

  for i in {1..20}; do (time keystone token-get >> log2) 2>&1 | grep real | awk '{print $2}'; done

  Block one memcache backend (this simulates a power outage of the
  node) using

  iptables -I INPUT -p tcp --dport 11211 -j DROP

  Test the speed using

  for i in {1..20}; do (time keystone token-get >> log2) 2>&1 | grep real | awk '{print $2}'; done

  I also straced the keystone process with

  strace -tt -s 512 -o /root/log1 -f -p PID

  and got

  26872 connect(9, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("10.108.2.3")}, 16) = -1 EINPROGRESS (Operation now in progress)

  even though this IP is down.

  Also I checked the code

  https://github.com/openstack/keystone/blob/master/keystone/common/kvs/core.py#L210-L237
  https://github.com/openstack/keystone/blob/master/keystone/common/kvs/core.py#L285-L289
  https://github.com/openstack/keystone/blob/master/keystone/common/kvs/backends/memcached.py#L96

  and was not able to find any details on how keystone treats a
  backend when it is down.

  There should be logic that temporarily blocks a backend when it is
  not accessible. After a timeout period, the backend should be probed
  (without blocking get/set operations on the remaining backends), and
  if the connection succeeds it should be put back into operation. Here
  is a sample of how it could be implemented:

  http://dogpilecache.readthedocs.org/en/latest/usage.html#changing-backend-behavior
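
  The blocking and re-probing logic requested above could look roughly
  like the sketch below. It is plain Python and only an illustration:
  FailoverCache, BackendDown, ping() and retry_after are hypothetical
  names, not part of keystone or dogpile.cache, and the sketch assumes
  each backend object raises BackendDown when its server is
  unreachable. There is no locking, so a real version would need to
  guard the shared state.

```python
import time


class BackendDown(Exception):
    """Hypothetical error a backend raises when its server is unreachable."""


class FailoverCache:
    """Skip backends marked dead; re-admit them only after a successful probe."""

    def __init__(self, backends, retry_after=30.0, clock=time.monotonic):
        self._backends = list(backends)
        self._retry_after = retry_after
        self._clock = clock
        self._dead_since = {}  # backend -> time it was last seen failing

    def get(self, key):
        # Only backends currently believed alive are contacted, so a dead
        # node costs one failed request instead of a timeout on every call.
        for backend in self._backends:
            if backend in self._dead_since:
                continue
            try:
                return backend.get(key)
            except BackendDown:
                self._dead_since[backend] = self._clock()
        return None  # every backend is down: treat as a cache miss

    def probe_dead(self):
        # Meant to run from a timer or background thread so that probing
        # never delays get() calls served by the healthy backends.
        now = self._clock()
        for backend, since in list(self._dead_since.items()):
            if now - since < self._retry_after:
                continue  # still inside the block period
            try:
                backend.ping()
            except BackendDown:
                self._dead_since[backend] = now  # restart the block period
            else:
                del self._dead_since[backend]  # back in operation
```

  Keeping the probe in a separate method is the design choice that
  matters here: get() never waits on a node that is already known to
  be down.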

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1332058/+subscriptions

References

[Bug 1332058] [NEW] keystone behavior when one memcache backend is down
From: Sergii Golovatiuk, 2014-06-19