yahoo-eng-team team mailing list archive

[Bug 1332058] [NEW] keystone behavior when one memcache backend is down

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Sergii Golovatiuk <sgolovatiuk@xxxxxxxxxxxx>
Date: Thu, 19 Jun 2014 12:19:47 -0000
Reply-to: Bug 1332058 <1332058@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

Hi,

Our implementation uses dogpile.cache.memcached as a backend for tokens.
Recently, I found interesting behavior when one of the memcache regions
went down: there is a 3-6 second delay when I try to get a token. If I
have 2 backends, the delay is 6-12 seconds. It is very easy to test.

Test connection using

for i in {1..20}; do (time keystone token-get >> log2) 2>&1 | grep real | awk '{print $2}'; done

Block one memcache backend (simulating a power outage of the node) using

iptables -I INPUT -p tcp --dport 11211 -j DROP

Test the speed using

for i in {1..20}; do (time keystone token-get >> log2) 2>&1 | grep real | awk '{print $2}'; done

I also straced the keystone process with

strace -tt -s 512 -o /root/log1 -f -p PID

and got

26872 connect(9, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("10.108.2.3")}, 16) = -1 EINPROGRESS (Operation now in progress)

even though this IP is down.
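
The stalled connect can be reproduced outside keystone with a bounded TCP
probe. The following is only a minimal sketch, not keystone or memcache
client code: the function name probe and the 3-second default timeout are
assumptions chosen for illustration. Against a host whose packets are
silently dropped, the call should block for roughly the timeout before
giving up, whereas a closed port on a live host is refused immediately.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Try one TCP connect; return (reachable, seconds_elapsed).

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    start = time.monotonic()
    try:
        # create_connection applies the timeout to the connect attempt itself.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers both timeouts (packets dropped) and refusals (port closed).
        return False, time.monotonic() - start
    sock.close()
    return True, time.monotonic() - start
```

Calling probe("10.108.2.3", 11211) while the iptables rule is in place
should show how long a single connection attempt to the blocked backend
takes.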

I also checked the code

https://github.com/openstack/keystone/blob/master/keystone/common/kvs/core.py#L210-L237
https://github.com/openstack/keystone/blob/master/keystone/common/kvs/core.py#L285-L289
https://github.com/openstack/keystone/blob/master/keystone/common/kvs/backends/memcached.py#L96

and was not able to find any details on how keystone treats a backend
when it is down.

There should be logic that temporarily blocks a backend when it is not
accessible. After a timeout period, the backend should be probed (without
blocking get/set operations on the remaining backends), and if the
connection succeeds it should be put back into operation. Here is an
example of how it could be implemented:

http://dogpilecache.readthedocs.org/en/latest/usage.html#changing-backend-behavior
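
The blocking logic proposed above can be sketched as a small wrapper around
a backend. This is a rough illustration under stated assumptions, not
keystone or dogpile.cache code: the class name DeadBackendGuard, the
dead_retry cool-down of 30 seconds, and the wrapped object's get()/set()
methods raising OSError on connection failure are all hypothetical. It also
simplifies the proposal: the first request after the cool-down acts as the
probe and may still block once, so a background probe would be needed to
avoid that completely.

```python
import time


class DeadBackendGuard:
    """Skip a failing backend for `dead_retry` seconds, then try it again."""

    def __init__(self, backend, dead_retry=30.0, clock=time.monotonic):
        self.backend = backend        # hypothetical object with get()/set()
        self.dead_retry = dead_retry  # assumed cool-down, in seconds
        self.clock = clock            # injectable so tests can control time
        self.dead_until = 0.0         # backend is usable once clock passes this

    def _available(self):
        return self.clock() >= self.dead_until

    def _mark_dead(self):
        self.dead_until = self.clock() + self.dead_retry

    def get(self, key, default=None):
        if not self._available():
            return default            # fail fast instead of waiting on connect
        try:
            return self.backend.get(key)
        except OSError:
            self._mark_dead()
            return default

    def set(self, key, value):
        if not self._available():
            return False              # value is not stored while blocked
        try:
            self.backend.set(key, value)
            return True
        except OSError:
            self._mark_dead()
            return False
```

With such a guard, only the first request after a failure pays the
connection timeout; later requests inside the cool-down window return
immediately. Whether a skipped backend is acceptable for token storage is a
separate question that this sketch does not address.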

** Affects: keystone
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1332058

Title:
  keystone behavior when one memcache backend is down

Status in OpenStack Identity (Keystone):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1332058/+subscriptions

Follow ups

[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Thierry Carrez, 2014-09-30
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Dolph Mathews, 2014-09-25
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Yuriy Taraday, 2014-09-08
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: OpenStack Infra, 2014-09-05
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Vladimir Kuklin, 2014-08-22
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Bogdan Dobrelya, 2014-08-21
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Bogdan Dobrelya, 2014-08-21
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Dolph Mathews, 2014-08-20
[Bug 1332058] Re: keystone behavior when one memcache backend is down
From: Bogdan Dobrelya, 2014-08-05
[Bug 1332058] [NEW] keystone behavior when one memcache backend is down
From: Sergii Golovatiuk, 2014-06-19