yahoo-eng-team team mailing list archive
Message #16397
[Bug 1332058] [NEW] keystone behavior when one memcache backend is down
Public bug reported:
Hi,
Our implementation uses dogpile.cache.memcached as the backend for
tokens. Recently I found some interesting behavior when one of the
memcache backends went down: there is a 3-6 second delay when I try to
get a token. If I have 2 backends, the delay is 6-12 seconds. It is very
easy to reproduce.
Measure the baseline response time using
for i in {1..20}; do (time keystone token-get >> log2) 2>&1 | grep real | awk '{print $2}'; done
Block one memcache backend (this simulates a power outage of the node)
using
iptables -I INPUT -p tcp --dport 11211 -j DROP
Then measure the response time again using
for i in {1..20}; do (time keystone token-get >> log2) 2>&1 | grep real | awk '{print $2}'; done
I also ran strace on the keystone process with
strace -tt -s 512 -o /root/log1 -f -p PID
and got
26872 connect(9, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("10.108.2.3")}, 16) = -1 EINPROGRESS (Operation now in progress)
even though this IP is down.
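For context on the strace line: EINPROGRESS is the normal return value of connect() on a non-blocking socket and says nothing about whether the peer is reachable. With the DROP rule in place, neither a SYN-ACK nor a RST comes back, so the failure only shows up once a timeout expires. A minimal sketch of a bounded connection check follows; the function name can_connect and the 1 second default are illustrative assumptions, not something keystone or dogpile.cache provides.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection applies `timeout` to the connect attempt itself,
        # so a silently dropped SYN costs at most `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers both refused connections and timeouts.
        return False
```

Called as can_connect("10.108.2.3", 11211), this would return False after about one second while the iptables rule is active, instead of waiting for a longer default timeout.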
I also checked the code
https://github.com/openstack/keystone/blob/master/keystone/common/kvs/core.py#L210-L237
https://github.com/openstack/keystone/blob/master/keystone/common/kvs/core.py#L285-L289
https://github.com/openstack/keystone/blob/master/keystone/common/kvs/backends/memcached.py#L96
and was not able to find any details on how keystone treats a backend
when it is down.
There should be logic that temporarily blocks a backend when it is not
accessible. After a timeout period, the backend should be probed
(without blocking get/set operations on the remaining backends), and if
the connection succeeds it should be put back into operation. Here is a
sample of how it could be implemented:
http://dogpilecache.readthedocs.org/en/latest/usage.html#changing-backend-behavior
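The blocking logic proposed above can be sketched in plain Python. This is only an illustration under stated assumptions: DeadHostTracker, first_success, and the dead_retry parameter are hypothetical names, and none of this is keystone or dogpile.cache code. A backend that raises OSError is skipped for dead_retry seconds; after that it becomes eligible again.

```python
import time


class DeadHostTracker:
    """Hypothetical helper: remembers which backends failed recently."""

    def __init__(self, dead_retry=30.0, clock=time.monotonic):
        self.dead_retry = dead_retry   # seconds to skip a failed backend
        self._clock = clock            # injectable so the logic is testable
        self._dead_until = {}          # host -> time it may be retried

    def mark_dead(self, host):
        self._dead_until[host] = self._clock() + self.dead_retry

    def mark_alive(self, host):
        self._dead_until.pop(host, None)

    def is_usable(self, host):
        deadline = self._dead_until.get(host)
        return deadline is None or self._clock() >= deadline

    def usable(self, hosts):
        return [h for h in hosts if self.is_usable(h)]


def first_success(tracker, hosts, operation):
    """Run operation(host) on the first backend that answers."""
    for host in tracker.usable(hosts):
        try:
            result = operation(host)
        except OSError:
            # Pay the connection cost once, then skip this host for a while.
            tracker.mark_dead(host)
            continue
        tracker.mark_alive(host)
        return result
    raise ConnectionError("no usable backend")
```

Note that in this sketch the first request after dead_retry expires still pays the connect timeout if the backend is still down. To avoid blocking get/set operations, as the report asks, the probe would have to run outside the request path (for example in a background thread) and call mark_alive itself.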
** Affects: keystone
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1332058
Follow ups
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Thierry Carrez, 2014-09-30
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Dolph Mathews, 2014-09-25
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Yuriy Taraday, 2014-09-08
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: OpenStack Infra, 2014-09-05
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Vladimir Kuklin, 2014-08-22
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Bogdan Dobrelya, 2014-08-21
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Bogdan Dobrelya, 2014-08-21
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Dolph Mathews, 2014-08-20
- [Bug 1332058] Re: keystone behavior when one memcache backend is down
  From: Bogdan Dobrelya, 2014-08-05
- [Bug 1332058] [NEW] keystone behavior when one memcache backend is down
  From: Sergii Golovatiuk, 2014-06-19