yahoo-eng-team team mailing list archive
Message #29446
[Bug 1255594] Re: neutron glue code creates tokens excessively, still
** Changed in: nova/icehouse
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1255594
Title:
neutron glue code creates tokens excessively, still
Status in OpenStack Compute (Nova):
Fix Released
Status in OpenStack Compute (nova) icehouse series:
Fix Released
Bug description:
Reusing keystone tokens improves OpenStack efficiency and performance.
For operations that require a token, reusing a token avoids the
overhead of a request to keystone. For operations that validate
tokens, reused tokens improve the hit rate of authentication caches
(e.g., in keystoneclient.middleware). In both cases, the load on the
keystone server is reduced, thus improving the response time for
requests that do require new tokens or token validation. Finally,
since token validation is so CPU-intensive, improved auth cache hit
rate can significantly reduce CPU utilization by keystone.
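The cache effect described above can be illustrated with a toy model. This is only a sketch: it assumes a validation cache keyed by token ID, ignores expiry and eviction, and the helper name count_validations and the count of 7 requests are made up for illustration; it is not the keystoneclient.middleware implementation.

```python
def count_validations(token_ids):
    """Return (hits, misses) for a toy validation cache keyed by token ID."""
    cache = set()
    hits = misses = 0
    for token_id in token_ids:
        if token_id in cache:
            hits += 1
        else:
            # In this model, a miss stands for a validation request to keystone.
            misses += 1
            cache.add(token_id)
    return hits, misses


# A fresh token per request: every validation misses the cache.
fresh = count_validations("tok-%d" % i for i in range(7))

# One token reused for every request: only the first validation misses.
reused = count_validations("tok-0" for _ in range(7))

print("fresh tokens  (hits, misses):", fresh)
print("reused token  (hits, misses):", reused)
```

In this model, each miss corresponds to work that keystone has to do, which is why token reuse on the client side shows up as lower load on the server side.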
In spite of the progress made by
http://github.com/openstack/nova/commit/85332012dede96fa6729026c2a90594ea0502ac5,
which was committed to address bug #1250580, the neutronv2 network API
code in nova-compute creates more tokens than necessary, to the point
where performance degradation is measurable when creating a large
number of instances.
Prior to the aforementioned change, nova-compute created a new admin
token for accessing neutron virtually every time a call was made into
nova.network.neutronv2. With the aforementioned change, a token is
created once per "thread" (i.e., green thread); thus multiple calls
into neutronv2 can share a token. For example, during instance
creation, a single token is created then reused 6 times; prior to the
patch, 7 tokens would have been created by nova.network.neutronv2 per
"nova boot". However, this scheme is far from optimal. Given that
tokens, by default, have a shelf life of 24 hours, a single token could be
shared by _all_ nova.network.neutronv2 calls in a 24-hour period.
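A minimal sketch of the sharing scheme suggested above, assuming a process-wide cache that hands out one admin token until shortly before it expires. SharedTokenCache, fetch_token, and the 60-second refresh margin are hypothetical names and values chosen for illustration; this is not the code in nova.network.neutronv2, and the stand-in fake_fetch replaces the real request to keystone.

```python
import threading
import time


class SharedTokenCache:
    """Process-wide cache for a single admin token (hypothetical helper)."""

    def __init__(self, fetch_token, clock=time.time, margin=60.0):
        # fetch_token() must return (token_id, expires_at_epoch_seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin  # refresh this many seconds before expiry
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a new one only when needed."""
        with self._lock:
            expired = self._clock() >= self._expires_at - self._margin
            if self._token is None or expired:
                self._token, self._expires_at = self._fetch_token()
            return self._token

    def invalidate(self):
        """Drop the cached token, e.g. after the remote service rejects it."""
        with self._lock:
            self._token = None


# Usage with a stand-in for the keystone request and a fake clock.
calls = []
now = [0.0]


def fake_fetch():
    calls.append(1)
    return "token-%d" % len(calls), now[0] + 24 * 3600


cache = SharedTokenCache(fake_fetch, clock=lambda: now[0])
tokens = [cache.get() for _ in range(7)]
print(len(calls), "token request(s) for", len(tokens), "calls")
```

The invalidate() method is there because a cached token can be revoked before its expiry time; a caller that sees an authentication failure would drop the token and retry once with a fresh one.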
The performance impact of sharing a single neutronv2 admin token is
easy to observe when creating a large number of instances in parallel.
In this example, I boot 40 instances in parallel, ping them, then
delete them. I'm using a 24-core machine with enough RAM and disk
throughput to never become bottlenecks. Note that I'm running with
multiple keystone-all worker processes
(https://review.openstack.org/#/c/42967/). Using the per-thread
tokens, the last instance becomes active after 40s and the last
instance is deleted after 65s. Using a single shared token, the last
instance becomes active after 32s and the last instance is deleted
after 60s. During the token-per-thread run, keystone-all processes had
900% CPU utilization (i.e., 9 x 100% of a single core) for the first
~10s, then stayed in the 50-100% range for the rest of the run. In the
single token run, the keystone-all processes never exceeded 150% CPU
utilization.
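For reference, the relative improvements implied by the numbers above can be worked out as follows. The helper name reduction is made up for this sketch, and the CPU figure compares the 900% peak with the 150% ceiling, so it is a lower bound on the reduction in peak utilization.

```python
def reduction(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Time until the last instance becomes active: 40s vs. 32s.
boot = reduction(40, 32)

# Time until the last instance is deleted: 65s vs. 60s.
delete = reduction(65, 60)

# Peak keystone-all CPU utilization: 900% vs. at most 150%.
peak_cpu = reduction(900, 150)

print("boot: %.1f%%  delete: %.1f%%  peak CPU: %.1f%%" % (boot, delete, peak_cpu))
```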
I focused on nova.network.neutronv2 because it created the most
tokens during my parallel boot experiment. However, there are other
excessive token offenders. After fixing nova.network.neutronv2, the
leading auth requestors are glance-index and glance-registry due to a
high auth cache miss rate. I'm not sure who's creating those new
tokens, however.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1255594/+subscriptions