yahoo-eng-team team mailing list archive

[Bug 1255594] Re: neutron glue code creates tokens excessively, still

 

** Changed in: nova
       Status: Fix Committed => Fix Released

** Changed in: nova
    Milestone: None => juno-1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1255594

Title:
  neutron glue code creates tokens excessively, still

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  Reusing keystone tokens improves OpenStack efficiency and performance.
  For operations that require a token, reusing a token avoids the
  overhead of a request to keystone. For operations that validate
  tokens, reused tokens improve the hit rate of authentication caches
  (e.g., in keystoneclient.middleware). In both cases, the load on the
  keystone server is reduced, thus improving the response time for
  requests that do require new tokens or token validation. Finally,
  since token validation is so CPU intensive, improved auth cache hit
  rate can significantly reduce CPU utilization by keystone.

  In spite of the progress made by
  http://github.com/openstack/nova/commit/85332012dede96fa6729026c2a90594ea0502ac5,
  which was committed to address bug #1250580, the neutronv2 network API
  code in nova-compute creates more tokens than necessary, to the point
  where performance degradation is measurable when creating a large
  number of instances.

  Prior to the aforementioned change, nova-compute created a new admin
  token for accessing neutron virtually every time a call was made into
  nova.network.neutronv2. With the aforementioned change, a token is
  created once per "thread" (i.e., green thread); thus multiple calls
  into neutronv2 can share a token. For example, during instance
  creation, a single token is created then reused 6 times; prior to the
  patch, 7 tokens would have been created by nova.network.neutronv2 per
  "nova boot". However, this scheme is far from optimal. Given that
  tokens, by default, have a lifetime of 24 hours, a single token could be
  shared by _all_ nova.network.neutronv2 calls in a 24-hour period.
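
  A minimal sketch of the sharing scheme described above, not the actual
  nova code: one process-wide cache hands out the same admin token until
  shortly before it expires. SharedTokenCache, fetch_token, and margin_s
  are hypothetical names introduced for illustration; the 24-hour
  lifetime is the default mentioned above. The lock assumes ordinary
  threads; under green threads the locking primitive would depend on how
  the service is patched.

```python
import threading
import time


class SharedTokenCache:
    """Hands out one cached token until shortly before it expires."""

    def __init__(self, fetch_token, lifetime_s=24 * 3600, margin_s=60,
                 clock=time.monotonic):
        # fetch_token is a hypothetical callable that asks keystone for a
        # new admin token; it is an assumption, not a real nova API.
        self._fetch_token = fetch_token
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s  # refresh this long before expiry
        self._clock = clock
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a new one only when needed."""
        with self._lock:
            now = self._clock()
            if self._token is None or now >= self._expires_at - self._margin_s:
                self._token = self._fetch_token()
                self._expires_at = now + self._lifetime_s
            return self._token


# Usage: count how often the stand-in "keystone" is asked for a token.
requests_made = []


def fake_fetch_token():
    requests_made.append(1)
    return "token-%d" % len(requests_made)


cache = SharedTokenCache(fake_fetch_token)
# Seven lookups, mirroring the 7 neutronv2 token uses in one "nova boot".
tokens = [cache.get() for _ in range(7)]
print(len(requests_made))
```

  With this sketch, the seven lookups trigger a single token request,
  and later calls from any thread keep reusing that token until the
  refresh margin is reached.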

  The performance impact of sharing a single neutronv2 admin token is
  easy to observe when creating a large number of instances in parallel.
  In this example, I boot 40 instances in parallel, ping them, then
  delete them. I'm using a 24-core machine with enough RAM and disk
  throughput that neither becomes a bottleneck. Note that I'm running with
  multiple keystone-all worker processes
  (https://review.openstack.org/#/c/42967/). Using the per-thread
  tokens, the last instance becomes active after 40s and the last
  instance is deleted after 65s. Using a single shared token, the last
  instance becomes active after 32s and the last instance is deleted
  after 60s. During the token-per-thread run, keystone-all processes had
  900% CPU utilization (i.e., 9 x 100% of a single core) for the first
  ~10s, then stayed in the 50-100% range for the rest of the run. In the
  single token run, the keystone-all processes never exceeded 150% CPU
  utilization.
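
  To put the timings above in relative terms, a quick calculation using
  only the numbers reported in this run (percent_reduction is a helper
  name chosen here for illustration):

```python
def percent_reduction(before_s, after_s):
    """Relative reduction in elapsed time, as a percentage of the baseline."""
    return 100.0 * (before_s - after_s) / before_s


# Per-thread tokens vs. a single shared token, from the run described above.
active = percent_reduction(40, 32)    # time until the last instance is active
deleted = percent_reduction(65, 60)   # time until the last instance is deleted

# Peak keystone-all CPU: 900% per-thread vs. at most 150% shared.
peak_cpu_ratio = 900 / 150

print("last active:  %.1f%% less time" % active)    # 20.0% less time
print("last deleted: %.1f%% less time" % deleted)   # 7.7% less time
print("peak CPU:     at least %.0fx lower" % peak_cpu_ratio)  # at least 6x lower
```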

  I focused on nova.network.neutronv2 because it created the most
  tokens during my parallel boot experiment. However, there are other
  excessive token offenders. After fixing nova.network.neutronv2, the
  leading auth requestors are glance-index and glance-registry due to a
  high auth cache miss rate. I'm not sure what is creating those new
  tokens, however.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1255594/+subscriptions