
yahoo-eng-team team mailing list archive

[Bug 1597357] [NEW] When keystone is slow to respond: getting user fails

 

Public bug reported:

To test whether a user exists, we query the Keystone database using

    openstack user show 'foo' ...

If the user doesn't exist, we get an error.  With the usual retry logic
of the openstack lib, we would wait the full request_timeout before that
error is reported.  This is currently ~170s.  So that is 170s times the
number of users in the catalog!
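
To make the cost concrete, here is a minimal sketch of a retry loop that keeps
trying until the time budget is spent.  It is not the openstack lib code: the
names `CommandFailed` and `seconds_wasted_retrying`, the 10s retry interval and
the injectable `sleeper` are all assumptions made for illustration.

```ruby
# Hypothetical error standing in for a failed `openstack user show`.
class CommandFailed < StandardError; end

# Retry the block until it succeeds or the time budget is spent, and
# return the number of seconds spent waiting.  `sleeper` is injectable
# so the sketch can run without really sleeping (an assumption, not
# something the real library is known to offer).
def seconds_wasted_retrying(request_timeout:, retry_interval:, sleeper: method(:sleep))
  waited = 0
  begin
    yield
  rescue CommandFailed
    if waited + retry_interval <= request_timeout
      sleeper.call(retry_interval)
      waited += retry_interval
      retry
    end
  end
  waited
end

# Three users that do not exist, each lookup failing on every attempt.
missing_users = %w[alice bob carol]
total = missing_users.sum do |_name|
  seconds_wasted_retrying(request_timeout: 170, retry_interval: 10,
                          sleeper: ->(_seconds) {}) do
    raise CommandFailed, 'user not found'
  end
end
puts total # 3 * 170 = 510
```

With a no-op `sleeper` the sketch finishes immediately; with the default it
would really wait the full 170s for each user.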

To overcome this, the call is wrapped inside an outer function that
does not retry [1].
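
The idea of a no-retry wrapper can be sketched roughly as follows.  This is
not the code at [1]; `CommandFailed`, `user_exists?` and the fake `lookup`
lambda are hypothetical names used only to show the shape of the approach:
one attempt, and a failure is read as "the user is absent".

```ruby
# Hypothetical error standing in for a failed `openstack user show`.
class CommandFailed < StandardError; end

# One attempt only: a failed lookup is interpreted as "user absent",
# so a missing user is detected without waiting for any retry budget.
def user_exists?(name)
  yield name
  true
rescue CommandFailed
  false
end

# Fake lookup standing in for the real CLI call (assumption).
known = %w[admin]
lookup = lambda do |name|
  known.include?(name) or raise CommandFailed, "user not found: #{name}"
end

puts user_exists?('admin', &lookup) # true
puts user_exists?('foo', &lookup)   # false
```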

The problem is that on very slow platforms a legitimate timeout can
occur; this is especially true for CI.  Here is an example of such a
failure:

    Error: /Stage[main]/Keystone::Roles::Admin/Keystone_user[admin]:
    Could not evaluate: Command: 'openstack ["user", "show", "--format",
    "shell", ["admin", "--domain", "default"]]' has been running for more
    then 20 seconds (tried 0, for a total of 0 seconds)

From http://logs.openstack.org/58/322858/11/check-tripleo/gate-tripleo-ci-centos-7-ha/7e5b0a6/logs/postci.txt.gz
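
The failure mode can be reproduced in miniature with Ruby's standard
`Timeout` module.  This is only a sketch under assumptions:
`lookup_with_deadline` is a hypothetical name, the `sleep` stands in for a
slow Keystone, and the limits are scaled down from the 20 seconds in the log
so the example runs quickly.

```ruby
require 'timeout'

# Hypothetical single-attempt lookup with a hard time limit.  The block
# stands in for the real `openstack user show` call.
def lookup_with_deadline(limit_seconds)
  Timeout.timeout(limit_seconds) { yield }
  :found
rescue Timeout::Error
  # The user may well exist: the service was only slow.  With a single
  # attempt there is nothing left to try, so the run fails.
  :timed_out
end

# A fast service answers within the limit.
puts lookup_with_deadline(1) { :admin }        # found

# A slow service (simulated with sleep) exceeds the limit even though
# the user is there.
puts lookup_with_deadline(0.1) { sleep 0.5 }   # timed_out
```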


[1] https://github.com/openstack/puppet-keystone/blob/master/lib/puppet/provider/keystone_user/openstack.rb#L81

** Affects: puppet-keystone
     Importance: High
     Assignee: Sofer Athlan-Guyot (sofer-athlan-guyot)
         Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1597357

To manage notifications about this bug go to:
https://bugs.launchpad.net/puppet-keystone/+bug/1597357/+subscriptions

