yahoo-eng-team team mailing list archive
Message #67914
[Bug 1657130] Re: get_data in DataSourceOpenStack.py can time out if metadata service is slow
This bug is believed to be fixed in cloud-init in 17.1. If this is still
a problem for you, please make a comment and set the state back to New.
Thank you.
** Changed in: cloud-init
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1657130
Title:
get_data in DataSourceOpenStack.py can time out if metadata service is
slow
Status in cloud-init:
Fix Released
Status in cloud-init package in Ubuntu:
Fix Released
Status in cloud-init source package in Xenial:
Fix Released
Status in cloud-init source package in Yakkety:
Fix Released
Bug description:
=== Begin SRU Template ===
[Impact]
On heavily loaded openstack metadata services, cloud-init may hit a timeout
and not properly retry when waiting longer or retrying would allow it to
succeed.
cloud-init contained a setting to configure this but it was not used in all
cases. The change here enabled usage of timeout and retry for.
[Test Case]
1. Launch an instance on openstack.
2. Verify inconsistent use of 'timeout' in /var/log/cloud-init.log
$ grep http://169.254.169.254/openstack /var/log/cloud-init.log | grep 0/ | head -n 2
2017-03-03 16:51:23,824 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'headers': {'User-Agent': 'Cloud-Init/0.7.9'}, 'method': 'GET', 'timeout': 10.0} configuration
2017-03-03 16:51:24,384 - url_helper.py[DEBUG]: [0/6] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'headers': {'User-Agent': 'Cloud-Init/0.7.9'}, 'method': 'GET', 'timeout': 5.0} configuration
3. Enable proposed, then update and upgrade.
4. Clean:
   rm -Rf /var/lib/cloud /var/log/cloud-init*
5. Reboot.
6. Re-check step 2; expect to see that 'timeout' is consistent.
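The check in steps 2 and 6 can also be scripted. The following is a minimal sketch, assuming the url_helper.py debug line format shown in step 2. The names `timeouts_for_url` and `is_consistent` are made up for this illustration and are not part of cloud-init.

```python
import re

# Matches the "'timeout': <number>" entry in a url_helper.py debug line.
TIMEOUT_RE = re.compile(r"'timeout': ([0-9.]+)")


def timeouts_for_url(log_lines, url="http://169.254.169.254/openstack"):
    """Collect timeout values from first-attempt ([0/N]) requests to url."""
    found = []
    for line in log_lines:
        # Roughly the same filter as the grep pipeline in step 2:
        # the line must mention the URL and a "[0/" attempt marker.
        if url not in line or "[0/" not in line:
            continue
        match = TIMEOUT_RE.search(line)
        if match:
            found.append(float(match.group(1)))
    return found


def is_consistent(timeouts):
    """True when at least one request was seen and all used one timeout."""
    return len(timeouts) > 0 and len(set(timeouts)) == 1
```

To run it against a real instance, pass the lines of /var/log/cloud-init.log, for example `is_consistent(timeouts_for_url(open("/var/log/cloud-init.log")))`. Before the fix the two sample lines in step 2 would be reported as inconsistent.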
[Regression Potential]
Low chance of regression. Boot times may be slower, but booting is more reliable
on a non-performant metadata service.
=== End SRU Template ===
cloud-init sometimes times out and fails to fetch metadata in the
OpenStack environment when the Controller node is under high workload.
The default timeout value is 5 seconds and it may be too small in some
cases where the Controller node is too busy to respond to the metadata
request from the instance in time.
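To illustrate why a longer timeout or a retry can turn a failure into a success, here is a minimal sketch of a retry loop. It is not cloud-init code: `fetch_with_retries` and the injected `fetch` callable are hypothetical, and a real fetch would raise whatever timeout exception its HTTP library uses.

```python
import time


def fetch_with_retries(fetch, timeout=5.0, retries=5, wait=0.0):
    """Call fetch(timeout) up to retries + 1 times; return the first result.

    fetch is any callable that takes a timeout in seconds and raises
    TimeoutError when the service does not answer in time.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch(timeout)
        except TimeoutError as err:
            last_error = err
            # Pause between attempts, but not after the final one.
            if attempt < retries and wait > 0:
                time.sleep(wait)
    # Every attempt timed out; surface the last error to the caller.
    raise last_error
```

With `retries=0` a single slow response is fatal, which matches the failure described above; allowing more attempts or a larger `timeout` gives a busy metadata service time to answer.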
There is a 'timeout' configuration setting, as in...
    datasource:
      OpenStack:
        timeout: 30
...but this value is not used by the get_data method in
cloudinit/sources/DataSourceOpenStack.py, because get_data is called
from cloudinit/sources/__init__.py with no keyword arguments:
    LOG.debug("Seeing if we can get any data from %s", cls)
    s = cls(sys_cfg, distro, paths)
    if s.get_data():
        myrep.message = "found %s data from %s" % (mode, name)
        return (s, type_utils.obj_name(cls))
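One way to honor the setting even when get_data is called with no keyword arguments is to fall back to the datasource configuration inside the method. The following is only a sketch under that assumption and is not the change that landed in cloud-init: `SketchDataSource`, `DEFAULT_TIMEOUT`, and the injected `fetch` callable are hypothetical names.

```python
DEFAULT_TIMEOUT = 5  # seconds; the default mentioned in this report


class SketchDataSource:
    """Hypothetical stand-in for a datasource; not cloud-init's real class."""

    def __init__(self, sys_cfg, fetch):
        # ds_cfg mirrors the 'datasource: OpenStack:' section of the config.
        datasource_cfg = sys_cfg.get("datasource") or {}
        self.ds_cfg = datasource_cfg.get("OpenStack") or {}
        self.fetch = fetch  # injected callable: fetch(timeout=...) -> data
        self.data = None

    def get_data(self, timeout=None):
        # With no keyword argument, use the configured value before
        # falling back to the hard-coded default.
        if timeout is None:
            timeout = self.ds_cfg.get("timeout", DEFAULT_TIMEOUT)
        self.data = self.fetch(timeout=float(timeout))
        return self.data is not None
```

With this shape, a caller that runs `s.get_data()` with no arguments still picks up `timeout: 30` from the configuration, while an explicit `get_data(timeout=...)` continues to override it.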