yahoo-eng-team team mailing list archive
Message #62371
[Bug 1657130] Re: get_data in DataSourceOpenStack.py can time out if metadata service is slow
This bug was fixed in the package cloud-init -
0.7.9-48-g1c795b9-0ubuntu1~16.10.1
---------------
cloud-init (0.7.9-48-g1c795b9-0ubuntu1~16.10.1) yakkety; urgency=medium
* debian/rules: install Z99-cloudinit-warnings.sh to /etc/profile.d
* debian/patches/ds-identify-behavior-yakkety.patch: adjust default
behavior of ds-identify for SRU (LP: #1669675, #1660385).
* New upstream snapshot.
- Support warning if the used datasource is not in ds-identify's list
(LP: #1669675).
- DatasourceEc2: add warning message when not on AWS. (LP: #1660385)
- Z99-cloudinit-warnings: Add profile.d script for showing warnings on
- Z99-cloud-locale-test.sh: convert tabs to spaces, remove unnecessary
execute bit in permissions.
- (RedHat) net: correct errors in cloudinit/net/sysconfig.py
[Lars Kellogg-Stedman]
- ec2_utils: fix MetadataLeafDecoder that returned bytes on empty
- Fix eni rendering of multiple IPs per interface [Ryan Harper]
(LP: #1657940)
- Add 3 ecdsa-sha2-nistp* ssh key types now that they are standardized
[Lars Kellogg-Stedman]
- EC2: Do not cache security credentials on disk [Andrew Jorgensen]
(LP: #1638312)
- OpenStack: Use timeout and retries from config in get_data.
[Lars Kellogg-Stedman] (LP: #1657130)
- Fixed Misc issues related to VMware customization. [Sankar Tanguturi]
- (RedHat) Use dnf instead of yum when available [Lars Kellogg-Stedman]
- Get early logging logged, including failures of cmdline url.
- test / doc / build environment changes
- Remove style checking during build and add latest style checks to
tox [Joshua Powers]
- code-style: make master pass pycodestyle (2.3.1) cleanly, currently
[Joshua Powers]
- Fix small typo and change iso-filename for consistency
- tools/mock-meta: support python2 or python3 and ipv6 in both.
- tests: remove executable bit on test_net, so it runs, and fix it.
- tests: No longer monkey patch httpretty for python 3.4.2
- reset httpretty for each test [Lars Kellogg-Stedman]
- build: fix running Make on a branch with tags other than master
- doc: Fix typos and clarify some aspects of the part-handler
[Erik M. Bray]
- doc: add some documentation on OpenStack datasource.
- Fix minor docs typo: perserve > preserve [Jeremy Bicha]
- validate-yaml: use python rather than explicitly python3
-- Scott Moser <smoser@xxxxxxxxxx> Mon, 06 Mar 2017 16:37:28 -0500
** Changed in: cloud-init (Ubuntu Yakkety)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1657130
Title:
get_data in DataSourceOpenStack.py can time out if metadata service is
slow
Status in cloud-init:
Fix Committed
Status in cloud-init package in Ubuntu:
Fix Released
Status in cloud-init source package in Xenial:
Fix Released
Status in cloud-init source package in Yakkety:
Fix Released
Bug description:
=== Begin SRU Template ===
[Impact]
On heavily loaded OpenStack metadata services, cloud-init may hit a timeout
and not properly retry when waiting longer or retrying would allow it to
succeed.
cloud-init contained a setting to configure this but it was not used in all
cases. The change here enabled usage of timeout and retry for.
[Test Case]
1. Launch an instance on openstack.
2. Verify inconsistent use of 'timeout' in /var/log/cloud-init.log
$ grep http://169.254.169.254/openstack /var/log/cloud-init.log | grep 0/ | head -n 2
2017-03-03 16:51:23,824 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'headers': {'User-Agent': 'Cloud-Init/0.7.9'}, 'method': 'GET', 'timeout': 10.0} configuration
2017-03-03 16:51:24,384 - url_helper.py[DEBUG]: [0/6] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'headers': {'User-Agent': 'Cloud-Init/0.7.9'}, 'method': 'GET', 'timeout': 5.0} configuration
3. enable proposed, update, upgrade
4. clean
rm -Rf /var/lib/cloud /var/log/cloud-init*
5. reboot
6. Re-check step 2; expect to see that 'timeout' is consistent.
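The check in steps 2 and 6 can also be scripted. The following is a minimal sketch, assuming the url_helper.py debug line format shown in step 2; the function names `first_attempt_timeouts` and `timeouts_consistent` are hypothetical helpers for this illustration and are not part of cloud-init.

```python
import re

# Matches first-attempt lines such as:
#   ... url_helper.py[DEBUG]: [0/6] open 'http://...' with {..., 'timeout': 5.0} configuration
# The format is taken from the sample log lines in step 2; other
# cloud-init versions may log differently (assumption).
FIRST_ATTEMPT = re.compile(
    r"\[0/\d+\] open '(?P<url>[^']+)' with \{.*'timeout': (?P<timeout>[0-9.]+)"
)


def first_attempt_timeouts(lines):
    """Map each requested URL to the set of timeout values seen for it."""
    seen = {}
    for line in lines:
        match = FIRST_ATTEMPT.search(line)
        if match:
            seen.setdefault(match.group("url"), set()).add(
                float(match.group("timeout"))
            )
    return seen


def timeouts_consistent(lines):
    """Return True if every URL was always requested with a single timeout."""
    return all(len(values) == 1 for values in first_attempt_timeouts(lines).values())


if __name__ == "__main__":
    with open("/var/log/cloud-init.log") as log:
        print("consistent" if timeouts_consistent(log) else "inconsistent")
```

Run against the two sample lines from step 2, this would report the timeouts as inconsistent, since the same URL appears with both 10.0 and 5.0.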
[Regression Potential]
Low chance of regression. Boot times may be slower, but booting is more
reliable on a non-performant metadata service.
=== End SRU Template ===
cloud-init sometimes times out and fails to fetch metadata in the
OpenStack environment when the Controller node is under high workload.
The default timeout value is 5 seconds and it may be too small in some
cases where the Controller node is too busy to respond to the metadata
request from the instance in time.
There is a 'timeout' configuration setting, as in...
  datasource:
    OpenStack:
      timeout: 30
...but this value is not used by the get_data method in
cloudinit/sources/DataSourceOpenStack.py, because get_data is called
from cloudinit/sources/__init__.py with no keyword arguments:
    LOG.debug("Seeing if we can get any data from %s", cls)
    s = cls(sys_cfg, distro, paths)
    if s.get_data():
        myrep.message = "found %s data from %s" % (mode, name)
        return (s, type_utils.obj_name(cls))
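Because get_data() is called with no arguments, one way to honor the setting is for the datasource to read 'timeout' and 'retries' from its own configuration inside get_data. The following is a minimal sketch of that idea only; it is not the actual cloud-init patch. `SketchDataSource`, `get_url_params`, and `DEFAULT_RETRIES` are hypothetical names, and the retry default is an assumption made for illustration.

```python
DEFAULT_TIMEOUT = 5.0  # default timeout stated in the bug description
DEFAULT_RETRIES = 5    # assumption for this sketch


def get_url_params(ds_cfg):
    """Return (timeout, retries) from datasource config, with fallbacks."""
    try:
        timeout = max(0.0, float(ds_cfg.get("timeout", DEFAULT_TIMEOUT)))
    except (TypeError, ValueError):
        # Unparseable value: fall back rather than fail the boot.
        timeout = DEFAULT_TIMEOUT
    try:
        retries = max(0, int(ds_cfg.get("retries", DEFAULT_RETRIES)))
    except (TypeError, ValueError):
        retries = DEFAULT_RETRIES
    return timeout, retries


class SketchDataSource:
    """Hypothetical stand-in for a datasource class; not cloud-init code."""

    def __init__(self, sys_cfg):
        # Mirrors the 'datasource: OpenStack: timeout: 30' layout above.
        self.ds_cfg = sys_cfg.get("datasource", {}).get("OpenStack", {})
        self.url_params = None

    def get_data(self):
        # Takes no keyword arguments, so the values come from config.
        timeout, retries = get_url_params(self.ds_cfg)
        self.url_params = {"timeout": timeout, "retries": retries}
        # A real datasource would now fetch metadata using these values.
        return True
```

With the configuration shown above, SketchDataSource({"datasource": {"OpenStack": {"timeout": 30}}}) would use a 30-second timeout and the fallback retry count after get_data() runs.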
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1657130/+subscriptions