cloud-init-dev team mailing list archive

[Merge] ~dojordan/cloud-init:azuretimeouts into cloud-init:master

 

Douglas Jordan has proposed merging ~dojordan/cloud-init:azuretimeouts into cloud-init:master.

Requested reviews:
  cloud-init commiters (cloud-init-dev)

For more details, see:
https://code.launchpad.net/~dojordan/cloud-init/+git/cloud-init/+merge/340546

During Azure preprovisioning, the platform-owned VM enters a polling loop while it waits for the reprovisioning data file (the customer's ovf_env.xml). If the VM moves from one vnet to another, the request to IMDS raises a timeout exception, and we need to re-DHCP to get the new IP and network configuration. That timeout is currently 60 seconds. Further testing shows that IMDS responds within 5-20 milliseconds, so we propose lowering the timeout to 1 second to reduce the customer's total deployment time.

In addition, IMDS may be temporarily unreachable during server updates. If it is, we could re-DHCP multiple times and hit the current limit of 5 retries (IMDS_RETRIES). That would leave the VM in an effectively useless state from which it cannot recover. We propose removing the maximum DHCP retry count for preprovisioning.
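
As a rough illustration of the proposed behavior, here is a minimal, self-contained sketch of an unbounded re-DHCP loop with a short request timeout. It is not the cloud-init code: `poll_until_ready`, `acquire_lease`, `fetch`, `make_fake_fetch` and `RequestTimedOut` are hypothetical names that stand in for the EphemeralDHCPv4 context and the IMDS request, and the fake fetch simply fails a fixed number of times.

```python
class RequestTimedOut(Exception):
    """Hypothetical stand-in for a timeout on the IMDS request."""


def poll_until_ready(acquire_lease, fetch, request_timeout=1.0):
    """Re-acquire a lease and poll until fetch() returns data.

    acquire_lease and fetch are hypothetical callables, not cloud-init
    APIs. Returns the fetched data and the number of attempts made.
    """
    attempts = 0
    while True:  # no retry cap: giving up would leave the VM unrecoverable
        attempts += 1
        lease = acquire_lease()  # fresh DHCP lease on every attempt
        try:
            return fetch(lease, timeout=request_timeout), attempts
        except RequestTimedOut:
            continue  # IMDS unreachable or vnet changed: re-DHCP and retry


def make_fake_fetch(failures):
    """Build a fetch() that times out `failures` times, then succeeds."""
    remaining = [failures]

    def fetch(lease, timeout):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RequestTimedOut("no response within %.1fs" % timeout)
        return "<ovf-env/>"  # placeholder payload, not real reprovision data

    return fetch


# Seven consecutive failures exceed the old limit of 5 but still recover.
data, attempts = poll_until_ready(lambda: "lease", make_fake_fetch(7))
```

In this sketch, seven failed requests cost roughly 7 seconds of timeout waiting at a 1-second timeout, compared with roughly 420 seconds at 60 seconds, ignoring the time spent on DHCP itself.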
-- 
Your team cloud-init commiters is requested to review the proposed merge of ~dojordan/cloud-init:azuretimeouts into cloud-init:master.
diff --git a/cloudinit/sources/DataSourceAzure.py b/cloudinit/sources/DataSourceAzure.py
index 4bcbf3a..0e8fd65 100644
--- a/cloudinit/sources/DataSourceAzure.py
+++ b/cloudinit/sources/DataSourceAzure.py
@@ -49,7 +49,6 @@ DEFAULT_FS = 'ext4'
 AZURE_CHASSIS_ASSET_TAG = '7783-7084-3265-9085-8269-3286-77'
 REPROVISION_MARKER_FILE = "/var/lib/cloud/data/poll_imds"
 IMDS_URL = "http://169.254.169.254/metadata/reprovisiondata"
-IMDS_RETRIES = 5
 
 
 def find_storvscid_from_sysctl_pnpinfo(sysctl_out, deviceid):
@@ -463,21 +462,20 @@ class DataSourceAzure(sources.DataSource):
             raise exception
 
         need_report = report_ready
-        for i in range(IMDS_RETRIES):
+        while True:
             try:
                 with EphemeralDHCPv4() as lease:
                     if need_report:
                         self._report_ready(lease=lease)
                         need_report = False
-                    wait_for_url([url], max_wait=None, timeout=60,
+                    wait_for_url([url], max_wait=None, timeout=1,
                                  status_cb=LOG.info,
                                  headers_cb=lambda url: headers, sleep_time=1,
                                  exception_cb=exception_cb,
                                  sleep_time_cb=sleep_cb)
                     return str(readurl(url, headers=headers))
             except Exception:
-                LOG.debug("Exception during polling-retrying dhcp" +
-                          " %d more time(s).", (IMDS_RETRIES - i),
+                LOG.debug("Exception during polling-retrying dhcp",
                           exc_info=True)
 
     def _report_ready(self, lease):
diff --git a/tests/unittests/test_datasource/test_azure.py b/tests/unittests/test_datasource/test_azure.py
index 254e987..a9e53ac 100644
--- a/tests/unittests/test_datasource/test_azure.py
+++ b/tests/unittests/test_datasource/test_azure.py
@@ -1170,7 +1170,7 @@ class TestAzureDataSourcePreprovisioning(CiTestCase):
                                     headers={'Metadata': 'true',
                                              'User-Agent':
                                              'Cloud-Init/%s' % vs()
-                                             }, method='GET', timeout=60.0,
+                                             }, method='GET', timeout=1,
                                     url=full_url),
                           mock.call(allow_redirects=True,
                                     headers={'Metadata': 'true',
@@ -1212,7 +1212,7 @@ class TestAzureDataSourcePreprovisioning(CiTestCase):
                                     headers={'Metadata': 'true',
                                              'User-Agent':
                                              'Cloud-Init/%s' % vs()},
-                                    method='GET', timeout=60.0, url=full_url),
+                                    method='GET', timeout=1, url=full_url),
                           mock.call(allow_redirects=True,
                                     headers={'Metadata': 'true',
                                              'User-Agent':
