yahoo-eng-team team mailing list archive

[Bug 1835584] [NEW] cloud-init might incorrectly consider reboot as new-instance during kernel upgrade or downgrade

 

Public bug reported:

Between the 4.14 and 4.15 kernels, the commit below changed the casing
of a VM's product UUID from uppercase to lowercase. Data sources that
use this value as the instance-id (e.g., Azure) will go through the
new-instance code path on the first reboot after a kernel upgrade or
downgrade that crosses this change. This is problematic for customers
who provision with a password on Azure: because the password is not
saved on disk, new-instance provisioning disables password access to
the VM in that case.

Commit:
https://github.com/torvalds/linux/commit/712ff25450bd01366301eef81c33e865d901e7b7#diff-f2bd14bc67b5e2da67116bca971bbd0b

Repro Steps:
Deploy the latest 18.04-LTS VM on Azure (kernel version is 4.18.0-1023-Azure as of July 5th 2019).
Downgrade the kernel to 4.14.119 (using the .deb packages at https://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.119/).
Configure grub to boot into the 4.14 kernel.
Observe in the cloud-init log that a new-instance first boot is happening.

In my VM I can see that the casing of the product UUID changed:
4.18 kernel:
$ cat /sys/devices/virtual/dmi/id/product_uuid
1fd1b593-e79e-724c-9b33-d8634642d5f5
$ ls /var/lib/cloud/instances
1fd1b593-e79e-724c-9b33-d8634642d5f5

After downgrade:
$ cat /sys/devices/virtual/dmi/id/product_uuid
1FD1B593-E79E-724C-9B33-D8634642D5F5
$ ls /var/lib/cloud/instances
1FD1B593-E79E-724C-9B33-D8634642D5F5  1fd1b593-e79e-724c-9b33-d8634642d5f5
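
The two directory names above differ only in letter case. A minimal
sketch of why an exact string comparison treats them as different
instances, using the values from this report (same_uuid is a
hypothetical helper written for illustration, not a cloud-init
function):

```python
# Values copied from the report: the same VM under the 4.18 and 4.14 kernels.
uuid_new_kernel = "1fd1b593-e79e-724c-9b33-d8634642d5f5"
uuid_old_kernel = "1FD1B593-E79E-724C-9B33-D8634642D5F5"


def same_uuid(a, b):
    """Hypothetical helper: compare two UUID strings ignoring letter case
    and surrounding whitespace (sysfs reads end with a newline)."""
    return a.strip().lower() == b.strip().lower()


# Exact comparison sees two different strings.
exact_match = uuid_new_kernel == uuid_old_kernel

# Case-insensitive comparison sees the same UUID.
normalized_match = same_uuid(uuid_new_kernel, uuid_old_kernel)
```

Here exact_match is False and normalized_match is True, which matches
the two entries showing up under /var/lib/cloud/instances.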


DataSourceAzure.py already uses instance_id_matches_system_uuid, which
lowercases the UUID before comparing it, to check the instance-id:

def check_instance_id(self, sys_cfg):
    # quickly check (local check only) if self.instance_id is still valid
    return sources.instance_id_matches_system_uuid(self.get_instance_id())

However, the issue lies in stages.py's is_new_instance() method, which
does not lowercase the UUID before comparing it, so it returns True
when it should return False. This affects methods such as
apply_network_config, setup, and activate.
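
One possible direction, sketched under assumptions: the function below
is a hypothetical stand-in for the comparison described above, not the
actual stages.py code, and the names current_id and previous_id are
invented for illustration. It treats a missing previous id as a first
boot and otherwise compares the two ids case-insensitively.

```python
def is_new_instance(current_id, previous_id):
    """Hypothetical sketch, not cloud-init's implementation.

    current_id:  instance-id reported by the data source on this boot.
    previous_id: instance-id recorded on the previous boot, or None if
                 nothing was recorded (a genuine first boot).
    """
    if previous_id is None:
        return True
    # Normalize case so that a kernel-induced casing change in the
    # product UUID is not mistaken for a different instance.
    return current_id.strip().lower() != previous_id.strip().lower()


# Values from the report: the id recorded under 4.18, then read under 4.14.
recorded = "1fd1b593-e79e-724c-9b33-d8634642d5f5"
after_downgrade = "1FD1B593-E79E-724C-9B33-D8634642D5F5"

reboot_result = is_new_instance(after_downgrade, recorded)
first_boot_result = is_new_instance(after_downgrade, None)
```

With these inputs reboot_result is False and first_boot_result is True.
Lowercasing every instance-id may not suit data sources whose ids are
case-sensitive, so a real fix might need to restrict the normalization
to data sources that derive the id from the product UUID.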

** Affects: cloud-init
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1835584

Title:
  cloud-init might incorrectly consider reboot as new-instance during
  kernel upgrade or downgrade

Status in cloud-init:
  New
