yahoo-eng-team team mailing list archive

[Bug 1877491] Re: cc_grub_dpkg: determine idevs in a more robust manner with grub-probe

 

This bug is believed to be fixed in cloud-init in version 20.3. If this
is still a problem for you, please make a comment and set the state back
to New.

Thank you.

** Changed in: cloud-init
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1877491

Title:
  cc_grub_dpkg: determine idevs in a more robust manner with grub-probe

Status in cloud-init:
  Fix Released

Bug description:
  Currently, we populate the debconf database variable
  grub-pc/install_devices by checking whether any device in a hardcoded
  list [1] of device paths is present:

  - /dev/sda
  - /dev/vda
  - /dev/xvda
  - /dev/sda1
  - /dev/vda1
  - /dev/xvda1

  [1] https://github.com/canonical/cloud-init/blob/master/cloudinit/config/cc_grub_dpkg.py

  While this is a simple, elegant solution, the hardcoded list does not
  match real-world conditions, where grub may be installed to a disk
  that is not on this list.

  The primary example is any cloud which uses NVMe storage, such as AWS
  c5 instances.

  /dev/nvme0n1 is not on the above list, so in this case the module
  falls back to a hardcoded /dev/sda value for grub-pc/install_devices.
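
  To make the failure mode concrete, here is a minimal sketch of the
  lookup described above. It is not the actual cloud-init code: the
  function name pick_install_device and the injectable exists callback
  are assumptions made for illustration, and only the device list and
  the /dev/sda fallback come from this report.

```python
import os

# Device paths quoted in this report; the fallback is the value described above.
CANDIDATE_DEVICES = [
    "/dev/sda", "/dev/vda", "/dev/xvda",
    "/dev/sda1", "/dev/vda1", "/dev/xvda1",
]
FALLBACK_DEVICE = "/dev/sda"


def pick_install_device(candidates=CANDIDATE_DEVICES, exists=os.path.exists):
    """Return the first candidate that exists, else the hardcoded fallback.

    Hypothetical helper for illustration; not the real module code.
    """
    for dev in candidates:
        if exists(dev):
            return dev
    return FALLBACK_DEVICE


# Simulate an NVMe-only instance: none of the candidate paths exist,
# so the fallback names a disk that is not present on the system.
nvme_only = {"/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme1n1p1"}
print(pick_install_device(exists=lambda path: path in nvme_only))
```

  On the simulated NVMe-only system the sketch prints /dev/sda, a
  device that does not exist there.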

  The thing is, the grub postinstall script [2] checks to see if the
  value from grub-pc/install_devices exists, and if it doesn't, shows
  the user an interactive dpkg prompt where they must select the disk to
  install grub to. See the screenshot [3].

  [2] https://paste.ubuntu.com/p/5FChJxbk5K/
  [3] https://launchpadlibrarian.net/478771797/Screenshot%20from%202020-04-14%2014-39-11.png

  This breaks scripts that don't set DEBIAN_FRONTEND=noninteractive as
  they get hung waiting for the user to input a choice.
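
  For context, this is roughly what a script has to do today to avoid
  the prompt. It is only a sketch of the workaround, not the fix
  proposed here; the helper names noninteractive_env and
  run_apt_upgrade, and the choice of "apt-get -y upgrade" as the
  command, are assumptions for illustration.

```python
import os
import subprocess


def noninteractive_env(base=None):
    """Return a copy of an environment with DEBIAN_FRONTEND=noninteractive."""
    env = dict(os.environ if base is None else base)
    env["DEBIAN_FRONTEND"] = "noninteractive"
    return env


def run_apt_upgrade():
    """Hypothetical wrapper: run apt-get so that debconf does not prompt."""
    return subprocess.run(
        ["apt-get", "-y", "upgrade"],
        env=noninteractive_env(),
        check=True,
    )
```

  Scripts that call apt-get without such an environment are the ones
  that block on the interactive dialog.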

  I propose that we modify the cc_grub_dpkg module to be more robust at
  selecting the correct disk grub is installed to.

  Why not simply add an extra device to the hardcoded list?

  Let's take NVMe storage as an example again. On a c5d.large instance I
  spun up just now, lsblk returns:

  $ lsblk
  NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
  nvme0n1     259:0    0 46.6G  0 disk
  nvme1n1     259:1    0    8G  0 disk
  └─nvme1n1p1 259:2    0    8G  0 part /

  We cannot hardcode /dev/nvme0n1, as the NVMe naming conventions are
  not stable in the kernel: on some boots the 8G disk will be
  /dev/nvme0n1, and on others it will be /dev/nvme1n1.

  Instead, I propose that we determine which disk grub has been
  installed to by following the grub2 debian/postinst.in script, and
  implementing the algorithm behind its usable_partitions(),
  device_to_id() and available_ids() functions [4].

  [4] https://paste.ubuntu.com/p/vKFNSwNyhP/

  This uses grub-probe to find the root disk where the /boot directory
  is located, and then turns the disk name into a /dev/disk/by-id/
  value.

  This is robust to unstable kernel device naming conventions.
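
  A rough sketch of the two steps just described, in Python. This is
  an illustration under stated assumptions, not the patch itself: the
  function names are borrowed loosely from the postinst script, the
  by_id_dir parameter exists only to make the lookup testable, and
  falling back to the plain device name when no by-id link matches is
  an assumption chosen to be consistent with the /dev/xvda and /dev/vda
  results reported below. grub-probe needs the grub tools installed and
  usually root privileges.

```python
import os
import subprocess


def probe_boot_disk(path="/boot"):
    """Ask grub-probe which disk holds the filesystem containing `path`."""
    result = subprocess.run(
        ["grub-probe", "--target=disk", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def device_to_id(device, by_id_dir="/dev/disk/by-id"):
    """Return a by-id symlink that resolves to `device`, else `device` itself."""
    target = os.path.realpath(device)
    try:
        names = sorted(os.listdir(by_id_dir))
    except OSError:
        return device  # no by-id directory available on this system
    for name in names:
        link = os.path.join(by_id_dir, name)
        if os.path.realpath(link) == target:
            return link
    return device  # no stable identifier found; keep the kernel name
```

  Usage would be along the lines of device_to_id(probe_boot_disk()),
  with the result written to grub-pc/install_devices.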

  On Nitro, this returns:
  /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0179fff411dd211f0

  On Xen, this returns:
  /dev/xvda

  On a typical QEMU/KVM machine, this returns:
  /dev/vda

  On my personal desktop computer, this returns:
  /dev/disk/by-id/ata-WDC_WD5000AAKX-00PWEA0_WD-WMAYP3497618

  I have tested this on AWS, on Xen, on Nitro, on KVM, with BIOS- and
  EFI-based instances, in LXC, and on bare metal with a BIOS-based MAAS
  machine.

  All give the correct results in my testing.

  TESTING:

  You can fetch grub-pc/install_devices with:

  $ echo get grub-pc/install_devices | sudo debconf-communicate grub-pc

  Reset with:

  $ echo reset grub-pc/install_devices | sudo debconf-communicate grub-pc
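
  If you want to check the value from a test script rather than by
  hand, something like the following could work. It is a sketch only:
  the helper names are hypothetical, and it assumes debconf-communicate
  answers in the usual debconf protocol form of a numeric status code
  followed by text (for example "0 /dev/sda"). It must run as root.

```python
import subprocess


def parse_debconf_reply(reply):
    """Split a debconf reply such as '0 /dev/sda' into (status_code, text)."""
    code, _, text = reply.strip().partition(" ")
    return int(code), text


def get_install_devices():
    """Hypothetical helper: query grub-pc/install_devices via debconf."""
    result = subprocess.run(
        ["debconf-communicate", "grub-pc"],
        input="get grub-pc/install_devices\n",
        check=True, capture_output=True, text=True,
    )
    return parse_debconf_reply(result.stdout)
```

  A status code of 0 indicates success; the text is then the current
  value of the variable.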

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1877491/+subscriptions

