yahoo-eng-team team mailing list archive

[Bug 1840686] Re: Xenial images won't reboot if disk size is > 2TB when using GPT

 

** Also affects: grub2-signed (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: grub2-signed (Ubuntu)
       Status: New => In Progress

** Changed in: grub2-signed (Ubuntu)
   Importance: Undecided => High

** Changed in: grub2-signed (Ubuntu)
     Assignee: (unassigned) => Eric Desrochers (slashd)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1840686

Title:
  Xenial images won't reboot if disk size is > 2TB when using GPT

Status in cloud-init:
  Won't Fix
Status in grub package in Ubuntu:
  Fix Released
Status in grub2-signed package in Ubuntu:
  Fix Released
Status in grub source package in Xenial:
  In Progress

Bug description:
  [Impact]

  On Xenial images which use GPT instead of MBR to enable EFI-based
  booting, an instance with a disk size of 2049 GB or larger boots
  once and then hangs on the next boot (logs show it stopping at
  "Booting Hard Disk 0").

  This is a problem in grub2 where the system would become unbootable
  after ext* online resize if no resize_inode was created at ext* format
  time.
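
  As a rough way to check whether a given filesystem is in the state
  described above, the feature flags reported by dumpe2fs can be
  inspected. This is only a sketch: the helper name has_ext_feature is
  made up for illustration, and /dev/sda1 is an assumed root partition
  that may differ on your instance.

```shell
# Print "yes" or "no" depending on whether an ext* feature flag appears
# in the "Filesystem features:" line of `dumpe2fs -h` output on stdin.
# has_ext_feature is a hypothetical helper, not part of e2fsprogs.
has_ext_feature() {
    if grep -i '^Filesystem features:' | tr -s ' \t' '\n\n' | grep -qx -- "$1"; then
        echo yes
    else
        echo no
    fi
}

# Example usage (assumes the root filesystem lives on /dev/sda1):
#   sudo dumpe2fs -h /dev/sda1 2>/dev/null | has_ext_feature resize_inode
#   sudo dumpe2fs -h /dev/sda1 2>/dev/null | has_ext_feature meta_bg
```

  A filesystem that reports no resize_inode and has been grown online
  is the case this bug describes.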

  [Test Case]

  To reproduce:

  1) Create an instance with a 3072 GB boot disk from an image serial
  that uses GPT:

  gcloud compute instances create test-3072-xenial --image daily-ubuntu-1604-xenial-v20190731 --image-project ubuntu-os-cloud-devel --boot-disk-size 3072

  2) Reboot the instance

  The instance will hang on reboot and you cannot connect. If you go to
  GCP console and select Logs > Serial port 1 (console), you will see
  the boot process has stopped at "Booting Hard Disk 0".
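
  The same check can also be made from the command line rather than
  the console. The sketch below assumes the gcloud CLI is installed and
  authenticated. The function name boot_stalled and the zone in the
  usage line are placeholders, and the message text is taken from this
  report, so the exact wording in your log may differ.

```shell
# Return 0 if the last non-empty line of the instance's serial console
# output contains the message reported in this bug, i.e. the boot
# appears to have stopped there. boot_stalled is a hypothetical helper.
#   $1 = instance name, $2 = zone
boot_stalled() {
    gcloud compute instances get-serial-port-output "$1" --zone "$2" --port 1 2>/dev/null \
        | grep -v '^[[:space:]]*$' \
        | tail -n 1 \
        | grep -q 'Booting Hard Disk 0'
}

# Example usage (the zone is a placeholder, use your instance's zone):
#   boot_stalled test-3072-xenial us-central1-a && echo "boot appears hung"
```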

  I have built a test package, which is available here:

  https://launchpad.net/~mruffell/+archive/ubuntu/lp1840686-test

  To verify the fix, repeat step 1) but do not reboot; instead add the
  PPA and install the new grub like so:

  1) gcloud compute instances create test-3072-xenial --image daily-ubuntu-1604-xenial-v20190731 --image-project ubuntu-os-cloud-devel --boot-disk-size 3072
  2) sudo add-apt-repository ppa:mruffell/lp1840686-test
  3) sudo apt-get update
  4) sudo apt remove grub-common grub-efi-amd64 grub-efi-amd64-bin grub-efi-amd64-signed grub-pc-bin grub2-common
  5) sudo apt install grub-common grub-efi-amd64 grub-efi-amd64-bin grub-pc-bin grub2-common
  6) sudo grub-install /dev/sda
  7) sudo reboot

  The instance will boot successfully and you will be able to connect.

  Note that we must use "daily-ubuntu-1604-xenial-v20190731" as the
  image, as it is enabled for GPT and EFI. The GCP images were reverted
  to MBR and BIOS booting because of this bug, so the latest images
  will not reproduce the problem.
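
  Before testing, it may be worth confirming that the instance really
  did boot via EFI from a GPT disk. A minimal sketch, assuming a Linux
  guest where /sys/firmware/efi exists only on EFI boots; the helper
  name boot_mode is made up for illustration and /dev/sda is an assumed
  boot disk:

```shell
# Print "efi" if the EFI firmware directory exists, otherwise "bios".
# The optional argument overrides the path, which is handy for testing.
# boot_mode is a hypothetical helper name.
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo efi
    else
        echo bios
    fi
}

# Example usage on the instance:
#   boot_mode
# The partition table type can be read with parted (assumes /dev/sda):
#   sudo parted -s /dev/sda print | grep 'Partition Table'
```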

  [Regression Potential]

  Grub is a core package and every care must be taken in order to not
  introduce any regressions.

  The commit is present in B, D, E and F, and is considered well tested
  and widely adopted by the community.

  The commit comes with its own testcase, to test the ext4_metabg fix.

  The changes are localised to ext* based filesystems, although since
  they are the most popular family of filesystems used by the community,
  this does not reduce risk of breakage by much.

  A regression would have a large impact and, in the worst case, could
  lead to unbootable systems and data loss for users who are not
  technical enough to reinstall grub from a working package inside a
  chroot of the broken system.

  [Other Info]

  In comment #4, Sultan identifies the fix as:

  commit e20aa39ea4298011ba716087713cff26c6c52006
  Author: Vladimir Serbinenko <phcoder@xxxxxxxxx>
  Date:   Mon Feb 16 20:53:26 2015 +0100
  Subject: ext2: Support META_BG.

  This commit is from upstream grub2, and can be found here:

  https://git.savannah.gnu.org/cgit/grub.git/commit/?id=e20aa39ea4298011ba716087713cff26c6c52006

  Looking at when this was merged:

  $ git describe --contains e20aa39ea4298011ba716087713cff26c6c52006
  2.02-beta3~429

  This commit is present in B, D, E and F, leaving X as the only version
  needing an SRU.

  The commit cleanly cherry picks to X, because the delta from
  2.02~beta2-36ubuntu3.22 to 2.02-beta3~429 is small.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1840686/+subscriptions