sts-sponsors team mailing list archive
Message #02226
[Bug 1879980] Re: Fail to boot with LUKS on top of RAID1 if the array is broken/degraded
John, thanks a lot for your report! We definitely don't want to delay boots, although I'm happy to hear that it eventually boots. Can you send me logs so I can understand what's going on?
My suggestion is to follow the steps below (as root user):
(0) [optional] Force a log rotation, so that we only capture the relevant/latest data:
logrotate -f /etc/logrotate.conf ;
(1) Add "debug ignore_loglevel" to your kernel command-line (usually
done by editing /etc/default/grub or /etc/default/grub.d/[somefile]);
update grub after editing the conf file (through "update-grub" tool);
(2) Now this will seem a bit counter-intuitive: reboot the machine, and
all initramfs-tools verbose output will go to a file, *including* the
password requests for LUKS (I'm not sure why this happens, I feel it's
a bug but we can live with that for now, to collect your data). So your
system might seem hung - type the password and press ENTER as many
times as it's usually asked (when the system appears hung) - hopefully
you manage to boot your system, even if it takes a while.
(3) Collect 2 files and attach them here please: "/var/log/syslog" and
"/run/initramfs/initramfs.debug".
Hopefully with that I can understand exactly what's causing this weird behavior in your setup.
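For step (1), a minimal sketch of the edit could look like the function below. It assumes GNU sed and a single GRUB_CMDLINE_LINUX_DEFAULT="..." line in the file; the helper name add_kernel_params is made up for illustration and is not part of any package. If your parameters live in /etc/default/grub.d/[somefile], point it at that file instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of grub or initramfs-tools): append kernel
# parameters to GRUB_CMDLINE_LINUX_DEFAULT in the given file, skipping
# parameters that are already present. If the variable is not defined in
# the file, the file is left unchanged.
add_kernel_params() {
    conf="$1"; shift
    # Read the current value between the double quotes.
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$conf")
    for param in "$@"; do
        case " $current " in
            *" $param "*) ;;                              # already there
            *) current="${current:+$current }$param" ;;   # append it
        esac
    done
    # Rewrite the line in place (GNU sed).
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$current\"|" "$conf"
}

# Usage, as root (commented out on purpose):
#   add_kernel_params /etc/default/grub debug ignore_loglevel && update-grub
```

Remember to remove the two parameters (and run "update-grub" again) once the logs are collected.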
Cheers,
Guilherme
--
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1879980
Title:
Fail to boot with LUKS on top of RAID1 if the array is broken/degraded
Status in cryptsetup package in Ubuntu:
Fix Released
Status in initramfs-tools package in Ubuntu:
Fix Released
Status in mdadm package in Ubuntu:
Opinion
Status in cryptsetup source package in Xenial:
Won't Fix
Status in initramfs-tools source package in Xenial:
Won't Fix
Status in mdadm source package in Xenial:
Won't Fix
Status in cryptsetup source package in Bionic:
Fix Released
Status in initramfs-tools source package in Bionic:
Fix Released
Status in mdadm source package in Bionic:
Opinion
Status in cryptsetup source package in Focal:
Fix Released
Status in initramfs-tools source package in Focal:
Fix Released
Status in mdadm source package in Focal:
Opinion
Status in cryptsetup source package in Groovy:
Fix Released
Status in initramfs-tools source package in Groovy:
Fix Released
Status in mdadm source package in Groovy:
Opinion
Status in cryptsetup package in Debian:
New
Bug description:
[Impact]
* Considering a setup of an encrypted rootfs on top of an md RAID1 device, Ubuntu is currently unable to decrypt the rootfs if the array gets degraded, for example if one of the array's members gets removed.
* The problem has 2 main aspects: first, the cryptsetup initramfs script
attempts to decrypt the array only in the local-top boot stage, and in
case it fails, it gives up and shows the user a shell (boot is aborted).
* Second, the mdadm initramfs script that assembles degraded arrays
executes later on boot, in the local-block stage. So, in a stacked
setup of encrypted root on top of RAID, if the RAID is degraded,
cryptsetup fails early in the boot, preventing mdadm from assembling
the degraded array.
* The solution proposed here has 2 components: first, the cryptsetup
script is modified to allow a gentle failure in the local-top stage;
it then retries for a while (according to a heuristic based on
ROOTDELAY, with a minimum of 30 executions) in a later stage
(local-block). This gives other initramfs scripts time to run, like
mdadm in the local-block stage. This is how it is meant to work
according to the initramfs-tools documentation (although Ubuntu
changed it a bit with wait-for-root, hence we stopped looping on
local-block, see next bullet).
* Second, initramfs-tools was adjusted - currently, it runs the mdadm
local-block script for a while, in order to assemble the arrays in
a non-degraded mode. We extended this approach to also execute
cryptsetup, so that after mdadm ends its execution, we execute
cryptsetup at least one more time. In an ideal world we should loop
on local-block as Debian's initramfs does (which would remove the
hardcoded mdadm/cryptsetup mentions from initramfs-tools code), but
this would be a really big change, probably non-SRUable. I plan to
work on that for future Ubuntu releases.
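As a rough illustration of the retry idea described above (not the actual cryptsetup or initramfs-tools code), the sketch below retries a command a bounded number of times with a delay between attempts. The exact ROOTDELAY heuristic is not spelled out here, so the sketch simply assumes "ROOTDELAY, but never fewer than 30 tries"; the function names compute_max_tries and retry_unlock are hypothetical.

```shell
#!/bin/sh
# Assumption: the retry budget is ROOTDELAY with a floor of 30 tries.
# The real heuristic in the patch may differ.
compute_max_tries() {
    tries="${ROOTDELAY:-0}"
    if [ "$tries" -lt 30 ]; then
        tries=30
    fi
    echo "$tries"
}

# retry_unlock MAX DELAY CMD [ARGS...]
# Runs CMD until it succeeds or MAX attempts were made, sleeping DELAY
# seconds after each failed attempt. A failed attempt is not fatal
# ("gentle failure"), which leaves room for other scripts, such as
# mdadm's, to make the device available between attempts.
retry_unlock() {
    max="$1"; delay="$2"; shift 2
    n=0
    while [ "$n" -lt "$max" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (commented out; try_unlock_root is a placeholder for whatever
# performs the actual unlock attempt):
#   retry_unlock "$(compute_max_tries)" 1 try_unlock_root
```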
[Test case]
* Install Ubuntu in a Virtual Machine with 2 disks. Use the installer to create a RAID1 volume and an encrypted root on top of it.
* Boot the VM, and use "sgdisk"/"wipefs" to erase the partition table
from one of the RAID members. Reboot and it will fail to mount the
rootfs, so the boot process cannot continue.
* If using the initramfs-tools/cryptsetup patches hereby proposed, the
rootfs can be mounted normally.
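A possible way to script the "erase one member" step is sketched below, for a throwaway test VM only. The device name /dev/vdb and the helper name degrade_member are assumptions for illustration; by default the function only prints the commands, and it runs them only when RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical helper for a disposable test VM: prints (or, with RUN=1,
# executes) commands that wipe the signatures and partition table of
# one disk backing the RAID1 array. This destroys data on that disk.
degrade_member() {
    disk="$1"   # e.g. /dev/vdb; adjust to the VM's actual second disk
    for cmd in "wipefs --all $disk" "sgdisk --zap-all $disk"; do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd
        else
            echo "would run: $cmd"
        fi
    done
}

# Usage, as root inside the VM (commented out on purpose):
#   RUN=1 degrade_member /dev/vdb && reboot
```

Either tool alone may be enough, since the test case only requires that the partition table of one member be erased.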
[Regression potential]
* There is potential for regressions, since this is a change in 2
boot components. The patches were designed to keep the regular case
working; they change only the failure case, which is not currently
working anyway.
* A modification in the behavior of cryptsetup was introduced: now,
if we fail the password 3 times (the default maximum attempts),
the script doesn't "panic" and drop to a shell immediately; instead it
runs once more (or twice, if mdadm is installed) before failing. This
is a minor change given the benefit of being able to mount rootfs
in a degraded RAID1 scenario.
* Other potential regressions could show up as boot problems, but the
change in initramfs-tools specifically is not invasive; it just may
delay boot time a bit, given we now run cryptsetup multiple times on
local-block, with 1 sec delays between executions.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1879980/+subscriptions