
yahoo-eng-team team mailing list archive

[Bug 1988499] [NEW] Snap prevents repartitioning Azure resource disk

 

You have been subscribed to a public bug:

In an Azure VM, the resource disk (a.k.a. “local” or “temp” disk) has a
single partition created by the Azure infrastructure. Linux cloud-init
creates an ext4 file system in that partition and arranges for it to be
mounted on /mnt. In Ubuntu 20.04 and Ubuntu 22.04 images in the Azure
Marketplace, snap then creates a bind mount of /mnt for its internal
purposes.

Some customers want to use the Azure resource disk for purposes other
than a file system mounted on /mnt.  If they unmount the disk, and use a
partition editor to remove or change the partition structure, the
closing ioctl to re-read the partition table fails because the Linux
kernel still has a reference to the disk.  The command “blockdev
--rereadpt” also fails.
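
For anyone trying to reproduce the failure outside a partition editor,
the re-read request can be issued directly. The following is a minimal
sketch, not what fdisk or parted literally do: it assumes Linux, root
privileges, and that the request in question is the BLKRRPART ioctl
(the one behind “blockdev --rereadpt”). The function name is made up
for illustration.

```python
import fcntl
import os

# BLKRRPART is _IO(0x12, 95) in <linux/fs.h>: "re-read partition table".
BLKRRPART = 0x125F


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of `device`.

    Returns (True, None) on success, or (False, errno_value) on failure.
    """
    try:
        fd = os.open(device, os.O_RDONLY)
    except OSError as exc:
        return False, exc.errno
    try:
        fcntl.ioctl(fd, BLKRRPART)
        return True, None
    except OSError as exc:
        return False, exc.errno
    finally:
        os.close(fd)
```

Calling reread_partition_table("/dev/sdb") after the umount should, if
the description above is right, return False together with an errno
value that is expected to be EBUSY while something still references a
partition on the disk.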

After debugging this problem, it turns out that the umount of /mnt only
partially succeeds, and that’s why the ioctl thinks the disk is still in
use.  From what’s visible in the file system, the umount has succeeded.
And “lsblk” shows /dev/sdb1 (assuming the resource disk is /dev/sdb) as
not mounted anywhere.  But this message:

     [   51.885870] EXT4-fs (sdb1): unmounting filesystem.

is *not* output in dmesg, because internally the Linux kernel still
holds a reference to the mount and is waiting (forever) for that
reference to go away.
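
One way to look for a leftover reference like this is to check whether
the partition is still mounted in some other mount namespace, since
lsblk only reports what is visible from the caller's own namespace.
The sketch below parses /proc/<pid>/mountinfo for every process and
groups the results by mount namespace. It rests on two assumptions:
that the reference is held by a namespace some running process still
belongs to (a namespace kept alive without a process would not show up
this way), and that the mount source is recorded literally as the
device path, e.g. /dev/sdb1. Both function names are hypothetical.

```python
import glob
import os


def mount_points_for(mountinfo_text, device):
    """Return the mount points in a mountinfo dump whose source is `device`."""
    points = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # not a well-formed mountinfo line
        sep = fields.index("-")
        # After the "-" separator: filesystem type, mount source, super options.
        if len(fields) > sep + 2 and fields[sep + 2] == device:
            points.append(fields[4])  # fifth field is the mount point
    return points


def namespaces_holding(device):
    """Map mount-namespace id -> mount points of `device` seen in it."""
    seen = {}
    for path in glob.glob("/proc/[0-9]*/mountinfo"):
        pid_dir = os.path.dirname(path)
        try:
            ns = os.readlink(os.path.join(pid_dir, "ns", "mnt"))
            with open(path) as f:
                points = mount_points_for(f.read(), device)
        except OSError:
            continue  # process exited, or access was denied
        if points:
            seen.setdefault(ns, points)
    return seen
```

Run as root after the umount, namespaces_holding("/dev/sdb1") returning
a non-empty dict would point at a namespace that still has the
partition mounted; an empty dict does not prove that nothing holds a
reference.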

The problem is that snap has a reference to the mount, which was created
by “snap-confine” doing the bind mount. This behavior of snap is
specifically for the /mnt mount point (and maybe “/” for the root file
system?):

* If I bugger things up a bit so that cloud-init doesn’t force the
resource disk mount point to be /mnt, and change it to be /mnt2, then
Ubuntu boots normally, and mounts the resource disk on /mnt2.  At that
point, I can umount /mnt2, and the umount completes fully, including the
“unmounting filesystem” message in dmesg. The ioctl problem in fdisk or
parted goes away as well.

* If I remove “snap” entirely from my Ubuntu 20.04 installation, the
problem also goes away.

* The problem does not occur on RHEL 8.5 or CentOS 8.5, which don’t have
snap in the first place.
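
The dmesg check used to tell a complete umount from a partial one in
these experiments can be scripted. This is a small sketch that searches
kernel log text for the EXT4 message quoted earlier; the function name
is invented, and the pattern deliberately stops before the end of the
line in case other kernel versions word the tail of the message
differently.

```python
import re


def saw_unmount_message(kernel_log, partition):
    """True if the log has the EXT4 'unmounting filesystem' line for `partition`."""
    pattern = re.compile(
        r"EXT4-fs \(" + re.escape(partition) + r"\): unmounting filesystem"
    )
    return any(pattern.search(line) for line in kernel_log.splitlines())
```

The kernel_log argument is plain text, for example the captured output
of the dmesg command (which may require root), and partition is the
short name as it appears in the message, such as "sdb1".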

What’s the right way to solve this problem?  Unfortunately, I’m not
knowledgeable about snap or what snap-confine is trying to do.

* Why is snap tracking /mnt?  Is there a way to tell snap not to track
/mnt?

* Or is there some design flaw in snap that causes the mount on /mnt to
not work normally?

In the longer run, we’re looking at enhancing cloud-init with an option
to not mount the resource disk at all, which should avoid the problem.
But still, there should be a way for the mount of the resource disk on
/mnt to work normally.

** Affects: cloud-init
     Importance: Undecided
         Status: New


** Tags: jammy
-- 
Snap prevents repartitioning Azure resource disk
https://bugs.launchpad.net/bugs/1988499
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.