debcrafters-packages team mailing list archive

[Bug 2044104] Re: [UBUNTU 20.04] chzdev -e is rebuilding initramfs even with zdev:early=0 set

 

Hello bugproxy, or anyone else affected,

Accepted s390-tools into noble-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/s390-tools/2.31.0-0ubuntu5.2 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from
verification-needed-noble to verification-done-noble. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-noble. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: s390-tools (Ubuntu Noble)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-noble

-- 
You received this bug notification because you are a member of
Debcrafters packages, which is subscribed to s390-tools in Ubuntu.
https://bugs.launchpad.net/bugs/2044104

Title:
  [UBUNTU 20.04] chzdev -e is rebuilding initramfs even with
  zdev:early=0 set

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in s390-tools package in Ubuntu:
  Fix Released
Status in s390-tools-signed package in Ubuntu:
  Fix Released
Status in systemd package in Ubuntu:
  Fix Released
Status in s390-tools source package in Noble:
  Fix Committed
Status in s390-tools-signed source package in Noble:
  Fix Committed
Status in systemd source package in Noble:
  Triaged
Status in s390-tools source package in Oracular:
  Fix Released
Status in systemd source package in Oracular:
  Fix Released
Status in s390-tools source package in Plucky:
  Fix Released
Status in s390-tools-signed source package in Plucky:
  Fix Released
Status in systemd source package in Plucky:
  Fix Released
Status in s390-tools source package in Questing:
  Fix Released
Status in s390-tools-signed source package in Questing:
  Fix Released
Status in systemd source package in Questing:
  Fix Released

Bug description:
  SRU Justification:

  [ Impact ]

   * The CCW (or zdev) devices that are special for s390x always need
     to be explicitly enabled before they can be used.
     And this is usually done with the help of the 'chzdev -e' command
     (part of the s390-tools), that also creates underlying udev rules
     for the device activation.

   * When for example a qeth network device is persistently configured by
     'chzdev -e' the initramfs is usually rebuilt, since the corresponding
     udev rule might be needed in initramfs (early at boot time).

   * But chzdev also has the parameter 'zdev:early' that allows one to explicitly
     specify whether the initramfs should be rebuilt and the udev rule integrated
     (zdev:early=1) or not (zdev:early=0).

   * Right now the initramfs is erroneously rebuilt every time and
     includes all (zdev) udev rules, ignoring the zdev:early parameter.

   * This can have a significant impact especially on systems with hundreds
     (or even thousands) of devices and can lead to space constraints.
     (Note that larger ranges of devices can also easily be enabled with one command.)

   * For example supplemental devices (like disks) that are not relevant
     at early boot time (and for example may only be used in backup or take-over
     cases) do not always need to be activated at boot time.

   * On systems in DPM mode this can also be controlled (to a certain degree)
     by the HMC (but only in DPM mode).

   * To fix this situation, two things are needed:
     - s390-tools: have an option in chzdev that allows one to identify whether a udev rule is
       zdev related or not
       (since a udev rule could also be generic, and not specific to s390x).
     - systemd: handle zdev related rules in extra/initramfs-tools/hooks/udev
       properly according to the zdev:early parameter or use a proper default
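
  The two-part fix can be pictured as a small decision helper for an
  initramfs hook. This is a minimal sketch and not the actual systemd hook:
  the function name classify_rule and the OWNER_CHECK variable are
  hypothetical, and only the 'chzdev --is-owner' option and its exit code 0
  for zdev rules are taken from this report. Any non-zero exit code is
  treated as "not a zdev rule" here, which is a simplifying assumption.

```shell
#!/bin/sh
# Hypothetical helper: decide how an initramfs hook might treat one udev rules file.
# OWNER_CHECK is an assumption made for testability; it defaults to the
# 'chzdev --is-owner' option described in this report.
OWNER_CHECK="${OWNER_CHECK:-chzdev --is-owner}"

classify_rule() {
    rule="$1"
    # Make sure the rules file exists before calling the ownership check.
    if [ ! -e "$rule" ]; then
        echo "missing"
        return 0
    fi
    # Exit code 0 means the rule was created by chzdev (a zdev rule).
    if $OWNER_CHECK "$rule" >/dev/null 2>&1; then
        echo "zdev"      # leave the decision to the zdev:early handling
    else
        echo "generic"   # treat like any other udev rule
    fi
}
```

  A hook could then branch on the printed word, for example copying
  "generic" rules unconditionally and consulting zdev:early only for
  "zdev" rules.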

  [ Test Plan ]

   * Have an s390x LPAR or z/VM system with several ccw/zdev devices.
     Some might be in use (e.g. for the underlying disk of the main
     network device), but some spares to test with are needed.
     Easiest is to use qeth network devices.

   * List the available devices:
     $ lszdev | grep qeth | head -3
     qeth         0.0.c000:0.0.c001:0.0.c002                      yes  yes   encc000
     qeth         0.0.c003:0.0.c004:0.0.c005                      no   no
     qeth         0.0.c006:0.0.c007:0.0.c008                      no   no
     Notice that using only a short form of the device triples is sufficient.
     Here c000 is already active, but c003 and c006 are not.

   * Check which qeth devices are in the current initramfs:
     $ lsinitramfs /boot/initrd.img-$(uname -r) | grep usr/lib/udev/rules.d/41-qeth-0.0.c00
     usr/lib/udev/rules.d/41-qeth-0.0.c000.rules
     As expected, only c000 is listed.

   * Now add a second device and explicitly direct chzdev not to include it
     in the initramfs (using parameter 'zdev:early=0'):
     $ sudo chzdev -e qeth 0.0.c003 zdev:early=0
     and check what's in the initramfs:
     $ lsinitramfs /boot/initrd.img-$(uname -r) | grep usr/lib/udev/rules.d/41-qeth-0.0.c00
     usr/lib/udev/rules.d/41-qeth-0.0.c000.rules
     Still c000 only.

   * Now add another device, but this time explicitly directing chzdev
     to include the corresponding udev rule in the initramfs:
     $ sudo chzdev -e qeth 0.0.c006 zdev:early=1
     and check the content of the initramfs again:
     $ lsinitramfs /boot/initrd.img-$(uname -r) | grep usr/lib/udev/rules.d/41-qeth-0.0.c00
     usr/lib/udev/rules.d/41-qeth-0.0.c000.rules
     usr/lib/udev/rules.d/41-qeth-0.0.c006.rules
     Now both are included.

   * Mainly for regression testing, disable/remove the devices again
     (the 'zdev:early' parameter is irrelevant in this case):
     $ sudo chzdev -d qeth c003
     $ sudo chzdev -d qeth c006
     and check:
     $ lsinitramfs /boot/initrd.img-$(uname -r) | grep -i usr/lib/udev/rules.d/41-qeth-0.0.c00
     usr/lib/udev/rules.d/41-qeth-0.0.c000.rules

   * Add a device without the 'zdev:early' parameter specified at all,
     which needs to default to 'zdev:early=1':
     $ sudo chzdev -e qeth c003
     and check:
     $ lsinitramfs /boot/initrd.img-$(uname -r) | grep -i usr/lib/udev/rules.d/41-qeth-0.0.c00
     usr/lib/udev/rules.d/41-qeth-0.0.c000.rules
     usr/lib/udev/rules.d/41-qeth-0.0.c003.rules

   * The primary use case for the new chzdev option '--is-owner'
     is scripted udev rule handling. However, the option can also
     be tested directly; the result needs to be checked based on
     the return code:

     - this is a standard udev rule of a ccw qeth network device
       (zdev), and such rules are always created by chzdev:
       $ ls /etc/udev/rules.d/41-qeth-0.0.0600.rules
       /etc/udev/rules.d/41-qeth-0.0.0600.rules
     - hence 'chzdev --is-owner' succeeds and returns exit code '0'
       (EXIT_OK in exit_code.h):
       $ chzdev --is-owner /etc/udev/rules.d/41-qeth-0.0.0600.rules
       $ echo $?
       0
     - however, the udev rule for snapd was obviously not added by chzdev:
       $ ls /etc/udev/rules.d/70-snap.snapd.rules
       /etc/udev/rules.d/70-snap.snapd.rules
     - hence the return code here is the expected '33'
       (the newly introduced exit code 'EXIT_UNKNOWN_FILE' in exit_code.h)
       $ chzdev --is-owner /etc/udev/rules.d/70-snap.snapd.rules
       $ echo $?
       33

   * chzdev is especially used at install time, hence another test
     would be to start an installation and, at the initial subiquity
     screen, immediately navigate to the installer shell, update
     s390-tools to the new version, leave the installer shell
     and proceed with the installation.
     The installation will then run through the usual zDev activation
     screen (using the updated s390-tools), which makes use of chzdev.
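
  The repeated initramfs checks in this test plan can be scripted. The
  following is a minimal sketch, assuming the listing format shown in the
  steps above (usr/lib/udev/rules.d/41-qeth-<id>.rules); the function names
  qeth_ids_in_listing and listing_has_qeth are hypothetical and not part of
  s390-tools or initramfs-tools.

```shell
#!/bin/sh
# Hypothetical helper: read an initramfs file listing on stdin (for example
# the output of lsinitramfs) and print the device IDs of the qeth udev rules.
qeth_ids_in_listing() {
    sed -n 's|^usr/lib/udev/rules\.d/41-qeth-\(.*\)\.rules$|\1|p'
}

# Hypothetical helper: succeed if the given device ID (e.g. 0.0.c003) has a
# qeth udev rule in the listing read from stdin.
listing_has_qeth() {
    # -x matches whole lines, -F treats the dots in the ID literally.
    qeth_ids_in_listing | grep -qxF "$1"
}
```

  Usage could look like this:
  lsinitramfs /boot/initrd.img-$(uname -r) | listing_has_qeth 0.0.c003
  followed by a check of the exit status, which should be non-zero after the
  'zdev:early=0' step and zero after the 'zdev:early=1' step.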

  [ Where problems could occur ]

   * The modifications in s390-tools expand the chzdev command with
     the option '--is-owner <rule>' that allows one to identify a zdev rule.

   * Since it is only added (no existing code line was removed or modified) the impact
     is moderate, because it is obviously not in use yet by anyone using noble.

   * However, the code for this option got inserted into existing code, hence in case
     the new lines are not properly closed/terminated, problems can occur
     that can even have an impact on other chzdev arguments and parameters
     (e.g. the ones that are in the case statement before and after 'is-owner').

   * The exit code EXIT_UNKNOWN_FILE of 'is-owner' is 33; the defined
     number could be wrong, accidentally used multiple times, or a different
     exit code could be expected, which may lead to wrong states.

   * The upstream commit needed to be modified in one aspect (to backport it to 2.31).
     Between the version in noble (2.31) and version 2.33, where 'is-owner' was
     added, a refactoring happened (commit 4c2bfb1d47e7) that led (amongst other,
     not relevant changes) to the renaming of the file zdev/include/site.h to
     zdev/include/zdev.h.
     Fortunately the content of the file stayed the same, so that no additional
     commit needed to be applied; only the file name in the quilt patch was adjusted.

   * Some modifications are for the man page and usage.txt file only.

   * For systemd / extra/initramfs-tools/hooks/udev the modifications of this
     LP#2044104 are not sufficient, since after this was introduced into oracular
     two more cases occurred that needed to be handled on top:
     - ensure rules file exists before invoking chzdev (LP: #2079993)
     - udev rules are copied in case zdev_early is not specified (LP: #2102236)

   * To simplify the systemd modifications (and with that reduce risk) the
     version check of the initial modification that checks for the s390-tools
     version (2.33, to ensure that chzdev '--is-owner' is only used if the right
     s390-tools package is available) got removed, since the option is now backported
     to the previous version 2.31 (hence the check would fail).
     And because noble will not receive a new s390-tools version anymore, the check is obsolete.

   * All this affects the s390x architecture only.

   * (We may think of removing it from the current development release as well,
      which today comes with v2.38.0, since we will never go back to an older
      s390-tools version.)
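
  Since a misread exit code is one of the listed risks, a caller may want to
  handle the three possible outcomes of 'chzdev --is-owner' explicitly. The
  following is a minimal sketch: the two exit code values are the ones named
  in this report, while the function name describe_is_owner_rc and the
  printed words are hypothetical.

```shell
#!/bin/sh
# Exit codes named in this report (exit_code.h).
EXIT_OK=0
EXIT_UNKNOWN_FILE=33

# Hypothetical helper: translate a 'chzdev --is-owner' exit code into a word,
# keeping unexpected codes separate from the two documented ones.
describe_is_owner_rc() {
    case "$1" in
        "$EXIT_OK")           echo "zdev-rule" ;;
        "$EXIT_UNKNOWN_FILE") echo "not-zdev-rule" ;;
        *)                    echo "error" ;;
    esac
}
```

  A script could call 'chzdev --is-owner <rule>' and pass $? to this helper,
  so that any exit code other than 0 or 33 is not silently mistaken for
  "not a zdev rule".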

  [ Other Info ]

   * The systemd / udev changes will be piggybacked on a bigger systemd update
     (to avoid too many updates, since this affects s390x only but would trigger
      updates for other architectures too).

   * The s390-tools and systemd modifications can be done separately,
     as long as s390-tools lands in the archive before the systemd
     modifications, since systemd will be the first user of the
     s390-tools modification.
     Hence a grouped upload is not needed, if s390-tools is handled first.

   * A test build in PPA is available here:
     https://launchpad.net/~fheimes/+archive/ubuntu/lp2103414+lp2078347+lp2044104
     and the test packages were tested:
     https://pastebin.canonical.com/p/nfGDnHVYWd/
  __________

  Versions:
  Ubuntu 20.04.5 s390-tools version 2.12.0-0ubuntu3.7.s390x
  Ubuntu 22.04.2 s390-tools version 2.20.0-0ubuntu3.2.s390x

  When I configure a zfcp LUN persistently via chzdev, the initrd is
  being rebuilt even with parameter zdev:early=0

  root@a8315003:~# chzdev -e zfcp-lun 0.0.1803:0x500507630910d430:0x4019409200000000 zdev:early=0
  zFCP LUN 0.0.1803:0x500507630910d430:0x4019409200000000 configured
  Note: The initial RAM-disk must be updated for these changes to take effect:
         - zFCP LUN 0.0.1803:0x500507630910d430:0x4019409200000000
  update-initramfs: Generating /boot/initrd.img-5.15.0-60-generic
  I: The initramfs will attempt to resume from /dev/dasdb1
  I: (UUID=e70ecb80-4d1e-4074-9cda-ce231ad6e698)
  I: Set the RESUME variable to override this.
  Using config file '/etc/zipl.conf'
  Building bootmap in '/boot'
  Adding IPL section 'ubuntu' (default)
  Preparing boot device: dasda (c00a).
  Done.
  root@a8315003:~#

  == Comment: - Thorsten Diehl <thorsten.diehl@xxxxxxxxxx> - 2023-03-01 06:55:47 ==
  @BOE-dev
  This behaviour is unexpected.
  https://www.ibm.com/docs/en/linux-on-systems?topic=commands-chzdev says:
  Activating a device early during the boot process

  Use the zdev:early device attribute to activate a device early during
  the boot process and to override any existing auto-configuration with
  a persistent device configuration.

  zdev:early=1
      The device is activated during the initial RAM disc phase according to the persistent configuration.

  zdev:early=0
      The device is activated as usual during the boot process. This is the default. If auto-configuration data is present, the device is activated during the initial RAM disc phase according to the auto-configuration.

  I can't interpret a SCSI LUN as a device with auto configuration
  data. (At least, if the zfcp device hasn't NPIV enabled)

  == Comment: #5 - Peter Oberparleiter <Peter.Oberparleiter@xxxxxxxxxx> - 2023-03-01 11:18:28 ==
  (In reply to comment #2)
  > @BOE-dev
  > This behaviour is unexpected.
  > https://www.ibm.com/docs/en/linux-on-systems?topic=commands-chzdev says:
  > Activating a device early during the boot process
  >
  > Use the zdev:early device attribute to activate a device early during the
  > boot process and to override any existing auto-configuration with a
  > persistent device configuration.
  >
  > zdev:early=1
  >     The device is activated during the initial RAM disc phase according to
  > the persistent configuration.
  >
  > zdev:early=0
  >     The device is activated as usual during the boot process. This is the
  > default. If auto-configuration data is present, the device is activated
  > during the initial RAM disc phase according to the auto-configuration.

  The documentation is incorrect for Ubuntu. Canonical specifically
  builds zdev in a way that every change to persistent device
  configuration causes an update to the initial RAM-disk. See also:

  https://bugzilla.linux.ibm.com/show_bug.cgi?id=187578#c35
  https://github.com/ibm-s390-linux/s390-tools/commit/7dd03eaeecdd0e2674f145aca34be1275d291bd8

  > I can't interprete a SCSI LUN as a device with auto configuration data. (At
  > least, if the zfcp device hasn't NPIV enabled)

  This is related to auto-configuration as implemented for DPM.

  == Comment: #6 - Thorsten Diehl <thorsten.diehl@xxxxxxxxxx> - 2023-03-03 12:41:44 ==
  So, IIUC, chzdev is built for Ubuntu with ZDEV_ALWAYS_UPDATE_INITRD=1, which makes the parameter zdev:early=0 ineffective. Correct?
  If you confirm, you may also close this bug.

  Not nice - then we have to find an alternate solution.

  == Comment: #7 - Peter Oberparleiter <Peter.Oberparleiter@xxxxxxxxxx> - 2023-03-07 06:48:07 ==
  (In reply to comment #6)
  > So, IIUC, chzdev is built for Ubuntu with ZDEV_ALWAYS_UPDATE_INITRD=1, which
  > make the parameter zdev:early=0 ineffective. Correct?
  > If you confirm, you may also close this bug.
  >
  > Not nice - then we have to find an alternate solution.

  chzdev -p on Ubuntu will by default rebuild the initrd. This is intended
  behavior by Canonical and controlled by the ZDEV_ALWAYS_UPDATE_INITRD build-time
  switch. You can suppress it by adding option --no-root-update to the command
  line.

  Specifying zdev:early=0 to chzdev has exactly the effect that it is supposed to
  have: it tells zdev not to enable that device during initrd processing,
  resulting in the corresponding udev rule not being copied to the initrd [1].

  Unfortunately there is another Ubuntu-initrd script [2] that simply copies ALL
  udev rules, including those created by zdev, into the initrd. As a result,
  zdev's early-attribute handling is rendered useless and all devices are enabled,
  even if a user specified zdev:early=0.

  Since this bug report indicates that there is a use-case for this function in
  Ubuntu, it might be worth asking Canonical if current processing could be
  changed to provide a way for users to specify that a device should specifically
  NOT be enabled within initrd processing.

  Technically this could easily be done:

  1) Have the generic udev initramfs script not copy zdev-generated Udev rules,
     OR
     have the zdev initramfs script remove those rules (somewhat of a hack)

  2) Change the zdev initramfs script logic from the current:

     - enable devices required for the root file system, AND
     - enable devices for which zdev:early=1 was specified

     to

     - enable all persistently configured devices EXCEPT those for which
       zdev:early=0 was specified

     This change would be needed to maintain Canonical's policy of enabling
     all devices in the initrd by default

  I'm open to adding the change in 2) to our s390-tools package, but someone at
  Canonical would need to work out a way to implement 1).

  [1] https://github.com/ibm-s390-linux/s390-tools/blob/master/zdev/initramfs/hooks/s390-tools-zdev#L47
  [2] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/tree/debian/extra/initramfs-tools/hooks/udev#n42
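
  The change proposed in 2) of comment #7 can be modeled as two small
  predicates. This is a minimal sketch that follows the wording of the
  comment literally; the function names and the argument convention are
  hypothetical, and the real zdev initramfs script may handle root devices
  differently.

```shell
#!/bin/sh
# Hypothetical model of the two selection policies discussed in comment #7.
# $1: 1 if the device is required for the root file system, else 0
# $2: value of zdev:early ("0", "1", or empty when not specified)

# Current logic: root devices plus devices with zdev:early=1.
enable_current() {
    [ "$1" = "1" ] || [ "$2" = "1" ]
}

# Proposed logic, taken literally: every persistently configured device
# except those with zdev:early=0 (root device status is not consulted).
enable_proposed() {
    [ "$2" != "0" ]
}
```

  Under this model a non-root device without zdev:early is skipped by the
  current logic but enabled by the proposed one, which matches the stated
  policy of enabling all devices in the initrd by default.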

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2044104/+subscriptions