yahoo-eng-team team mailing list archive
Message #76546
[Bug 1810859] [NEW] ds-identify runs too early
Public bug reported:
ds-identify is executed from a systemd generator [1]. Based on my
understanding of the intent of both, this creates an unresolvable
timing conflict.
Generators run very early in the boot process.
The cloud-init generator runs ds-identify, which in turn runs "blkid" to
find filesystems with specific labels, such as "cidata" for the nocloud
data source. However, it is possible to construct an environment where
the filesystem with the "cidata" label is on an attached device and the
generator runs before that device is known to the kernel. In that case
the output of blkid cannot reflect the actual state: the "cidata" label
is not found and the "nocloud" data source is not identified. As a
result, the cloud-init.target unit will be disabled.
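To make the failure mode concrete, here is a minimal Python sketch of
the label check. It is not the ds-identify code (which is a shell
script); it only assumes blkid's export-style KEY=value output, and the
two sample snapshots and the helper names are hypothetical.

```python
def labels_from_blkid_export(text):
    """Collect LABEL values from `blkid -o export` style KEY=value lines."""
    labels = set()
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "LABEL":
            labels.add(value.strip())
    return labels


def nocloud_seed_visible(text):
    """Return True if a filesystem labelled "cidata" shows up in the output."""
    # Case-insensitive matching is a choice made for this sketch.
    return "cidata" in {label.lower() for label in labels_from_blkid_export(text)}


# Hypothetical snapshots: at generator time only the root disk is known;
# the attached seed device appears later in boot.
EARLY = "DEVNAME=/dev/vda1\nLABEL=rootfs\nTYPE=ext4\n"
LATE = EARLY + "\nDEVNAME=/dev/vdb\nLABEL=cidata\nTYPE=iso9660\n"

print(nocloud_seed_visible(EARLY))  # False
print(nocloud_seed_visible(LATE))   # True
```

The same check gives different answers depending only on when it runs,
which is the timing conflict described above.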
Observed in a test environment with qemu and the data source on a
separate virtual device.
According to [1] we shouldn't add any sync points such as "udevadm
settle", so I am not certain how this could be resolved. Also, given
that we cannot control when the generator is executed, it appears that
this is going to be difficult to get under control.
Would it make sense to give ds-identify the option to simply exit and
leave things alone?
In the present setup the generator runs ds-identify, which in turn
will disable cloud-init.target if no data source can be identified.
However, the Python code usually runs late enough that things that are
not available in early boot are found and data sources are identified
properly. If users who know they run in a specific environment could
set a "ds=no-check" flag on the kernel command line, then the timing
issue could be avoided.
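As a rough illustration of the proposal, the decision could look like
the Python sketch below. "ds=no-check" is only the value suggested in
this report, not an existing option, and the function names and return
strings are hypothetical. The real ds-identify is a shell script and
may parse the command line differently.

```python
def ds_from_cmdline(cmdline):
    """Return the value of the first ds=... token, or None if absent."""
    for token in cmdline.split():
        if token.startswith("ds="):
            return token[len("ds="):]
    return None


def decide(cmdline):
    """Map a kernel command line to one of three hypothetical outcomes."""
    ds = ds_from_cmdline(cmdline)
    if ds == "no-check":
        # Proposed behaviour: exit early and leave cloud-init.target alone.
        return "skip-detection"
    if ds:
        # The user named a data source explicitly, e.g. ds=nocloud.
        return "use:" + ds
    # No hint given: fall back to probing, which is subject to the race.
    return "detect"


print(decide("root=/dev/vda1 ro ds=no-check"))  # skip-detection
print(decide("root=/dev/vda1 ro ds=nocloud"))   # use:nocloud
print(decide("root=/dev/vda1 ro quiet"))        # detect
```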
I realize that for the nocloud case a user can set "ds=nocloud" on the
kernel command line to work around the timing issue described here.
Also, a "ds=no-check" would circumvent the basic intention of the
generator: to allow cloud-init to be installed anywhere and to quickly
detect an environment where the cloud-init Python code should not be
executed, and thus save boot time.
My point is that, IMHO, timing issues in general cannot be avoided by
ds-identify because of when systemd executes generators. Thus giving
users the general ability to disable ds-identify may be useful.
I am happy if I can be proven incorrect and the timing issue can be
resolved.
[1] https://www.freedesktop.org/wiki/Software/systemd/Generators/
** Affects: cloud-init
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1810859
Title:
ds-identify runs too early
Status in cloud-init:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1810859/+subscriptions