yahoo-eng-team team mailing list archive
Message #74889
[Bug 1794399] [NEW] cloud-init dhcp_discovery() crashes on preprovisioned RHEL 7.6 VM in Azure
Public bug reported:
Environment: Azure, creating a RHEL 7.6 VM from a pool of preprovisioned VMs.
In /usr/lib/python2.7/site-packages/cloudinit/net/dhcp.py,
dhcp_discovery() starts dhclient specifically so it will capture the
DHCP leases in dhcp.leases. The function copies the dhclient binary and
starts it with options naming unique lease and pid files. The function
then waits for both the lease and pid files to appear before using the
contents of the pid file to kill the dhclient instance.
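The wait-then-kill flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in dhcp.py: the helper names wait_for_files() and read_pid(), the timeout, and the polling interval are all assumptions made for the example.

```python
import os
import time


def wait_for_files(paths, timeout=5.0, poll=0.1):
    """Poll until every path exists; return the set still missing at timeout.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    missing = set(paths)
    while missing:
        missing = {p for p in missing if not os.path.exists(p)}
        if not missing or time.monotonic() >= deadline:
            break
        time.sleep(poll)
    return missing


def read_pid(pid_file):
    """Return the integer pid stored in pid_file (hypothetical helper)."""
    with open(pid_file) as f:
        return int(f.read().strip())
```

Note that a check of this kind only proves the files exist. It says nothing about whether the pid in the pid file still refers to a live process.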
There’s a behavior difference between the Ubuntu and RHEL versions of dhclient:
• On Ubuntu, dhclient writes the DHCP lease response, forks/daemonizes, then writes the pid file with the daemonized process ID.
• On RHEL, dhclient writes a pid file with the pre-daemon pid, writes the DHCP lease response, forks/daemonizes, then overwrites the pid file with the new (daemonized) pid.
On RHEL, there’s a race between dhcp_discovery() and dhclient:
1. dhclient writes the pid file (containing the pre-daemon pid) and the lease file
2. dhclient forks; the parent process exits
3. dhcp_discovery() sees that the pid file and lease file exist
4. dhcp_discovery() tries to kill the process named in the pid file, but that process already exited in step 2
5. dhclient child starts, daemonizes, and overwrites the pid file with its own pid
When cloud-init runs on a preprovisioned RHEL 7.6 VM in Azure, dhcp.py
dhcp_discovery() throws an error when it tries to send SIGKILL to a
process that does not exist.
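For reference, the failure mode can be reproduced in isolation: in Python, os.kill() raises OSError with errno ESRCH when the target pid no longer exists. The sketch below uses a hypothetical helper, kill_if_running(), that reports this case instead of raising. It only illustrates the error; it would not fix the race, because the daemonized dhclient child would be left running.

```python
import errno
import os
import signal


def kill_if_running(pid, sig=signal.SIGKILL):
    """Send sig to pid; return False instead of raising if pid is gone.

    Hypothetical helper for illustration only.
    """
    try:
        os.kill(pid, sig)
    except OSError as exc:
        if exc.errno == errno.ESRCH:  # "No such process"
            return False
        raise
    return True
```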
We have a patch that makes dhcp_discovery() wait until the pid in the
pid file represents a daemon process (parent pid is 1) before killing
the process. With this change, the issue is resolved.
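The idea behind the patch can be sketched as below. This is not the patch itself: the function names get_ppid() and wait_for_daemon_pid(), the retry count, and the polling interval are assumptions for illustration, and the parent pid lookup assumes a Linux /proc filesystem.

```python
import time


def get_ppid(pid):
    """Return the parent pid of pid from /proc/<pid>/stat, or None if gone."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            stat = f.read()
    except OSError:
        return None
    # The command name is wrapped in parentheses and may contain spaces,
    # so split after the last ')'. The fields that follow are: state, ppid, ...
    fields = stat.rsplit(")", 1)[1].split()
    return int(fields[1])


def wait_for_daemon_pid(pid_file, attempts=1000, poll=0.01):
    """Poll pid_file until it names a process whose parent pid is 1.

    Returns that pid, or None if no such pid shows up within the attempts.
    """
    for _ in range(attempts):
        try:
            with open(pid_file) as f:
                pid = int(f.read().strip())
        except (OSError, ValueError):
            pid = None  # file missing, empty, or only partially written
        if pid is not None and get_ppid(pid) == 1:
            return pid
        time.sleep(poll)
    return None
```

With a check like this, the caller would send SIGKILL only to the pid returned by wait_for_daemon_pid(), so the pre-daemon pid written in step 1 is never targeted.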
** Affects: cloud-init
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1794399
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1794399/+subscriptions