Message #68175
[Bug 1720160] [NEW] cloud-init wait for waagent on Azure CentOS 7.4 - no sshd start
Public bug reported:
Hello,
After updating a CentOS 7.3 VM on Azure to CentOS 7.4, you cannot connect via ssh because cloud-init tries to start waagent and the boot process hangs, so sshd is never started.
We installed a fresh CentOS 7.4 in the Azure cloud to provide a base image
template for our company, and the same problem occurs in that VM.
#######
# yum info cloud-init:
Name : cloud-init
Arch : x86_64
Version : 0.7.9
Release : 9.el7.centos.2
Size : 2.1 M
Repo : installed
From repo : base
In CentOS 7.3 the cloud-init version is 0.7.5-10.el7.centos.1
waagent is package version 2.2.14-1.el7 in both CentOS versions, and waagent updates itself internally to 2.2.17.
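To compare the installed package against the CentOS 7.3 version string quoted above, the Version and Release fields of `yum info` can be joined into the same `version-release` form. This is only a minimal sketch: the function name `yum_info_version_release` is made up for illustration, and it assumes the `Name : value` layout shown in the output above.

```shell
# Read `yum info <package>` output on stdin and print "<Version>-<Release>".
# Hypothetical helper, not part of yum or cloud-init.
# If several package entries are listed, the last one wins.
yum_info_version_release() {
  awk -F' *: *' '
    $1 == "Version" { version = $2 }
    $1 == "Release" { release = $2 }
    END { if (version != "" && release != "") print version "-" release }
  '
}
```

It could be used as `yum info cloud-init | yum_info_version_release`.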
#######
To debug the failure, I had to install rlogin before the update:
yum remove firewalld -y
yum install rsh-server -y
systemctl enable rsh.socket
systemctl enable rlogin.socket
systemctl enable rexec.socket
echo "root:123" | chpasswd
echo "+ root" > ~/.rlogin
cat << EOF >> /etc/securetty
rsh
rexec
rlogin
EOF
reboot
yum update -y
reboot
#######
To unblock the process, I connected via rlogin and killed the waagent start:
# ps -ef | grep "waagent\|cloud"
root 993 1 0 14:52 ? 00:00:02 /usr/bin/python /usr/bin/cloud-init init
root 1134 993 0 14:52 ? 00:00:00 /bin/systemctl start waagent.service
root 1337 1222 0 15:56 pts/2 00:00:00 grep --color=auto waagent\|cloud
# kill 1134
Then cloud-init does its magic, and on the next reboot sshd starts without any
trouble.
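The manual lookup of the blocking PID above could be scripted roughly as follows. This is a sketch, not a tested workaround: the function name `find_waagent_start_pids` is invented here, and it assumes the standard `ps -ef` column layout (UID PID PPID C STIME TTY TIME CMD) seen in the listing above.

```shell
# Read `ps -ef` output on stdin and print the PID of every
# "systemctl start waagent.service" process.
# Hypothetical helper; column 2 is the PID, column 8 onward is the command line.
find_waagent_start_pids() {
  awk '$8 ~ /(^|\/)systemctl$/ && $9 == "start" && $10 == "waagent.service" { print $2 }'
}
```

On the hung VM it would be called as `ps -ef | find_waagent_start_pids`, and the printed PID passed to `kill` as in the session above. Matching on the exact command words avoids also catching the `grep` process itself.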
#######
To make the VM fail again, you can clear the config and reboot:
yum remove cloud-init WALinuxAgent -y
rm -f /etc/waagent.con*
rm -fr /etc/cloud/
rm -fr /var/lib/cloud/
rm -fr /var/lib/waagent/
rm -fr /var/log/waagent.lo*
rm -fr /var/log/cloud-init*
yum install cloud-init WALinuxAgent -y
cp -a /etc/waagent.conf /etc/waagent.conf.rpmsave
sed -i -e "s/Provisioning.Enabled.*/Provisioning.Enabled=n/g" /etc/waagent.conf
sed -i -e "s/Provisioning.UseCloudInit.*/Provisioning.UseCloudInit=y/g" /etc/waagent.conf
sed -i -e "s/Logs.Verbose.*/Logs.Verbose=y/g" /etc/waagent.conf
cp -a /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.rpmsave
cat << EOF >> /etc/cloud/cloud.cfg
# From cloud-init docs
datasource:
  Azure:
    agent_command: [service, waagent, start]
debug:
  verbose: True
EOF
diff /etc/waagent.conf.rpmsave /etc/waagent.conf
diff /etc/cloud/cloud.cfg.rpmsave /etc/cloud/cloud.cfg
reboot
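The three `sed` edits to waagent.conf above can be wrapped so they can be tried on a copy of the file first. This is a minimal sketch under stated assumptions: the helper names `set_conf_option` and `configure_waagent_for_cloud_init` are invented for illustration, and unlike the unanchored patterns above it only rewrites lines that start with `Key=`, so commented-out settings are left alone.

```shell
# Set "key=value" in a waagent.conf-style file (hypothetical helper).
# $1 = file, $2 = key, $3 = value. Only lines beginning with "key=" change.
set_conf_option() {
  file=$1 key=$2 value=$3
  tmp=$(mktemp) || return 1
  # Write to a temp file, then copy back so the file keeps its permissions.
  sed -e "s/^${key}=.*/${key}=${value}/" "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Apply the same three settings as in the reproduction steps.
configure_waagent_for_cloud_init() {
  set_conf_option "$1" Provisioning.Enabled n
  set_conf_option "$1" Provisioning.UseCloudInit y
  set_conf_option "$1" Logs.Verbose y
}
```

For example, `cp /etc/waagent.conf /tmp/waagent.conf && configure_waagent_for_cloud_init /tmp/waagent.conf && diff /etc/waagent.conf /tmp/waagent.conf` shows the resulting changes without touching the live file.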
#######
I don't know why the system hangs.
Can you please review this?
** Affects: cloud-init
Importance: Undecided
Status: New
** Attachment added: "cloud-init_hangs.tar"
https://bugs.launchpad.net/bugs/1720160/+attachment/4958120/+files/cloud-init_hangs.tar
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1720160