yahoo-eng-team team mailing list archive
Message #90173
[Bug 1995609] [NEW] ssh host keys deleted by cloud-init between sshd-keygen and sshd start
Public bug reported:
This happened on CentOS Stream 8.
I created an AWS instance from a snapshot of another instance.
After boot I was unable to log in via SSH because sshd failed to start.
Looking through the logs, I found that cloud-init deleted the
/etc/ssh/ssh_host_* files between `sshd-keygen.target` being reached and
OpenSSH starting.
I recovered the instance by other means, then dug through the logs.
Here are the log extracts:
messages:
Nov 3 08:30:38 ip-172-21-3-249 systemd[1]: Reached target sshd-keygen.target.
cloud-init.log:
2022-11-03 08:31:02,307 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ed25519_key
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ed25519_key.pub
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ecdsa_key
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ecdsa_key.pub
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_rsa_key
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_rsa_key.pub
messages:
Nov 3 08:31:02 ip-172-21-3-249 systemd[1]: Starting OpenSSH server daemon...
Nov 3 08:31:03 ip-172-21-3-249 sshd[1337]: Unable to load host key: /etc/ssh/ssh_host_rsa_key
Nov 3 08:31:03 ip-172-21-3-249 sshd[1337]: Unable to load host key: /etc/ssh/ssh_host_ecdsa_key
Nov 3 08:31:03 ip-172-21-3-249 sshd[1337]: Unable to load host key: /etc/ssh/ssh_host_ed25519_key
Nov 3 08:31:03 ip-172-21-3-249 sshd[1337]: sshd: no hostkeys available -- exiting.
Nov 3 08:31:03 ip-172-21-3-249 systemd[1]: sshd.service: Main process exited, code=exited, status=1/FAILURE
Nov 3 08:31:03 ip-172-21-3-249 systemd[1]: sshd.service: Failed with result 'exit-code'.
Nov 3 08:31:03 ip-172-21-3-249 systemd[1]: Failed to start OpenSSH server daemon.
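To make the ordering in the extracts above easier to check, here is a minimal Python sketch that merges lines from both logs into one timeline. The helper names are hypothetical, and two things are assumed: the syslog lines (which carry no year) are from 2022, and both logs use the same time zone.

```python
from datetime import datetime

# Assumption: syslog lines carry no year, so 2022 is taken from cloud-init.log.
SYSLOG_YEAR = 2022

def parse_syslog(line, year=SYSLOG_YEAR):
    """Parse a 'Nov 3 08:30:38 host proc: msg' line into (datetime, message)."""
    month, day, clock, _host, message = line.split(None, 4)
    stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return stamp, message

def parse_cloud_init(line):
    """Parse a '2022-11-03 08:31:02,307 - util.py[DEBUG]: msg' line."""
    stamp_text, message = line.split(" - ", 1)
    stamp = datetime.strptime(stamp_text, "%Y-%m-%d %H:%M:%S,%f")
    return stamp, message

# Three lines copied from the extracts, deliberately listed out of order.
events = sorted([
    parse_syslog("Nov 3 08:31:03 ip-172-21-3-249 sshd[1337]: sshd: no hostkeys available -- exiting."),
    parse_cloud_init("2022-11-03 08:31:02,307 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ed25519_key"),
    parse_syslog("Nov 3 08:30:38 ip-172-21-3-249 systemd[1]: Reached target sshd-keygen.target."),
])

for stamp, message in events:
    print(stamp.isoformat(), message)
```

Sorted this way, the key removal falls after sshd-keygen.target is reached and before sshd reports that no host keys are available. The "Starting OpenSSH server daemon" line is left out because syslog only has one-second resolution, so its position relative to the removal at 08:31:02,307 cannot be decided from these timestamps alone.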
The cloud-init unit file has the right dependencies:
[root@ip-172-21-3-249 log]# more /usr/lib/systemd/system/cloud-init.service
[Unit]
Description=Initial cloud-init job (metadata service crawler)
DefaultDependencies=no
Wants=cloud-init-local.service
Wants=sshd-keygen.service
Wants=sshd.service
After=cloud-init-local.service
After=systemd-networkd-wait-online.service
After=network.service
After=NetworkManager.service
Before=network-online.target
Before=sshd-keygen.service
Before=sshd.service
Before=systemd-user-sessions.service
[Service]
Type=oneshot
ExecStart=/usr/bin/cloud-init init
RemainAfterExit=yes
TimeoutSec=0
# Output needs to appear in instance console output
StandardOutput=journal+console
[Install]
WantedBy=cloud-init.target
But I wonder if they still work for systemd template units:
[root@ip-172-21-3-249 log]# systemctl status sshd-keygen.service
Unit sshd-keygen.service could not be found.
[root@ip-172-21-3-249 log]# systemctl status sshd-keygen@.service
Failed to get properties: Unit name sshd-keygen@.service is neither a valid invocation ID nor unit name.
[root@ip-172-21-3-249 log]# systemctl status sshd-keygen@
sshd-keygen@ecdsa.service sshd-keygen@ed25519.service sshd-keygen@rsa.service ««« there are 3 services, one for each key type.
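The name mismatch behind that question can be sketched in Python. This only compares unit names as strings; it is not how systemd resolves dependencies, and the function names and the `loaded` list are assumptions built from the output above.

```python
def template_prefix(unit_name):
    """Return 'sshd-keygen' for 'sshd-keygen@rsa.service', None for a plain unit."""
    stem = unit_name.rsplit(".", 1)[0]
    return stem.split("@", 1)[0] if "@" in stem else None

def classify(ordering_names, loaded_units):
    """Label each name from Before=/Wants= by whether a unit of that exact name exists."""
    loaded = set(loaded_units)
    instance_prefixes = {template_prefix(u) for u in loaded} - {None}
    report = {}
    for name in ordering_names:
        stem = name.rsplit(".", 1)[0]
        if name in loaded:
            report[name] = "exact match"
        elif stem in instance_prefixes:
            report[name] = "no such unit; only template instances exist"
        else:
            report[name] = "no such unit"
    return report

# Names referenced by cloud-init.service, and units seen on this host.
before = ["sshd-keygen.service", "sshd.service"]
loaded = [
    "sshd-keygen@ecdsa.service",
    "sshd-keygen@ed25519.service",
    "sshd-keygen@rsa.service",
    "sshd.service",
]

report = classify(before, loaded)
for name, verdict in report.items():
    print(f"{name}: {verdict}")
```

If the reading in this report is right, the `Before=sshd-keygen.service` line names a unit that does not exist on this host, while the units that actually generate keys are the three `sshd-keygen@` instances.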
I can see that key generation is skipped here because a drop-in disables it when cloud-init is active:
[root@ip-172-21-3-249 log]# systemctl status sshd-keygen@ed25519.service
● sshd-keygen@ed25519.service - OpenSSH ed25519 Server Key Generation
Loaded: loaded (/usr/lib/systemd/system/sshd-keygen@.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/sshd-keygen@.service.d
└─disable-sshd-keygen-if-cloud-init-active.conf
Active: inactive (dead)
Condition: start condition failed at Thu 2022-11-03 10:18:28 UTC; 3h 4min ago
└─ ConditionPathExists=!/run/systemd/generator.early/multi-user.target.wants/cloud-init.target was not met
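The "was not met" result above follows from the leading `!`, which negates the check: the unit only starts when the path is absent. A small Python sketch of that rule, with a hypothetical function name; it ignores the `|` trigger prefix that systemd also accepts.

```python
import os
import tempfile

def condition_path_exists(value):
    """Evaluate a ConditionPathExists= value, honouring a leading '!' as negation."""
    negate = value.startswith("!")
    path = value[1:] if negate else value
    exists = os.path.exists(path)
    return (not exists) if negate else exists

# Stand-in for the cloud-init.target symlink created by the generator.
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "cloud-init.target")

    absent_result = condition_path_exists("!" + marker)   # path missing
    open(marker, "w").close()
    present_result = condition_path_exists("!" + marker)  # path present

print("marker absent  ->", absent_result)
print("marker present ->", present_result)
```

With the marker present the negated condition fails, which matches the status output: the path under /run/systemd/generator.early exists, so sshd-keygen@ed25519.service is skipped and key generation is left to cloud-init.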
How can we ensure this does not happen in the future?
** Affects: cloud-init
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1995609
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1995609/+subscriptions