yahoo-eng-team team mailing list archive

[Bug 1995609] [NEW] ssh host keys deleted by cloud-init between sshd-keygen and sshd start

 

Public bug reported:

This happened on CentOS Stream 8.

I created an AWS instance from a snapshot of another instance.
On boot I was unable to log in via SSH because sshd failed to start.

Investigating the logs, I found that cloud-init deleted the
/etc/ssh/ssh_host_* files between reaching `sshd-keygen.target` and the
start of OpenSSH.

I recovered the instance another way, but I dug through the logs.
Here are the log extracts:

messages:
Nov  3 08:30:38 ip-172-21-3-249 systemd[1]: Reached target sshd-keygen.target.

cloud-init.log:
2022-11-03 08:31:02,307 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ed25519_key
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ed25519_key.pub
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ecdsa_key
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_ecdsa_key.pub
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_rsa_key
2022-11-03 08:31:02,308 - util.py[DEBUG]: Attempting to remove /etc/ssh/ssh_host_rsa_key.pub

messages:
Nov  3 08:31:02 ip-172-21-3-249 systemd[1]: Starting OpenSSH server daemon...
Nov  3 08:31:03 ip-172-21-3-249 sshd[1337]: Unable to load host key: /etc/ssh/ssh_host_rsa_key
Nov  3 08:31:03 ip-172-21-3-249 sshd[1337]: Unable to load host key: /etc/ssh/ssh_host_ecdsa_key
Nov  3 08:31:03 ip-172-21-3-249 sshd[1337]: Unable to load host key: /etc/ssh/ssh_host_ed25519_key
Nov  3 08:31:03 ip-172-21-3-249 sshd[1337]: sshd: no hostkeys available -- exiting.
Nov  3 08:31:03 ip-172-21-3-249 systemd[1]: sshd.service: Main process exited, code=exited, status=1/FAILURE
Nov  3 08:31:03 ip-172-21-3-249 systemd[1]: sshd.service: Failed with result 'exit-code'.
Nov  3 08:31:03 ip-172-21-3-249 systemd[1]: Failed to start OpenSSH server daemon.
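
As a rough way to confirm this state on an affected machine, the sketch below lists which of the three host key types named in the sshd errors above have no private key file. The function name `missing_host_keys` is made up for illustration, and the directory defaults to /etc/ssh as in the log paths.

```shell
#!/bin/sh
# Print the key types whose private host key file is missing or empty.
# The three types are the ones named in the sshd errors above.
missing_host_keys() {
    dir=${1:-/etc/ssh}
    for type in rsa ecdsa ed25519; do
        [ -s "$dir/ssh_host_${type}_key" ] || echo "$type"
    done
}

# With no argument this inspects /etc/ssh.
missing_host_keys "$@"
```

If any types are listed, `ssh-keygen -A` (run as root) generates the default host keys that do not yet exist, after which `systemctl restart sshd` should bring the daemon up again.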

The cloud-init unit file has the right dependencies:

[root@ip-172-21-3-249 log]# more /usr/lib/systemd/system/cloud-init.service
[Unit]
Description=Initial cloud-init job (metadata service crawler)
DefaultDependencies=no
Wants=cloud-init-local.service
Wants=sshd-keygen.service
Wants=sshd.service
After=cloud-init-local.service
After=systemd-networkd-wait-online.service
After=network.service
After=NetworkManager.service
Before=network-online.target
Before=sshd-keygen.service
Before=sshd.service
Before=systemd-user-sessions.service

[Service]
Type=oneshot
ExecStart=/usr/bin/cloud-init init
RemainAfterExit=yes
TimeoutSec=0

# Output needs to appear in instance console output
StandardOutput=journal+console

[Install]
WantedBy=cloud-init.target
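
One thing worth checking is whether every unit named in a Before= line actually exists, since ordering against a unit that is never loaded has no effect. The following is a minimal sketch, assuming the unit file path shown above. The helper names `before_units` and `check_before_targets` are made up, and `systemctl cat` is used only as an existence probe.

```shell
#!/bin/sh
# List the unit names that appear in Before= lines of a unit file,
# one per line.
before_units() {
    sed -n 's/^Before=//p' "$1" | tr ' ' '\n' | sed '/^$/d'
}

# Report every Before= target that systemd cannot find a unit file for.
# If systemctl itself is unavailable, every unit is reported.
check_before_targets() {
    for unit in $(before_units "$1"); do
        if ! systemctl cat "$unit" >/dev/null 2>&1; then
            echo "not found: $unit"
        fi
    done
}

# Default path is the one from the session above.
unit_file=${1:-/usr/lib/systemd/system/cloud-init.service}
if [ -r "$unit_file" ]; then
    check_before_targets "$unit_file"
fi
```

On a system where only the template sshd-keygen@.service is installed, this would be expected to report sshd-keygen.service as not found.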


But I wonder whether they still work for systemd template units:

[root@ip-172-21-3-249 log]# systemctl status sshd-keygen.service
Unit sshd-keygen.service could not be found.
[root@ip-172-21-3-249 log]# systemctl status sshd-keygen@.service
Failed to get properties: Unit name sshd-keygen@.service is neither a valid invocation ID nor unit name.
[root@ip-172-21-3-249 log]# systemctl status sshd-keygen@
sshd-keygen@ecdsa.service    sshd-keygen@ed25519.service  sshd-keygen@rsa.service ««« there are 3 services, one for each key type.


I can see that key generation is skipped here because cloud-init is enabled:

[root@ip-172-21-3-249 log]# systemctl status sshd-keygen@ed25519.service 
● sshd-keygen@ed25519.service - OpenSSH ed25519 Server Key Generation
   Loaded: loaded (/usr/lib/systemd/system/sshd-keygen@.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/sshd-keygen@.service.d
           └─disable-sshd-keygen-if-cloud-init-active.conf
   Active: inactive (dead)
Condition: start condition failed at Thu 2022-11-03 10:18:28 UTC; 3h 4min ago
           └─ ConditionPathExists=!/run/systemd/generator.early/multi-user.target.wants/cloud-init.target was not met
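
The leading `!` in the condition negates the test, so the unit runs only when the path is absent. A small sketch of the same check in shell, assuming the path printed in the status output above; `keygen_condition_met` is a hypothetical helper name.

```shell
#!/bin/sh
# Path taken from the ConditionPathExists= line in the status output.
marker=/run/systemd/generator.early/multi-user.target.wants/cloud-init.target

# ConditionPathExists=!PATH is met only when PATH does not exist.
keygen_condition_met() {
    [ ! -e "${1:-$marker}" ]
}

if keygen_condition_met; then
    echo "condition met: sshd-keygen@ units may run"
else
    echo "condition not met: sshd-keygen@ units are skipped"
fi
```

On the affected instance this would print the second message, matching the status output: the path exists, so the drop-in skips sshd-keygen@ and presumably leaves host key generation to cloud-init.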

How can we ensure this does not happen in the future?

** Affects: cloud-init
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1995609

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1995609/+subscriptions