← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1998526] [NEW] cloud-init not restarting ssh service after writing sshd_config

 

Public bug reported:

cloud-init 22.2 introduced a race condition with ssh.service since it
added a systemctl call to check if service is stopped/not-started. If
ssh.service starts at around the same time of the check, cloud-init
believes that the service is stopped and does not need to be restarts,
proceeds to write the sshd_config. The  writing of sshd_config might
happen after the ssh.service has started, especially if there's
significant delay in the systemctl status call (the delay would have
given ssh.service plenty of time to startup)

I believed this is the commit that introduced the issue:
https://github.com/canonical/cloud-init/commit/5054ffe188

I've attached the cloud-init.log and the auth.log showing the time ssh
service started.

>From cloud-init.log - the call to check ssh service status happened at
22:44:43,630, when cloud-init wrote the file sshd_config at 22:44:51,
ssh service already started. There's a strange 8s delay from systemctl
that  might have to do with systemd or the condition of the VM.
Regardless, the race is definitely there.

2022-11-22 22:44:43,630 - subp.py[DEBUG]: Running command ['systemctl', 'status', 'ssh'] with allowed return codes [0] (shell=False, capture=True)
2022-11-22 22:44:51,116 - cc_set_passwords.py[DEBUG]: Writing config 'ssh_pwauth: True'. SSH service 'ssh' will not be restarted because it is stopped.
2022-11-22 22:44:51,116 - util.py[DEBUG]: Reading from /etc/ssh/sshd_config (quiet=False)
2022-11-22 22:44:51,116 - util.py[DEBUG]: Read 3546 bytes from /etc/ssh/sshd_config
2022-11-22 22:44:51,116 - ssh_util.py[DEBUG]: line 55: option PasswordAuthentication updated no -> yes
2022-11-22 22:44:51,117 - util.py[DEBUG]: Writing to /etc/ssh/sshd_config - wb: [600] 3547 bytes

Here's what the customer gave us from their VM:

ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2022-11-22 22:44:43 UTC; 6 days ago 

sshd_config file changed after only few seconds.

-rw------- 1 root root 3547 2022-11-22 22:44:51.113697898 +0000
/etc/ssh/sshd_config

** Affects: cloud-init
     Importance: Undecided
         Status: New

** Attachment added: "ssh_race_logs.zip"
   https://bugs.launchpad.net/bugs/1998526/+attachment/5633811/+files/ssh_race_logs.zip

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1998526

Title:
  cloud-init not restarting ssh service after writing sshd_config

Status in cloud-init:
  New

Bug description:
  cloud-init 22.2 introduced a race condition with ssh.service since it
  added a systemctl call to check if service is stopped/not-started. If
  ssh.service starts at around the same time of the check, cloud-init
  believes that the service is stopped and does not need to be restarts,
  proceeds to write the sshd_config. The  writing of sshd_config might
  happen after the ssh.service has started, especially if there's
  significant delay in the systemctl status call (the delay would have
  given ssh.service plenty of time to startup)

  I believed this is the commit that introduced the issue:
  https://github.com/canonical/cloud-init/commit/5054ffe188

  I've attached the cloud-init.log and the auth.log showing the time ssh
  service started.

  From cloud-init.log - the call to check ssh service status happened at
  22:44:43,630, when cloud-init wrote the file sshd_config at 22:44:51,
  ssh service already started. There's a strange 8s delay from systemctl
  that  might have to do with systemd or the condition of the VM.
  Regardless, the race is definitely there.

  2022-11-22 22:44:43,630 - subp.py[DEBUG]: Running command ['systemctl', 'status', 'ssh'] with allowed return codes [0] (shell=False, capture=True)
  2022-11-22 22:44:51,116 - cc_set_passwords.py[DEBUG]: Writing config 'ssh_pwauth: True'. SSH service 'ssh' will not be restarted because it is stopped.
  2022-11-22 22:44:51,116 - util.py[DEBUG]: Reading from /etc/ssh/sshd_config (quiet=False)
  2022-11-22 22:44:51,116 - util.py[DEBUG]: Read 3546 bytes from /etc/ssh/sshd_config
  2022-11-22 22:44:51,116 - ssh_util.py[DEBUG]: line 55: option PasswordAuthentication updated no -> yes
  2022-11-22 22:44:51,117 - util.py[DEBUG]: Writing to /etc/ssh/sshd_config - wb: [600] 3547 bytes

  Here's what the customer gave us from their VM:

  ssh.service - OpenBSD Secure Shell server
     Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-11-22 22:44:43 UTC; 6 days ago 

  sshd_config file changed after only few seconds.

  -rw------- 1 root root 3547 2022-11-22 22:44:51.113697898 +0000
  /etc/ssh/sshd_config

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1998526/+subscriptions



Follow ups