yahoo-eng-team team mailing list archive

[Bug 1553815] [NEW] host keys never restored following metadata api outage

 

Public bug reported:

We are running an Openstack cloud and have noticed some unexpected
behaviour in our Ubuntu Trusty cloud instances created by Nova.

We have observed that if a previously initialised instance (i.e. one
where DataSourceOpenstack has already been run) is rebooted while the
metadata api is not available (i.e. 169.254.169.254 is unreachable),
cloud-init will retry a few times, then switch to DataSourceNone and
regenerate host keys.

    # Boot instance under normal conditions    
    ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
    /var/lib/cloud/instances/cd535bc4-9c2f-4d31-8903-0ede59c7ef95
    ubuntu@vm1:~$ grep "Generating public/private rsa key pair." /var/log/cloud-init-output.log 
    Generating public/private rsa key pair.

    # Stop neutron metadata api service and reboot instance (observing that host keys were regenerated)
    ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
    /var/lib/cloud/instances/iid-datasource-none
    ubuntu@vm1:~$ grep "Generating public/private rsa key pair." /var/log/cloud-init-output.log 
    Generating public/private rsa key pair.
    Generating public/private rsa key pair.

So far so good, since we expect this behaviour. But now we reboot the
instance with the metadata api once again reachable. Cloud-init rightly
selects the original DataSourceOpenstack instance, but it does nothing
since it has already run once (and it is set to run only once). The
problem is that the original host keys are never restored, so any client
connecting to that instance has no option but to accept the new host
keys, along with a MITM attack warning.

    ubuntu@vm1:~$ sudo reboot
    ...
    ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
    /var/lib/cloud/instances/cd535bc4-9c2f-4d31-8903-0ede59c7ef95

Surely we could find a way for cloud-init to know that if the current
DataSourceOpenstack uuid matches its previously run uuid, then it can
check that the host keys are consistent with the original run. @smoser
suggested in a side discussion that dmidecode info could perhaps be used,
since the Openstack instance uuid can be found there:

    ubuntu@vm1:~$ sudo dmidecode -t system
    # dmidecode 2.12
    SMBIOS 2.8 present.

    Handle 0x0100, DMI type 1, 27 bytes
    System Information
	    Manufacturer: OpenStack Foundation
	    Product Name: OpenStack Nova
	    Version: 13.0.0
	    Serial Number: ba5f7371-fd4c-a25e-132f-3dd1e5b92e93
	    UUID: CD535BC4-9C2F-4D31-8903-0EDE59C7EF95
	    Wake-up Type: Power Switch
	    SKU Number: Not Specified
	    Family: Virtual Machine

    Handle 0x2000, DMI type 32, 11 bytes
    System Boot Information
	    Status: No errors detected
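
The UUID reported by dmidecode above is the same value as the instance
directory name under /var/lib/cloud/instances, differing only in letter
case. A minimal Python sketch of that comparison follows, assuming the
dmidecode text has already been captured. The function names are
hypothetical and are not part of cloud-init.

```python
import re
import uuid


def parse_system_uuid(dmidecode_text):
    """Return the 'UUID:' field of `dmidecode -t system` output, or None."""
    match = re.search(r"^\s*UUID:\s*(\S+)\s*$", dmidecode_text, re.MULTILINE)
    if match is None:
        return None
    try:
        # uuid.UUID accepts upper- or lower-case hex, so case is normalised.
        return uuid.UUID(match.group(1))
    except ValueError:
        return None


def same_instance(dmidecode_text, instance_dir):
    """True if the DMI UUID equals the last path component of instance_dir."""
    dmi_uuid = parse_system_uuid(dmidecode_text)
    if dmi_uuid is None:
        return False
    name = instance_dir.rstrip("/").rsplit("/", 1)[-1]
    try:
        return uuid.UUID(name) == dmi_uuid
    except ValueError:
        # A name such as "iid-datasource-none" is not a UUID at all.
        return False
```

With the output shown above, same_instance() would hold for the
cd535bc4-... instance directory and fail for iid-datasource-none.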

If cloud-init kept a copy of previous host keys prior to regenerating
them, it could presumably use this info to know when to safely restore
the original host keys.
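
As a rough illustration of the idea, here is a sketch in Python of
keeping a per-instance copy of the host keys and putting them back once
the original instance id is seen again. The backup location, the glob
pattern and both function names are assumptions made for this example;
cloud-init does not provide them.

```python
import shutil
from pathlib import Path

# Assumed naming: matches private keys and their .pub counterparts.
KEY_GLOB = "ssh_host_*_key*"


def backup_host_keys(ssh_dir, backup_root, instance_id):
    """Copy the current host keys into backup_root/<instance_id>/.

    An existing backup for instance_id is left alone, so the keys from
    the first run are never overwritten. Returns True if a new backup
    was written.
    """
    ssh_dir, dest = Path(ssh_dir), Path(backup_root) / instance_id
    keys = sorted(ssh_dir.glob(KEY_GLOB))
    if dest.exists() or not keys:
        return False
    dest.mkdir(parents=True, mode=0o700)
    for key in keys:
        shutil.copy2(key, dest / key.name)  # copy2 keeps the mode bits
    return True


def restore_host_keys(ssh_dir, backup_root, instance_id):
    """Put back saved keys for instance_id where they differ from the
    current ones. Returns the list of restored file names."""
    ssh_dir, src = Path(ssh_dir), Path(backup_root) / instance_id
    restored = []
    if not src.is_dir():
        return restored
    for saved in sorted(src.glob(KEY_GLOB)):
        current = ssh_dir / saved.name
        if not current.exists() or current.read_bytes() != saved.read_bytes():
            shutil.copy2(saved, current)
            restored.append(saved.name)
    return restored
```

In this sketch the backup would be taken before any regeneration, and
the restore would run only after the datasource reports the original
instance id again. sshd would presumably also need a restart to pick up
the restored keys.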

Since it is not inconceivable for the metadata api to become unreachable
for a brief period (perhaps during an upgrade), I think we really need
to make cloud-init more tolerant of this circumstance.

** Affects: cloud-init
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1553815

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1553815/+subscriptions

