yahoo-eng-team team mailing list archive

[Bug 1839854] [NEW] CloudStack provider cannot determine correct metadata IP with multiple network interfaces

 

Public bug reported:

[Problem]
When multiple network interfaces are present in a CloudStack VM, cloud-init randomly chooses which gateway address to fetch the metadata from. This is not a problem when all network interfaces offer metadata. However, if a shared network interface is attached to the VM, the gateway on that interface doesn't serve the metadata. If cloud-init picks that gateway, it times out waiting for a response and does not apply metadata to the host.

[How to reproduce]
- Create VM with 1x Isolated and 1x Shared Network
- Ensure cloud-init is installed in the VM and CloudStack is configured as a metadata provider
- Boot VM

[Expected result]
- VM should boot and apply metadata from CloudStack

[Observed result]
- cloud-init sometimes chooses wrong metadata server IP
- cloud-init delays startup waiting for response
- metadata isn't applied 
- cloud-init service fails

[Notes]
I noticed that get_vr_address() in "cloudinit/sources/DataSourceCloudStack.py" prefers the DHCP lease option over the default gateway. Wouldn't it be smarter to always use get_default_gateway()?
Until recently we used cloud-init 0.7.5, but after the introduction of NetworkManager lease support we started running into this problem (https://github.com/cloud-init/cloud-init/commit/33816e96d8981918f734dab3ee1a967bce85451a#diff-5bc9de2bb7889d66205845400c7cf99bR182).
Up to that point, cloud-init had always used the default gateway method.
CentOS 7 has only recently updated cloud-init in its repos, so we were stuck on the old version for a long time.
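
For readers unfamiliar with the default gateway method, here is a minimal sketch of how a default gateway can be read from a Linux routing table dump in the /proc/net/route format. The function name default_gateway_from_route_table is hypothetical and this is not cloud-init's actual code; it also assumes a little-endian host, where the kernel prints addresses as little-endian hex.

```python
import socket
import struct


def default_gateway_from_route_table(route_table_text):
    """Return the first default gateway as a dotted quad, or None.

    Hypothetical helper for illustration; expects text in the
    /proc/net/route format (header row, whitespace-separated columns).
    """
    for line in route_table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        destination, gateway = fields[1], fields[2]
        # A default route has destination 0.0.0.0 and a non-zero gateway.
        if destination == "00000000" and gateway != "00000000":
            # Assumption: little-endian host, so the hex is little-endian.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


SAMPLE = (
    "Iface\tDestination\tGateway\tFlags\n"
    "eth1\t0000A8C0\t00000000\t0001\n"
    "eth0\t00000000\t0101A8C0\t0003\n"
)
print(default_gateway_from_route_table(SAMPLE))
```

With more than one default route, this sketch simply returns the first one listed, so it does not by itself guarantee that the chosen gateway serves metadata.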

It might be nice to have a configuration option to choose between the two methods manually.
It would also be helpful if, on failure, cloud-init tried the next available DHCP lease.
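
The fallback idea could look roughly like the sketch below: walk an ordered list of candidate addresses (for example, DHCP lease servers followed by the default gateway) and use the first one that answers. The names first_reachable and tcp_probe are hypothetical and not part of cloud-init, and probing TCP port 80 is an assumption about how the metadata service is exposed.

```python
import socket


def tcp_probe(address, port=80, timeout=2.0):
    """Return True if a TCP connection to address:port succeeds.

    Hypothetical probe; port 80 is an assumption about the metadata service.
    """
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(candidates, probe=tcp_probe):
    """Return the first candidate accepted by probe, or None.

    Candidates are tried in order; duplicates are probed only once.
    """
    seen = set()
    for address in candidates:
        if address in seen:
            continue
        seen.add(address)
        if probe(address):
            return address
    return None
```

A caller would pass the candidates in order of preference, so a gateway on a shared network that does not answer is skipped after one short timeout instead of failing the whole datasource.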

[Attachment]
We added some files for debugging as a tar.gz.

** Affects: cloud-init
     Importance: Undecided
         Status: New

** Attachment added: "cloud-init-logs.tar.gz"
   https://bugs.launchpad.net/bugs/1839854/+attachment/5282207/+files/cloud-init-logs.tar.gz

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1839854

Title:
  CloudStack provider cannot determine correct metadata IP with multiple
  network interfaces

Status in cloud-init:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1839854/+subscriptions

