yahoo-eng-team team mailing list archive

[Bug 1605749] Re: ConfigDrive: cloud-init fails to configure bond from network_data.json

** Also affects: cloud-init (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: cloud-init (Ubuntu)
       Status: New => Fix Released

** Also affects: cloud-init (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: cloud-init (Ubuntu Xenial)
       Status: New => In Progress

** Changed in: cloud-init (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: cloud-init (Ubuntu)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1605749

Title:
  ConfigDrive: cloud-init fails to configure bond from network_data.json

Status in cloud-init:
  Fix Released
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  In Progress

Bug description:
  cloud-init fails to configure bond interfaces from network_data.json

  There are a couple of reasons:

  Bond links found in network_data.json do not have a name attribute.
  cloud-init does not require links to have a name attribute [1], but
  it later expects one and crashes when it is missing [2]. The name
  attribute is not part of the OpenStack network_data.json
  specification and will therefore never be provided.

  If a link name is provided, the generated ENI configuration has a
  couple of issues:

  1) cloud-init currently treats the values in a bond link's bond_links
  attribute as actual physical interface names, when they are in fact
  link ids.

  This means you end up with 4 physical interfaces configured on the
  server: 2 existing physical interfaces (e.g. eno1 and eno2) and 2
  physical interfaces based on the names found in bond_links (in this
  case, eth0 and eth1). The latter don't exist on the server, so the
  configured bond interface tries to enslave non-existent links and
  fails to come up.
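
  As an illustration of the lookup described in 1), here is a minimal
  sketch that treats bond_links entries as link ids and resolves them to
  real interface names. The function name, the names_by_id mapping and
  the sample links are assumptions made for this example; this is not
  cloud-init's code.

```python
def resolve_bond_links(links, names_by_id):
    """Return {bond link id: [member interface names]}.

    links: the "links" list from network_data.json.
    names_by_id: hypothetical mapping of link id -> real interface name,
    for example built by matching ethernet_mac_address on the host.
    """
    resolved = {}
    for link in links:
        if link.get("type") != "bond":
            continue
        # bond_links holds link ids, so each one is looked up
        # instead of being used directly as an interface name.
        resolved[link["id"]] = [
            names_by_id[link_id] for link_id in link.get("bond_links", [])
        ]
    return resolved


# Illustrative data only, modelled on the ids and names in this report.
links = [
    {"id": "eth0", "type": "phy", "ethernet_mac_address": "0c:c4:7a:34:6e:3c"},
    {"id": "eth1", "type": "phy", "ethernet_mac_address": "0c:c4:7a:34:6e:3d"},
    {"id": "bond0", "type": "bond", "bond_links": ["eth0", "eth1"]},
]
names_by_id = {"eth0": "eno1", "eth1": "eno2"}
members = resolve_bond_links(links, names_by_id)
```

  With this mapping, bond0 would enslave eno1 and eno2 rather than the
  non-existent eth0 and eth1.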

  2) The "auto" stanza is missing from bond and bond slave interfaces.
  Interfaces are never started/configured properly at boot.
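
  To make 2) concrete, here is a small sketch of a renderer that emits
  the "auto" line ahead of each "iface" stanza. render_iface and its
  parameters are hypothetical names for this example and do not reflect
  cloud-init's actual ENI renderer.

```python
def render_iface(name, method, options, auto=True):
    """Render one ENI stanza; options is a list of (key, value) pairs."""
    lines = []
    if auto:
        # Without this line ifupdown does not bring the interface up at boot.
        lines.append("auto %s" % name)
    lines.append("iface %s inet %s" % (name, method))
    for key, value in options:
        lines.append("    %s %s" % (key, value))
    return "\n".join(lines)


stanza = render_iface(
    "bond0",
    "manual",
    [("bond_mode", "4"), ("bond_miimon", "100"), ("bond-slaves", "none")],
)
```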

  3) Once 1) and 2) are fixed, it looks like cloud-init runs the network
  configuration again in dsmode=net and fails at multiple steps:

  3.1) get_interfaces_by_mac is run once again and tries to detect all
  known MAC addresses by listing all entries found in /sys/class/net/.
  At this point, the bonding is up and the file 'bonding_masters'
  exists. This means '/sys/class/net/bonding_masters/address' won't
  exist (because /sys/class/net/bonding_masters is a file, not a
  directory) and get_interface_mac will throw an uncaught exception,
  aborting the configuration process.
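
  One way to avoid the crash in 3.1) is to tolerate entries under
  /sys/class/net that have no readable 'address' file. The sketch below
  is an assumption about how that could look; read_mac and
  interfaces_by_mac are hypothetical names, not the functions in
  cloud-init.

```python
import os


def read_mac(devname, sys_class_net="/sys/class/net"):
    """Return the MAC address of devname, or None if it cannot be read."""
    path = os.path.join(sys_class_net, devname, "address")
    try:
        with open(path) as fp:
            return fp.read().strip() or None
    except (IOError, OSError):
        # Covers plain files such as the bonding driver's master list,
        # where <entry>/address cannot exist.
        return None


def interfaces_by_mac(sys_class_net="/sys/class/net"):
    """Map MAC address -> device name, skipping unreadable entries."""
    found = {}
    for devname in os.listdir(sys_class_net):
        mac = read_mac(devname, sys_class_net)
        if mac is None:
            continue
        # Note: bonded slaves share one MAC, so later entries overwrite
        # earlier ones here; that is the separate problem in 3.2).
        found[mac] = devname
    return found
```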

  3.2) Once 3.1) is fixed, configuration fails again, for a different
  reason. Once the bonding is configured, all slave interfaces have
  their MAC addresses updated so they are all identical. This means
  convert_net_json fails at the "need_names" step and throws this
  exception: "No mac_address or name entry for", because the MAC
  address of one of the physical interfaces can no longer be found.
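
  For 3.2), one possible approach (an assumption for illustration, not
  necessarily the fix adopted in cloud-init) is to recover each slave's
  original MAC from the "Permanent HW addr" lines that the Linux bonding
  driver reports in /proc/net/bonding/<bond>. The sketch parses such
  text; permanent_macs is a hypothetical helper and SAMPLE is an
  abbreviated, hand-written excerpt using the MACs from this report.

```python
def permanent_macs(bonding_text):
    """Map permanent MAC address -> slave interface name."""
    macs = {}
    current = None
    for line in bonding_text.splitlines():
        # Split on the first colon only, since MAC values contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "Permanent HW addr" and current:
            macs[value.lower()] = current
    return macs


SAMPLE = """\
Slave Interface: eno1
MII Status: up
Permanent HW addr: 0c:c4:7a:34:6e:3c

Slave Interface: eno2
MII Status: up
Permanent HW addr: 0c:c4:7a:34:6e:3d
"""

by_mac = permanent_macs(SAMPLE)
```

  A lookup keyed on these permanent addresses would still find both
  eno1 and eno2 after the bond has rewritten their active MACs.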

  A network_data.json is attached to this bug for testing purposes.

  For reference, here is the MAC address mapping on the server:
  - eno1: 0c:c4:7a:34:6e:3c
  - eno2: 0c:c4:7a:34:6e:3d

  Current rendered ENI is:

      auto lo
      iface lo inet loopback
          dns-nameservers 1.1.1.191 1.1.1.4

      iface eno1 inet manual
          mtu 1500

      iface eno2 inet manual
          mtu 1500

      iface bond0 inet manual
          bond_xmit_hash_policy layer3+4
          bond_miimon 100
          bond_mode 4
          bond-slaves none

      auto eth0
      iface eth0 inet manual
          bond_miimon 100
          bond-master bond0
          bond_mode 4
          bond_xmit_hash_policy layer3+4

      auto eth1
      iface eth1 inet manual
          bond_miimon 100
          bond-master bond0
          bond_mode 4
          bond_xmit_hash_policy layer3+4

      auto bond0.602
      iface bond0.602 inet static
          netmask 255.255.255.248
          address 2.2.2.13
          vlan-raw-device bond0
          hwaddress fa:16:3e:b3:72:30
          vlan_id 602
          post-up route add default gw 2.2.2.9 || true
          pre-down route del default gw 2.2.2.9 || true

      auto bond0.612
      iface bond0.612 inet static
          netmask 255.255.255.248
          address 10.0.1.5
          vlan-raw-device bond0
          hwaddress fa:16:3e:66:ab:a6
          vlan_id 612
          post-up route add -net 192.168.1.0 netmask 255.255.255.255 gw 10.0.1.1 || true
          pre-down route del -net 192.168.1.0 netmask 255.255.255.255 gw 10.0.1.1 || true

  [1] http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/sources/helpers/openstack.py#L547
  [2] http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/net/network_state.py#L284

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1605749/+subscriptions