yahoo-eng-team team mailing list archive

[Bug 1605749] Re: ConfigDrive: cloud-init fails to configure bond from network_data.json

 

This bug was fixed in the package cloud-init -
0.7.8-1-g3705bb5-0ubuntu1~16.04.1

---------------
cloud-init (0.7.8-1-g3705bb5-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  * New upstream release 0.7.8.
  * New upstream snapshot.
    - systemd: put cloud-init.target After multi-user.target (LP: #1623868)

cloud-init (0.7.7-31-g65ace7b-0ubuntu1~16.04.2) xenial-proposed; urgency=medium

  * debian/control: add Breaks of older versions of walinuxagent
    (LP: #1623570)

cloud-init (0.7.7-31-g65ace7b-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  * debian/control: fix missing dependency on python3-serial,
    and make SmartOS datasource work.
  * debian/cloud-init.templates fix capitalisation in template so
    dpkg-reconfigure works to select OpenStack. (LP: #1575727)
  * d/README.source, d/control, d/new-upstream-snapshot, d/rules: sync
    with yakkety for changes due to move to git.
  * d/rules: change PYVER=python3 to PYVER=3 to adjust to upstream change.
  * debian/rules, debian/cloud-init.install: remove install file
    to ensure expected files are collected into cloud-init deb.
    (LP: #1615745)
  * debian/dirs: remove obsolete / unused file.
  * upstream move from bzr to git.
  * New upstream snapshot.
    - Allow link type of null in network_data.json [Jon Grimm] (LP: #1621968)
    - DataSourceOVF: fix user-data as base64 with python3 (LP: #1619394)
    - remove obsolete .bzrignore
    - systemd: Better support package and upgrade. (LP: #1576692, #1621336)
    - tests: cleanup tempdirs in apt_source tests
    - apt config conversion: treat empty string as not provided. (LP: #1621180)
    - Fix typo in default keys for phone_home [Roland Sommer] (LP: #1607810)
    - salt minion: update default pki directory for newer salt minion.
      (LP: #1609899)
    - bddeb: add --release flag to specify the release in changelog.
    - apt-config: allow both old and new format to be present.
      [Christian Ehrhardt] (LP: #1616831)
    - python2.6: fix dict comprehension usage in _lsb_release. [Joshua Harlow]
    - Add a module that can configure spacewalk. [Joshua Harlow]
    - add install option for openrc [Matthew Thode]
    - Generate a dummy bond name for OpenStack (LP: #1605749)
    - network: fix get_interface_mac for bond slave, read_sys_net for ENOTDIR
    - azure dhclient-hook cleanups
    - Minor cleanups to atomic_helper and add unit tests.
    - Fix Gentoo net config generation [Matthew Thode]
    - distros: fix get_primary_arch method use of os.uname [Andrew Jorgensen]
    - Apt: add new apt configuration format [Christian Ehrhardt]
    - Get Azure endpoint server from DHCP client [Brent Baude]
    - DigitalOcean: use the v1.json endpoint [Ben Howard]
    - MAAS: add vendor-data support (LP: #1612313)
    - Upgrade to a configobj package new enough to work [Joshua Harlow]
    - ConfigDrive: recognize 'tap' as a link type. (LP: #1610784)
    - NoCloud: fix bug providing network-interfaces via meta-data.
      (LP: #1577982)
    - Add distro tags on config modules that should have it [Joshua Harlow]
    - ChangeLog: update changelog for previous commit.
    - add ntp config module [Ryan Harper]
    - SmartOS: more improvements for network configuration
    - tools/read-version: update to address change in version
    - make-tarball: older versions of git with --format=tar.
    - read-version: do not attempt git-describe if no git.
    - Newer requests have strong type validation [Joshua Harlow]
    - For upstream snapshot versions do not modify git-describe output.
    - adjust signal_handler for version changes.
    - revert unintended change to ubuntu sources list
    - drop modification of version during make-tarball, tools changes.
    - adjust tools and version information.
    - Update build tools to work with git [Lars Kellogg-Stedman]
    - fix pep8 errors in mcollective unit tests
    - mcollective: add tests, cleanups and bug fix when no config in /etc.

 -- Scott Moser <smoser@xxxxxxxxxx>  Thu, 15 Sep 2016 09:57:27 -0400

** Changed in: cloud-init (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1605749

Title:
  ConfigDrive: cloud-init fails to configure bond from network_data.json

Status in cloud-init:
  Fix Released
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released

Bug description:
  cloud-init fails to configure bond interfaces from network_data.json

  There are a couple of reasons:

  Bond links found in network_data.json do not have a name attribute.
  cloud-init doesn't require the name attribute to exist in links. [1]
  However, cloud-init later expects every link to have a name attribute
  and crashes when one is missing. [2] The name attribute is not part of
  the OpenStack network_data.json specification and will therefore
  never be provided.
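
  One way to picture the missing step is a pass that gives every unnamed
  bond link a generated name before the links are processed further. The
  sketch below is only an illustration, not cloud-init's code: the
  function name assign_bond_names is hypothetical, and the field names
  (id, type, name) are assumed from the network_data.json link format.

```python
import copy


def assign_bond_names(links):
    """Return a copy of `links` in which unnamed bond links get a name.

    Hypothetical helper: generated names follow the pattern bond0,
    bond1, ... and skip any name that is already in use.
    """
    named = copy.deepcopy(links)  # leave the caller's data untouched
    taken = {link["name"] for link in named if link.get("name")}
    counter = 0
    for link in named:
        if link.get("type") != "bond" or link.get("name"):
            continue
        while "bond%d" % counter in taken:
            counter += 1
        link["name"] = "bond%d" % counter
        taken.add(link["name"])
    return named
```

  Links that already carry a name are left as they are, so the pass is
  harmless when a name does happen to be provided.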

  If a link name is provided, the generated ENI configuration has a
  couple of issues:

  1) cloud-init currently treats the entries of the bond_links attribute
  found in a bond link as actual physical interface names, not as link
  ids as expected.

  This means you end up with 4 physical interfaces configured on the
  server: 2 existing physical interfaces (ex.: eno1 and eno2) and 2
  physical interfaces based on the names found in bond_links (in that
  case, eth0 and eth1). The latter don't exist on the server, so the
  configured bond interface tries to enslave non-existent links and
  fails to come up.

  2) The "auto" stanza is missing from bond and bond slave interfaces.
  Interfaces are never started/configured properly at boot.

  3) Once 1) and 2) are fixed, it looks like cloud-init runs the network
  configuration again in dsmode=net and fails at multiple steps:

  3.1) get_interfaces_by_mac is run once again and tries to detect all
  known mac addresses by listing all entries found in /sys/class/net/.
  At this point, the bonding is up and the file 'bond_masters' exists.
  This means '/sys/class/net/bond_masters/address' won't exist (because
  /sys/class/net/bond_masters is a file, not a directory) and
  get_interface_mac will throw an uncaught exception, aborting the
  configuration process.

  3.2) Once 3.1) is fixed, configuration fails again, but for a
  different reason: once the bonding is configured, all slave interfaces
  have their mac addresses updated so that they are all identical. This
  means convert_net_json will fail at the "need_names" step and will
  throw this exception: "No mac_address or name entry for", because the
  original mac address of one of the physical interfaces is no longer
  found.
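
  The failure in 3.1) can be sketched as follows: when MAC addresses are
  collected from a /sys/class/net-style directory, an entry that is a
  regular file has no "address" beneath it, and opening that path fails
  with ENOTDIR. The helper names read_address and macs_by_interface are
  hypothetical and this is not the actual cloud-init implementation; it
  only shows one way to tolerate such entries.

```python
import errno
import os


def read_address(sys_net_dir, devname):
    """Return the MAC of `devname`, or None if it has no address file."""
    path = os.path.join(sys_net_dir, devname, "address")
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError as e:
        # ENOTDIR: `devname` is a regular file, not an interface directory.
        # ENOENT: the directory exists but exposes no address file.
        if e.errno in (errno.ENOTDIR, errno.ENOENT):
            return None
        raise


def macs_by_interface(sys_net_dir="/sys/class/net"):
    """Map interface name to MAC, skipping entries without an address."""
    result = {}
    for devname in sorted(os.listdir(sys_net_dir)):
        mac = read_address(sys_net_dir, devname)
        if mac is not None:
            result[devname] = mac
    return result
```

  Any other I/O error is still raised, so only the two expected cases
  are silently skipped.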

  A network_data.json for testing purposes is attached to this bug.

  For reference, here is the MAC address mapping on the server:
  - eno1: 0c:c4:7a:34:6e:3c
  - eno2: 0c:c4:7a:34:6e:3d

  Current rendered ENI is:

      auto lo
      iface lo inet loopback
          dns-nameservers 1.1.1.191 1.1.1.4

      iface eno1 inet manual
          mtu 1500

      iface eno2 inet manual
          mtu 1500

      iface bond0 inet manual
          bond_xmit_hash_policy layer3+4
          bond_miimon 100
          bond_mode 4
          bond-slaves none

      auto eth0
      iface eth0 inet manual
          bond_miimon 100
          bond-master bond0
          bond_mode 4
          bond_xmit_hash_policy layer3+4

      auto eth1
      iface eth1 inet manual
          bond_miimon 100
          bond-master bond0
          bond_mode 4
          bond_xmit_hash_policy layer3+4

      auto bond0.602
      iface bond0.602 inet static
          netmask 255.255.255.248
          address 2.2.2.13
          vlan-raw-device bond0
          hwaddress fa:16:3e:b3:72:30
          vlan_id 602
          post-up route add default gw 2.2.2.9 || true
          pre-down route del default gw 2.2.2.9 || true

      auto bond0.612
      iface bond0.612 inet static
          netmask 255.255.255.248
          address 10.0.1.5
          vlan-raw-device bond0
          hwaddress fa:16:3e:66:ab:a6
          vlan_id 612
          post-up route add -net 192.168.1.0 netmask 255.255.255.255 gw 10.0.1.1 || true
          pre-down route del -net 192.168.1.0 netmask 255.255.255.255 gw 10.0.1.1 || true
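
  The eth0 and eth1 stanzas above come from reading bond_links entries
  as interface names (issue 1). A minimal sketch of the expected
  behaviour resolves each entry as a link id, looks up that link's MAC
  address, and maps the MAC to the interface actually present on the
  server. The function name resolve_bond_slaves and the ifname_by_mac
  argument are hypothetical, and the field names are assumed from the
  network_data.json link format rather than taken from the attached
  file.

```python
def resolve_bond_slaves(links, ifname_by_mac):
    """Map each bond link id to the local names of its slave interfaces.

    `links` is the "links" list from network_data.json; `ifname_by_mac`
    maps lower-case MAC addresses to interface names on this host.
    """
    link_by_id = {link["id"]: link for link in links}
    resolved = {}
    for link in links:
        if link.get("type") != "bond":
            continue
        slaves = []
        for link_id in link.get("bond_links", []):
            if link_id not in link_by_id:
                raise ValueError("bond_links refers to unknown link id %r"
                                 % link_id)
            mac = link_by_id[link_id]["ethernet_mac_address"].lower()
            slaves.append(ifname_by_mac[mac])
        resolved[link["id"]] = slaves
    return resolved
```

  With the MAC address mapping listed earlier, link ids eth0 and eth1
  would resolve to eno1 and eno2, assuming those links carry the two
  MAC addresses shown.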

  [1] http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/sources/helpers/openstack.py#L547
  [2] http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/net/network_state.py#L284

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1605749/+subscriptions

