yahoo-eng-team team mailing list archive

[Bug 1847114] [NEW] InfiniBand network devices not configured correctly on Ubuntu

 

Public bug reported:

In commit e7b0e5f72 (included in release 18.4), support was added for
configuring InfiniBand network devices. This currently works only on
CentOS (using the sysconfig renderer). When testing on Ubuntu (using the
eni renderer), the issues shown in the logs below were encountered (logs
pasted here for completeness).

I think this will be a relatively trivial change to cloudinit/net/eni.py.


# dpkg -l cloud-init
ii  cloud-init 19.2-36-g059d049c-0ubuntu1~18. all                            
 
from cloud-init.log:
 
2019-10-07 11:47:00,828 - util.py[WARNING]: failed stage init-local
2019-10-07 11:47:00,828 - util.py[DEBUG]: failed stage init-local
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 653, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 362, in main_init
    init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 676, in apply_network_config
    netcfg, src = self._find_networking_config()
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 641, in _find_networking_config
    if self.datasource and hasattr(self.datasource, 'network_config'):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceConfigDrive.py", line 152, in network_config
    self.network_json, known_macs=self.known_macs)
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/openstack.py", line 669, in convert_net_json
    raise ValueError("Unable to find a system nic for %s" % d)
ValueError: Unable to find a system nic for {'type': 'physical', 'mtu': 9000, 'subnets': [{'type': 'static', 'netmask': '255.255.255.0', 'routes': [], 'address': '192.168.202.26', 'ipv4': True}], 'mac_address': 'aa:aa:aa:aa:aa:aa'}
 
 
from stdout when cloud-init is run manually:
 
root@iband1# cloud-init --debug init --local
2019-10-07 12:38:09,527 - handlers.py[DEBUG]: start: init-local: searching for local datasources
2019-10-07 12:38:09,527 - util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
2019-10-07 12:38:09,527 - util.py[DEBUG]: Read 17 bytes from /proc/uptime
2019-10-07 12:38:09,528 - util.py[DEBUG]: Attempting to remove /run/cloud-init/status.json
2019-10-07 12:38:09,528 - util.py[DEBUG]: Attempting to remove /run/cloud-init/result.json
2019-10-07 12:38:09,528 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/data/status.json
2019-10-07 12:38:09,528 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/data/result.json
2019-10-07 12:38:09,528 - atomic_helper.py[DEBUG]: Atomically writing to file /var/lib/cloud/data/status.json (via temporary file /var/lib/cloud/data/tmpmm2igkqk) - w: [644] 469 bytes/chars
2019-10-07 12:38:09,528 - util.py[DEBUG]: Creating symbolic link from '/run/cloud-init/status.json' => '../../var/lib/cloud/data/status.json'
2019-10-07 12:38:09,528 - util.py[DEBUG]: Running command ['systemd-detect-virt', '--quiet', '--container'] with allowed return codes [0] (shell=False, capture=True)
2019-10-07 12:38:09,532 - util.py[DEBUG]: Running command ['running-in-container'] with allowed return codes [0] (shell=False, capture=True)
2019-10-07 12:38:09,535 - util.py[DEBUG]: Running command ['lxc-is-container'] with allowed return codes [0] (shell=False, capture=True)
2019-10-07 12:38:09,537 - util.py[DEBUG]: Reading from /proc/1/environ (quiet=False)
2019-10-07 12:38:09,537 - util.py[DEBUG]: Read 187 bytes from /proc/1/environ
2019-10-07 12:38:09,537 - util.py[DEBUG]: Reading from /proc/self/status (quiet=False)
2019-10-07 12:38:09,538 - util.py[DEBUG]: Read 1313 bytes from /proc/self/status
2019-10-07 12:38:09,538 - util.py[DEBUG]: Reading from /proc/cmdline (quiet=False)
2019-10-07 12:38:09,538 - util.py[DEBUG]: Read 126 bytes from /proc/cmdline
2019-10-07 12:38:09,538 - util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
2019-10-07 12:38:09,538 - util.py[DEBUG]: Read 17 bytes from /proc/uptime
2019-10-07 12:38:09,538 - util.py[DEBUG]: Reading from /etc/cloud/cloud.cfg (quiet=False)
2019-10-07 12:38:09,538 - util.py[DEBUG]: Read 3169 bytes from /etc/cloud/cloud.cfg
2019-10-07 12:38:09,538 - util.py[DEBUG]: Attempting to load yaml from string of length 3169 with allowed root types (<class 'dict'>,)
2019-10-07 12:38:09,547 - util.py[DEBUG]: Reading from /etc/cloud/cloud.cfg.d/90_dpkg.cfg (quiet=False)
2019-10-07 12:38:09,547 - util.py[DEBUG]: Read 114 bytes from /etc/cloud/cloud.cfg.d/90_dpkg.cfg
2019-10-07 12:38:09,547 - util.py[DEBUG]: Attempting to load yaml from string of length 114 with allowed root types (<class 'dict'>,)
2019-10-07 12:38:09,548 - util.py[DEBUG]: Reading from /etc/cloud/cloud.cfg.d/05_logging.cfg (quiet=False)
2019-10-07 12:38:09,548 - util.py[DEBUG]: Read 2057 bytes from /etc/cloud/cloud.cfg.d/05_logging.cfg
2019-10-07 12:38:09,548 - util.py[DEBUG]: Attempting to load yaml from string of length 2057 with allowed root types (<class 'dict'>,)
2019-10-07 12:38:09,551 - util.py[DEBUG]: Reading from /run/cloud-init/cloud.cfg (quiet=False)
2019-10-07 12:38:09,551 - util.py[DEBUG]: Read 39 bytes from /run/cloud-init/cloud.cfg
2019-10-07 12:38:09,551 - util.py[DEBUG]: Attempting to load yaml from string of length 39 with allowed root types (<class 'dict'>,)
2019-10-07 12:38:09,551 - util.py[DEBUG]: Attempting to load yaml from string of length 0 with allowed root types (<class 'dict'>,)
2019-10-07 12:38:09,551 - util.py[DEBUG]: loaded blob returned None, returning default.
2019-10-07 12:38:09,552 - util.py[DEBUG]: Redirecting <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'> to | tee -a /var/log/cloud-init-output.log
2019-10-07 12:38:09,554 - util.py[DEBUG]: Redirecting <_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'> to | tee -a /var/log/cloud-init-output.log
2019-10-07 12:38:09,554 - main.py[DEBUG]: Logging being reset, this logger may no longer be active shortly
Cloud-init v. 19.2-36-g059d049c-0ubuntu1~18.04.1 running 'init-local' at Mon, 07 Oct 2019 12:38:09 +0000. Up 3130.80 seconds.
2019-10-07 12:38:09,684 - util.py[WARNING]: failed stage init-local
failed run of stage init-local
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 653, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 362, in main_init
    init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 706, in apply_network_config
    return self.distro.apply_network_config(netcfg, bring_up=bring_up)
  File "/usr/lib/python3/dist-packages/cloudinit/distros/__init__.py", line 178, in apply_network_config
    dev_names = self._write_network_config(netconfig)
  File "/usr/lib/python3/dist-packages/cloudinit/distros/debian.py", line 114, in _write_network_config
    return self._supported_write_network_config(netconfig)
  File "/usr/lib/python3/dist-packages/cloudinit/distros/__init__.py", line 93, in _supported_write_network_config
    renderer.render_network_config(network_config)
  File "/usr/lib/python3/dist-packages/cloudinit/net/renderer.py", line 56, in render_network_config
    templates=templates, target=target)
  File "/usr/lib/python3/dist-packages/cloudinit/net/eni.py", line 494, in render_network_state
    util.write_file(fpeni, header + self._render_interfaces(network_state))
  File "/usr/lib/python3/dist-packages/cloudinit/net/eni.py", line 478, in _render_interfaces
    key=lambda k: (order[k['type']], k['name'])):
  File "/usr/lib/python3/dist-packages/cloudinit/net/eni.py", line 478, in <lambda>
    key=lambda k: (order[k['type']], k['name'])):
KeyError: 'infiniband'
------------------------------------------------------------
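
The KeyError above comes from the sort key in _render_interfaces, which
looks up each interface type in an ordering table. A minimal sketch of
that failure mode follows, with one possible way around it. The table
contents, the sample interfaces, and the priority chosen for 'infiniband'
are assumptions for illustration only; they are not taken from
cloudinit/net/eni.py and are not the upstream patch.

```python
# Hypothetical stand-in for the type-to-priority table used by the sort
# key in the traceback; the real table in cloudinit/net/eni.py may differ.
order = {"loopback": 0, "physical": 1, "bond": 2, "bridge": 3, "vlan": 4}

# Made-up interface entries, shaped like the dicts the sort key expects.
interfaces = [
    {"type": "infiniband", "name": "ib0"},
    {"type": "physical", "name": "eth0"},
    {"type": "loopback", "name": "lo"},
]


def render_order(ifaces, table):
    """Return interface names sorted the way the traceback's lambda sorts."""
    ordered = sorted(ifaces, key=lambda k: (table[k["type"]], k["name"]))
    return [iface["name"] for iface in ordered]


def reproduce():
    """Return the missing key if sorting fails, otherwise None."""
    try:
        render_order(interfaces, order)
    except KeyError as exc:
        return exc.args[0]
    return None


# Assumption: treat infiniband devices with the same priority as physical
# ones. Any priority would avoid the KeyError; which one is right is a
# design decision for the renderer.
patched_order = {**order, "infiniband": 1}

if __name__ == "__main__":
    print("missing key:", reproduce())
    print("patched order:", render_order(interfaces, patched_order))
```

With the extra table entry the sort no longer raises. Whether the eni
renderer then emits a correct stanza for the device is a separate
question that this sketch does not cover.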

** Affects: cloud-init
     Importance: Undecided
     Assignee: Darren Birkett (darren-birkett)
         Status: New

** Attachment added: "cloud-init ubuntu-bug output"
   https://bugs.launchpad.net/bugs/1847114/+attachment/5295171/+files/apport.cloud-init.50le8j1m.apport

** Changed in: cloud-init
     Assignee: (unassigned) => Darren Birkett (darren-birkett)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1847114

Title:
  InfiniBand network devices not configured correctly on Ubuntu

Status in cloud-init:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1847114/+subscriptions

