yahoo-eng-team team mailing list archive

[Bug 2006106] Re: cloud-init overriding set-name in netplan file

 

Tracked in GitHub Issues as
https://github.com/canonical/cloud-init/issues/4074

** Bug watch added: github.com/canonical/cloud-init/issues #4074
   https://github.com/canonical/cloud-init/issues/4074

** Changed in: cloud-init
       Status: Triaged => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/2006106

Title:
  cloud-init overriding set-name in netplan file

Status in cloud-init:
  Expired

Bug description:
  After creating an Ubuntu 22.04 instance in OpenStack the following netplan file is generated:
  ```
  # cat /etc/netplan/50-cloud-init.yaml
  # This file is generated from information provided by the datasource.  Changes
  # to it will not persist across an instance reboot.  To disable cloud-init's
  # network configuration capabilities, write a file
  # /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
  # network: {config: disabled}
  network:
      version: 2
      ethernets:
          ens3:
              accept-ra: true
              dhcp4: true
              dhcp6: true
              match:
                  macaddress: fa:16:3e:c7:f9:7e
              mtu: 1500
              set-name: ens3
  ```

  With the matching links:
  ```
  # ip -br l
  lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
  ens3             UP             fa:16:3e:c7:f9:7e <BROADCAST,MULTICAST,UP,LOWER_UP>
  ```

  I then tried to rename the interface from "ens3" to "eth0" by updating the file like so:
  ```
  # cat /etc/netplan/50-cloud-init.yaml
  # This file is generated from information provided by the datasource.  Changes
  # to it will not persist across an instance reboot.  To disable cloud-init's
  # network configuration capabilities, write a file
  # /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
  # network: {config: disabled}
  network:
      version: 2
      ethernets:
          eth0:
              accept-ra: true
              dhcp4: true
              dhcp6: true
              match:
                  macaddress: fa:16:3e:c7:f9:7e
              mtu: 1500
              set-name: eth0
  ```

  Applying the config works, and the interface is renamed without dropping my SSH connection:
  ```
  # netplan apply

  # ip -br l
  lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
  eth0             UP             fa:16:3e:c7:f9:7e <BROADCAST,MULTICAST,UP,LOWER_UP>
  ```

  So far so good, but when I reboot the machine it does not come back online:
  ```
  # reboot
  Connection to XXX.XXX.XXX.XXX closed by remote host.
  Connection to XXX.XXX.XXX.XXX closed.
  ```

  Logging in via a locally connected console I can see the following:
  ```
  # ip -br l
  lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
  ens3             DOWN           fa:16:3e:c7:f9:7e <BROADCAST,MULTICAST>
  ```

  So for some reason the interface comes up as "ens3" again. It also has
  no address configuration assigned, which is why I cannot reach it. If
  I then run "netplan apply" manually, I can get it online again:

  ```
  # netplan apply
  # ip -br l
  lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
  eth0             UP             fa:16:3e:c7:f9:7e <BROADCAST,MULTICAST,UP,LOWER_UP>
  ```

  Logged in over SSH again, I checked the dmesg log for renames and saw the following:
  ```
  # dmesg | grep rename
  [    2.142770] virtio_net virtio0 ens3: renamed from eth0
  [    6.089816] virtio_net virtio0 eth0: renamed from ens3
  [    7.253661] virtio_net virtio0 ens3: renamed from eth0
  [  278.607558] virtio_net virtio0 eth0: renamed from ens3
  ```

  So the network name has been flapping back and forth between "ens3"
  and "eth0".

  After digging around I think this is what happens:
  ```
  [    2.142770] virtio_net virtio0 ens3: renamed from eth0 <- systemd-networkd, as part of initramfs
  [    6.089816] virtio_net virtio0 eth0: renamed from ens3 <- systemd-networkd, as part of booted OS, using the files generated by my initial "netplan apply".
  [    7.253661] virtio_net virtio0 ens3: renamed from eth0 <- cloud-init, for some reason
  [  278.607558] virtio_net virtio0 eth0: renamed from ens3 <- my manual "netplan apply" after logging in to the console 
  ```
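
The flapping can also be spotted mechanically. The following is a minimal sketch, assuming the kernel keeps the "renamed from" line format shown above; the helper names `parse_renames` and `find_flaps` are made up for illustration and are not part of any tool:

```python
import re

# Kernel rename lines copied from the dmesg output above.
DMESG = """\
[    2.142770] virtio_net virtio0 ens3: renamed from eth0
[    6.089816] virtio_net virtio0 eth0: renamed from ens3
[    7.253661] virtio_net virtio0 ens3: renamed from eth0
[  278.607558] virtio_net virtio0 eth0: renamed from ens3
"""

# Timestamp, driver, device, new name, then the old name.
RENAME_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+\S+\s+(?P<new>\S+): renamed from (?P<old>\S+)$"
)


def parse_renames(text):
    """Return (timestamp, old_name, new_name) for each rename line."""
    events = []
    for line in text.splitlines():
        m = RENAME_RE.match(line.strip())
        if m:
            events.append((float(m["ts"]), m["old"], m["new"]))
    return events


def find_flaps(events):
    """Return the renames that exactly undo the rename right before them."""
    flaps = []
    for prev, cur in zip(events, events[1:]):
        if cur[1] == prev[2] and cur[2] == prev[1]:
            flaps.append(cur)
    return flaps


events = parse_renames(DMESG)
flaps = find_flaps(events)
for ts, old, new in events:
    print(f"{ts:>10.3f}s  {old} -> {new}")
print(f"{len(flaps)} rename(s) reverted the previous one")
```

Feeding it the output of `dmesg | grep rename` instead of the embedded sample should work the same way, as long as the line format matches.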

  Looking at /var/log/cloud-init.log the following message is seen:
  ```
  2023-02-06 07:57:27,270 - __init__.py[DEBUG]: Detected interfaces {'eth0': {'downable': True, 'device_id': '0x0001', 'driver': 'virtio_net', 'mac': 'fa:16:3e:c7:f9:7e', 'name': 'eth0', 'up': False}, 'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}}
  2023-02-06 07:57:27,270 - __init__.py[DEBUG]: achieving renaming of [['fa:16:3e:c7:f9:7e', 'ens3', None, None]] with ops [('rename', 'fa:16:3e:c7:f9:7e', 'ens3', ('eth0', 'ens3'))]
  2023-02-06 07:57:27,270 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth0', 'name', 'ens3'] with allowed return codes [0] (shell=False, capture=True)
  ```
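
To make the log lines above easier to follow, here is a rough sketch of how a MAC-keyed rename plan could produce that `ip link set` command. This is not cloud-init's actual implementation; `plan_renames` is a hypothetical helper, and the inputs are reduced to the fields visible in the log:

```python
def plan_renames(desired, current):
    """Build `ip link set` commands that move each MAC to its desired name.

    desired: list of (mac, wanted_name) pairs, as in the "achieving renaming" line.
    current: dict of interface name -> MAC, as in the "Detected interfaces" line.
    Hypothetical helper for illustration only.
    """
    name_by_mac = {mac.lower(): name for name, mac in current.items()}
    commands = []
    for mac, wanted in desired:
        have = name_by_mac.get(mac.lower())
        # Only rename when the MAC is present under a different name.
        if have is not None and have != wanted:
            commands.append(["ip", "link", "set", have, "name", wanted])
    return commands


# Values taken from the cloud-init.log excerpt above.
current = {"eth0": "fa:16:3e:c7:f9:7e", "lo": "00:00:00:00:00:00"}
desired = [("fa:16:3e:c7:f9:7e", "ens3")]

commands = plan_renames(desired, current)
for cmd in commands:
    print(" ".join(cmd))
```

The point of the sketch is that the wanted name comes from stored data keyed by MAC address, so whatever name netplan applied earlier in the boot is simply treated as the current name to rename away from.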

  At first I had a hard time understanding how cloud-init knew about the previous "ens3" name, but I now think it was persisted in obj.pkl during the first boot after install and is picked up again on subsequent boots. From that same log:
  ```
  2023-02-06 07:57:27,211 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
  ```

  Taking a look in the file:
  ```
  # cat p.py
  #!/usr/bin/env python3

  import pickle

  # Load the datasource object that cloud-init pickled for this instance.
  # Unpickling needs the cloudinit Python package to be importable.
  with open('/var/lib/cloud/instance/obj.pkl', 'rb') as file:
      data = pickle.load(file)

  print(data.network_config)
  ```

  ```
  # ./p.py
  {'version': 1, 'config': [{'mtu': 1500, 'type': 'physical', 'accept-ra': True, 'subnets': [{'type': 'dhcp4'}, {'type': 'dhcp6'}], 'mac_address': 'fa:16:3e:c7:f9:7e', 'name': 'ens3'}, {'type': 'nameserver', 'address': 'XXX.XXX.XXX.XXX'}, {'type': 'nameserver', 'address': 'YYYY:YYYY:YYYY::YYYY:YYYY:YYYY'}]}
  ```
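
The mismatch between this cached config and the edited netplan file can be checked with a small comparison. This is a minimal sketch under some assumptions: both configs are already loaded into dicts (the cached one trimmed to the relevant keys, the netplan one as a YAML parser would return it), and the function names are invented for illustration:

```python
# Cached version 1 config as printed from obj.pkl above (trimmed).
cached = {
    "version": 1,
    "config": [
        {"type": "physical", "mac_address": "fa:16:3e:c7:f9:7e", "name": "ens3"},
        {"type": "nameserver", "address": "XXX.XXX.XXX.XXX"},
    ],
}

# The edited /etc/netplan/50-cloud-init.yaml, reduced to the naming keys.
netplan = {
    "network": {
        "version": 2,
        "ethernets": {
            "eth0": {
                "match": {"macaddress": "fa:16:3e:c7:f9:7e"},
                "set-name": "eth0",
            }
        },
    }
}


def cached_names(cfg):
    """Map MAC -> name for the physical entries of a version 1 config."""
    return {
        entry["mac_address"].lower(): entry["name"]
        for entry in cfg.get("config", [])
        if entry.get("type") == "physical" and "mac_address" in entry
    }


def netplan_names(cfg):
    """Map MAC -> set-name for ethernets that match on a MAC address."""
    names = {}
    for key, eth in cfg.get("network", {}).get("ethernets", {}).items():
        mac = eth.get("match", {}).get("macaddress")
        if mac:
            names[mac.lower()] = eth.get("set-name", key)
    return names


def find_conflicts(cached_cfg, netplan_cfg):
    """Return (mac, cached_name, netplan_name) where the two disagree."""
    wanted = netplan_names(netplan_cfg)
    return [
        (mac, old, wanted[mac])
        for mac, old in cached_names(cached_cfg).items()
        if mac in wanted and wanted[mac] != old
    ]


conflicts = find_conflicts(cached, netplan)
for mac, old, new in conflicts:
    print(f"{mac}: cached name {old!r} conflicts with netplan set-name {new!r}")
```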

  From what I can tell this "name" is picked up in the openstack helper at
  https://github.com/canonical/cloud-init/blob/483f79cb3b94c8c7d176e748892a040c71132cb3/cloudinit/sources/helpers/openstack.py#L715

  So... the question then is, how should this work? Right now it seems
  cloud-init is helping me with a rename even though the netplan file
  asks for a different name than the one the machine had at initial
  install.

  One thing that occurred to me is that maybe I am expected to feed
  cloud-init user-data so it can know initially that I want the
  interface called "eth0", but reading
  https://cloudinit.readthedocs.io/en/22.4.2/topics/network-config.html
  it states "User-data cannot change an instance’s network
  configuration." so it seems this is not expected behaviour.

  For now I guess the simplest workaround is to disable cloud-init's network configuration, as mentioned in the generated netplan file. This works:
  ```
  # echo "network: {config: disabled}" > /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
  # reboot
  ```

  Now the machine comes up by itself, and there are fewer renames happening:
  ```
  # dmesg | grep rename
  [    2.165152] virtio_net virtio0 ens3: renamed from eth0
  [    6.108291] virtio_net virtio0 eth0: renamed from ens3
  ```

  It feels strange to have to disable the network management parts...
  What would be the correct way to deal with this situation?

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/2006106/+subscriptions


