
yahoo-eng-team team mailing list archive

[Bug 1958280] Re: Networking failures after NIC reordering

 

Tracked in GitHub Issues as
https://github.com/canonical/cloud-init/issues/3939

** Bug watch added: github.com/canonical/cloud-init/issues #3939
   https://github.com/canonical/cloud-init/issues/3939

** Changed in: cloud-init
       Status: Triaged => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1958280

Title:
  Networking failures after NIC reordering

Status in cloud-init:
  Expired
Status in netplan:
  Incomplete

Bug description:
  We can reliably reproduce a case where network configuration changes
  for an Ubuntu 20.04 VM result in networkd hanging on "pending"
  interfaces. The interfaces are pending because the names assigned in
  the current boot conflict with those found in
  /etc/netplan/50-cloud-init.yaml from the previous boot.

  Specifically, the netplan generator applies the previous
  configuration's names prior to running cloud-init local. We'll see
  something like `systemd-udevd[228]: eth0: Failed to process device,
  ignoring: File exists`.
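
The naming conflict described above can be illustrated with a small check. This is only a sketch, assuming the stale netplan config pins interface names to MAC addresses; the helper name `rename_conflicts` and the two "MAC NAME" input files are hypothetical and not part of cloud-init or netplan.

```shell
# Hypothetical helper: both files hold "MAC NAME" pairs, one per line.
#   $1: names requested by the previous boot's config (assumed MAC-pinned)
#   $2: names the kernel assigned in the current boot
rename_conflicts() {
    awk '
        NR == FNR { want[$1] = $2; next }   # first file: desired name per MAC
        { have[$2] = $1 }                   # second file: current owner of each name
        END {
            for (mac in want) {
                name = want[mac]
                # Conflict: the desired name is already held by a different MAC.
                if (name in have && have[name] != mac)
                    printf "%s wants %s, held by %s\n", mac, name, have[name]
            }
        }
    ' "$1" "$2" | sort
}
```

With two NICs swapped, each MAC wants the name the other one currently holds, which is the kind of situation in which a rename can fail with "File exists".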

  In one scenario, the data source is able to fetch the updated network
  configuration, and cloud-init updates the config and udev rules just
  fine. However, networking stays offline ("pending") indefinitely. It
  can be forced to resolve by executing
  `sudo udevadm trigger --attr-match=subsystem=net`.
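
The workaround above could be guarded by a check for pending links. This is a minimal sketch, assuming `networkctl list --no-legend` prints one link per line with the setup state in the fifth column; the function name `pending_links` is made up for illustration.

```shell
# Hypothetical helper: print the names of links whose SETUP state is "pending".
# Expects `networkctl list --no-legend` style lines on stdin:
#   IDX LINK TYPE OPERATIONAL SETUP
pending_links() {
    awk '$5 == "pending" { print $2 }'
}

# Possible use on an affected guest (not run here):
#   if [ -n "$(networkctl list --no-legend | pending_links)" ]; then
#       sudo udevadm trigger --attr-match=subsystem=net
#   fi
```

Keeping the parsing separate from the trigger makes it possible to inspect which links are stuck before re-running the udev rules.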

  Example: Create a VM on Azure with two NICs, re-order them, then
  restart.

  az vm create --name test-x1 --image Canonical:0001-com-ubuntu-server-focal:20_04-lts:latest --nics test-nic-01 test-nic-02
  az vm deallocate --name test-x1
  az vm nics set --vm-name test-x1 --nics test-nic-02 test-nic-01
  az vm start --name test-x1

  Upon doing that, I am unable to log in via the serial console for 20
  minutes, until cloud-init times out. In this case, Azure is trying to
  report ready but cannot because system networking never came up. If we
  remove
  /lib/systemd/system/cloud-init-local.service.d/50-azure-clear-persistent-obj-pkl.conf,
  cloud-init doesn't hang the boot, but networking still fails to
  initialize for the guest.

  The behavior for 18.04 is a bit different. On 18.04, the renaming of
  the interfaces succeeds at early boot, which instead results in the
  Azure data source failing the local phase because the
  fallback_interface is no longer the primary NIC (the secondary NIC,
  eth1, was renamed to eth0 to match the previous boot's config).

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1958280/+subscriptions


