sts-sponsors team mailing list archive

[Bug 1872813] Re: cloud-init fails to detect iSCSI root on focal Oracle instances

 

** Description changed:

+ [Impact]
+ 
+ When creating a bare metal instance on Oracle Cloud (these are backed by an iSCSI disk), the IP address is configured on one interface (enp45s0f0) at boot, but cloud-init generates an /etc/netplan/50-cloud-init.yaml with an entry that configures enp12s0f0 using DHCP. As a result, enp12s0f0 sends a DHCPREQUEST and waits for a reply until it times out, delaying the boot process, because no DHCP server serves this interface.
+ This is caused by a missing /run/initramfs/open-iscsi.interface file, which should point to the enp45s0f0 interface.
+ 
+ 
+ [Fix]
+ 
+ A script from the open-iscsi package checks whether any iSCSI disks are
+ present and, if there are none, removes the
+ /run/initramfs/open-iscsi.interface file, which stores the interface on
+ which the iSCSI disk is present.
+ 
+ This script originally ran with the local-top initrd scripts, but it
+ uses the /dev/disk/by-path/ path to determine whether iSCSI disks are
+ present. This path does not yet exist when the local-top scripts run, so
+ the file is always removed.
+ 
+ This was fixed in Focal by moving the script to run with the
+ local-bottom scripts. By the time these scripts run, the
+ /dev/disk/by-path/ path exists.
+ 
+ 
+ [Test Plan]
+ 
+ This can be reproduced by launching any bare metal instance on Oracle
+ Cloud (all are backed by an iSCSI disk) and checking whether the
+ /run/initramfs/open-iscsi.interface file is present. On an affected
+ system the file is missing.
+ 
+ 
+ [Where problems could occur]
+ 
+ There should be no problems, as the script still runs, only later in
+ the boot process.
+ 
+ If the script fails to run, it could leave the open-iscsi.interface file
+ present with no iSCSI drives, but that should cause no issues besides
+ delaying the boot process.
+ 
+ 
+ [Original description]
+ 
  Currently focal images on Oracle are failing to get data from the Oracle
  DS with this traceback:
  
  Traceback (most recent call last):
-   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 772, in find_source
-     if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
-   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 661, in update_metadata
-     result = self.get_data()
-   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 279, in get_data
-     return_value = self._get_data()
-   File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOracle.py", line 195, in _get_data
-     with dhcp.EphemeralDHCPv4(net.find_fallback_nic()):
-   File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
-     return self.obtain_lease()
-   File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 109, in obtain_lease
-     ephipv4.__enter__()
-   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1019, in __enter__
-     self._bringup_static_routes()
-   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1071, in _bringup_static_routes
-     util.subp(
-   File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2084, in subp
-     raise ProcessExecutionError(stdout=out, stderr=err,
+   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 772, in find_source
+     if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
+   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 661, in update_metadata
+     result = self.get_data()
+   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 279, in get_data
+     return_value = self._get_data()
+   File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOracle.py", line 195, in _get_data
+     with dhcp.EphemeralDHCPv4(net.find_fallback_nic()):
+   File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
+     return self.obtain_lease()
+   File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 109, in obtain_lease
+     ephipv4.__enter__()
+   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1019, in __enter__
+     self._bringup_static_routes()
+   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1071, in _bringup_static_routes
+     util.subp(
+   File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2084, in subp
+     raise ProcessExecutionError(stdout=out, stderr=err,
  cloudinit.util.ProcessExecutionError: Unexpected error while running command.
  Command: ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3']
  Exit code: 2
  Reason: -
- Stdout: 
+ Stdout:
  Stderr: RTNETLINK answers: File exists
  
- 
- In https://github.com/canonical/cloud-init/blob/46cf23c28812d3e3ba0c570defd9a05628af5556/cloudinit/sources/DataSourceOracle.py#L194-L198, we can see that this path is only taken if _is_iscsi_root returns False.
+ In https://github.com/canonical/cloud-init/blob/46cf23c28812d3e3ba0c570defd9a05628af5556/cloudinit/sources/DataSourceOracle.py#L194-L198,
+ we can see that this path is only taken if _is_iscsi_root returns False.

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1872813

Title:
  cloud-init fails to detect iSCSI root on focal Oracle instances

Status in cloud-init:
  Invalid
Status in open-iscsi package in Ubuntu:
  Fix Released
Status in open-iscsi source package in Bionic:
  In Progress
Status in open-iscsi source package in Focal:
  Fix Released

Bug description:
  [Impact]

  When creating a bare metal instance on Oracle Cloud (these are backed by an iSCSI disk), the IP address is configured on one interface (enp45s0f0) at boot, but cloud-init generates an /etc/netplan/50-cloud-init.yaml with an entry that configures enp12s0f0 using DHCP. As a result, enp12s0f0 sends a DHCPREQUEST and waits for a reply until it times out, delaying the boot process, because no DHCP server serves this interface.
  This is caused by a missing /run/initramfs/open-iscsi.interface file, which should point to the enp45s0f0 interface.

  
  [Fix]

  A script from the open-iscsi package checks whether any iSCSI disks
  are present and, if there are none, removes the
  /run/initramfs/open-iscsi.interface file, which stores the interface
  on which the iSCSI disk is present.

  This script originally ran with the local-top initrd scripts, but it
  uses the /dev/disk/by-path/ path to determine whether iSCSI disks are
  present. This path does not yet exist when the local-top scripts run,
  so the file is always removed.

  This was fixed in Focal by moving the script to run with the
  local-bottom scripts. By the time these scripts run, the
  /dev/disk/by-path/ path exists.
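
  The check described above can be sketched as follows. This is only an
  illustration in Python: the real script in open-iscsi is an initramfs
  shell script, and the function names and the "*-iscsi-*" match pattern
  used here are assumptions, not taken from the package.

```python
import glob
import os


def iscsi_disks_present(by_path_dir="/dev/disk/by-path"):
    """Return True if any entry under by_path_dir looks like an iSCSI disk."""
    # udev names iSCSI disks "ip-<addr>:<port>-iscsi-<iqn>-lun-<n>".
    # If by_path_dir does not exist yet, glob simply returns an empty list.
    return bool(glob.glob(os.path.join(by_path_dir, "*-iscsi-*")))


def cleanup_interface_file(
    interface_file="/run/initramfs/open-iscsi.interface",
    by_path_dir="/dev/disk/by-path",
):
    """Remove interface_file when no iSCSI disk is found.

    Returns True if the file was removed, False otherwise.
    """
    if iscsi_disks_present(by_path_dir):
        return False
    try:
        os.remove(interface_file)
    except FileNotFoundError:
        return False
    return True
```

  Run before /dev/disk/by-path/ exists, a check of this shape finds no
  disks and removes the file, which matches the behaviour described for
  the local-top stage.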

  
  [Test Plan]

  This can be reproduced by launching any bare metal instance on Oracle
  Cloud (all are backed by an iSCSI disk) and checking whether the
  /run/initramfs/open-iscsi.interface file is present. On an affected
  system the file is missing.
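
  A minimal way to script this check is sketched below. The helper name
  is hypothetical, and the sketch assumes the file holds just the
  interface name as text, which this report does not state explicitly.

```python
from pathlib import Path


def read_iscsi_interface(path="/run/initramfs/open-iscsi.interface"):
    """Return the stored interface name, or None if the file is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    # Assumption: the file contains only the interface name.
    return p.read_text().strip()


if __name__ == "__main__":
    iface = read_iscsi_interface()
    if iface is None:
        print("open-iscsi.interface is missing (affected)")
    else:
        print(f"open-iscsi.interface points to {iface}")
```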

  
  [Where problems could occur]

  There should be no problems, as the script still runs, only later in
  the boot process.

  If the script fails to run, it could leave the open-iscsi.interface
  file present with no iSCSI drives, but that should cause no issues
  besides delaying the boot process.

  
  [Original description]

  Currently focal images on Oracle are failing to get data from the
  Oracle DS with this traceback:

  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 772, in find_source
      if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 661, in update_metadata
      result = self.get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 279, in get_data
      return_value = self._get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOracle.py", line 195, in _get_data
      with dhcp.EphemeralDHCPv4(net.find_fallback_nic()):
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
      return self.obtain_lease()
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 109, in obtain_lease
      ephipv4.__enter__()
    File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1019, in __enter__
      self._bringup_static_routes()
    File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1071, in _bringup_static_routes
      util.subp(
    File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2084, in subp
      raise ProcessExecutionError(stdout=out, stderr=err,
  cloudinit.util.ProcessExecutionError: Unexpected error while running command.
  Command: ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3']
  Exit code: 2
  Reason: -
  Stdout:
  Stderr: RTNETLINK answers: File exists

  In
  https://github.com/canonical/cloud-init/blob/46cf23c28812d3e3ba0c570defd9a05628af5556/cloudinit/sources/DataSourceOracle.py#L194-L198,
  we can see that this path is only taken if _is_iscsi_root returns
  False.
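
  The branching described here can be illustrated with a toy model. It
  is not cloud-init's code: ephemeral_dhcp and get_data are hypothetical
  stand-ins, and the list named log only records the order of events.

```python
from contextlib import contextmanager, nullcontext


@contextmanager
def ephemeral_dhcp(nic, log):
    """Stand-in for a temporary DHCP setup on nic (hypothetical)."""
    log.append(f"dhcp up on {nic}")
    try:
        yield
    finally:
        log.append(f"dhcp down on {nic}")


def get_data(is_iscsi_root, fallback_nic, log):
    """Fetch metadata, bringing up DHCP only when root is not on iSCSI."""
    # With an iSCSI root the network is already configured by the
    # initramfs, so no temporary DHCP setup is attempted.
    ctx = nullcontext() if is_iscsi_root else ephemeral_dhcp(fallback_nic, log)
    with ctx:
        log.append("crawl metadata")
    return log
```

  In this model, a wrong False for the iSCSI check sends the code down
  the DHCP branch on an interface that is already configured, which is
  the situation the traceback above shows.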

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1872813/+subscriptions