← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1629797] Re: resolve service in nsswitch.conf adds 25 seconds to failed lookups before systemd-resolved is up

 

** Also affects: dbus (Ubuntu Yakkety)
   Importance: Undecided
       Status: New

** Also affects: cloud-init (Ubuntu Yakkety)
   Importance: Undecided
       Status: New

** Also affects: dbus (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Also affects: cloud-init (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: cloud-init (Ubuntu Xenial)
       Status: New => Confirmed

** Changed in: cloud-init (Ubuntu Yakkety)
       Status: New => Confirmed

** Changed in: cloud-init (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: cloud-init (Ubuntu Yakkety)
   Importance: Undecided => Medium

** Changed in: cloud-init (Ubuntu Yakkety)
       Status: Confirmed => Fix Released

** Changed in: dbus (Ubuntu Xenial)
       Status: New => Invalid

** Changed in: dbus (Ubuntu Yakkety)
       Status: New => Invalid

** No longer affects: dbus (Ubuntu Xenial)

** No longer affects: dbus (Ubuntu Yakkety)

** Description changed:

- During boot, cloud-init does DNS resolution checks to if particular
- metadata services are available (in order to determine which cloud it is
- running on).  These checks happen before systemd-resolved is up[0] and
- if they resolve unsuccessfully they take 25 seconds to complete.
+ === Begin SRU Template ===
+ [Impact] 
+ In cases where cloud-init used dns during early boot and system was
+ configured in nsswitch.conf to use systemd-resolvd, the system would
+ timeout on dns attempts making system boot terribly slow.
+ 
+ [Test Case]
+ Boot a system on GCE.
+ check for WARN in /var/log/messages
+ check that time to boot is reasonable (<30 seconds).  In failure case the
+ times would be minutes.
+ 
+ [Regression Potential]
+ Changing order in boot can be dangerous.  There is real chance for 
+ regression here, but it should be fairly small as xenial does not include
+ systemd-resolved usage.  This was first noticed on yakkety where it did.
+ 
+ [Other Info]
+ It seems useful to SRU this in the event that systemd-resolvd is used
+ on 16.04 or the case where user upgrades components (admittedly small use
+ case).
+ 
+ === End SRU Template ===
+ 
+ 
+ 
+ During boot, cloud-init does DNS resolution checks to if particular metadata services are available (in order to determine which cloud it is running on).  These checks happen before systemd-resolved is up[0] and if they resolve unsuccessfully they take 25 seconds to complete.
  
  This has substantial impact on boot time in all contexts, because cloud-
  init attempts to resolve three known-invalid addresses ("does-not-
  exist.example.com.", "example.invalid." and a random string) to enable
  it to detect when it's running in an environment where a DNS server will
  always return some sort of redirect.  As such, we're talking a minimum
  impact of 75 seconds in all environments.  This increases when cloud-
  init is configured to check for multiple environments.
  
  This means that yakkety is consistently taking 2-3 minutes to boot on
  EC2 and GCE, compared to the ~30 seconds of the first boot and ~10
  seconds thereafter in xenial.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1629797

Title:
  resolve service in nsswitch.conf adds 25 seconds to failed lookups
  before systemd-resolved is up

Status in cloud-init:
  Fix Committed
Status in D-Bus:
  Unknown
Status in cloud-init package in Ubuntu:
  Fix Released
Status in dbus package in Ubuntu:
  Won't Fix
Status in cloud-init source package in Xenial:
  Confirmed
Status in cloud-init source package in Yakkety:
  Fix Released

Bug description:
  === Begin SRU Template ===
  [Impact] 
  In cases where cloud-init used dns during early boot and system was
  configured in nsswitch.conf to use systemd-resolvd, the system would
  timeout on dns attempts making system boot terribly slow.

  [Test Case]
  Boot a system on GCE.
  check for WARN in /var/log/messages
  check that time to boot is reasonable (<30 seconds).  In failure case the
  times would be minutes.

  [Regression Potential]
  Changing order in boot can be dangerous.  There is real chance for 
  regression here, but it should be fairly small as xenial does not include
  systemd-resolved usage.  This was first noticed on yakkety where it did.

  [Other Info]
  It seems useful to SRU this in the event that systemd-resolvd is used
  on 16.04 or the case where user upgrades components (admittedly small use
  case).

  === End SRU Template ===


  
  During boot, cloud-init does DNS resolution checks to if particular metadata services are available (in order to determine which cloud it is running on).  These checks happen before systemd-resolved is up[0] and if they resolve unsuccessfully they take 25 seconds to complete.

  This has substantial impact on boot time in all contexts, because
  cloud-init attempts to resolve three known-invalid addresses ("does-
  not-exist.example.com.", "example.invalid." and a random string) to
  enable it to detect when it's running in an environment where a DNS
  server will always return some sort of redirect.  As such, we're
  talking a minimum impact of 75 seconds in all environments.  This
  increases when cloud-init is configured to check for multiple
  environments.

  This means that yakkety is consistently taking 2-3 minutes to boot on
  EC2 and GCE, compared to the ~30 seconds of the first boot and ~10
  seconds thereafter in xenial.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1629797/+subscriptions