yahoo-eng-team team mailing list archive

[Bug 1735331] Re: ec2: zesty tempfile sandbox dhclient.pid file can't be created

 

This bug is believed to be fixed in cloud-init in 1705804. If this is
still a problem for you, please add a comment and set the state back to
New.

Thank you.

** Changed in: cloud-init
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1735331

Title:
  ec2: zesty tempfile sandbox dhclient.pid file can't be created

Status in cloud-init:
  Fix Released
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released
Status in cloud-init source package in Zesty:
  Fix Released
Status in cloud-init source package in Artful:
  Fix Released
Status in cloud-init source package in Bionic:
  Fix Released

Bug description:
  === Begin SRU Template ===
  [Impact]
  EC2 instances could hit a race condition with tempdir removal in which dhclient doesn't write a pidfile and DataSourceEc2Local hits a traceback trying to read that non-existent pidfile. This traceback causes the instance to fall back and get discovered in the init-network stage as DataSourceEc2. The thrashing costs instances a couple of extra seconds at boot while they re-discover the datasource in a different stage.
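
The race described above can be illustrated with a minimal Python sketch. It is not the code from the upstream fix; the helper name `read_pid_when_ready` and its `maxwait` and `naplen` parameters are assumptions made for illustration. Instead of reading the pidfile immediately after dhclient returns, the caller polls for it and treats a file that never appears as "no pid" rather than raising.

```python
import time


def read_pid_when_ready(pid_file, maxwait=5.0, naplen=0.01):
    """Poll for pid_file and return its pid as an int.

    Hypothetical helper: returns None if the file is still missing,
    empty, or non-numeric once maxwait seconds have elapsed.
    """
    deadline = time.monotonic() + maxwait
    while True:
        try:
            with open(pid_file, "rb") as fh:
                content = fh.read().strip()
            if content:
                return int(content)
        except (FileNotFoundError, ValueError):
            # Not written yet (or only partially written); retry below.
            pass
        if time.monotonic() >= deadline:
            return None
        time.sleep(naplen)
```

A caller could then skip any pid-based cleanup when the result is None, instead of failing datasource discovery with a FileNotFoundError.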

  [Test Case]

  # Launch instance under test
  $ for release in xenial zesty artful; do
      echo "Handling $release";
      launch-ec2 --series $release;
      ssh ubuntu@<ec2-address> cat /run/cloud-init/result.json;
      ssh ubuntu@<ec2-address> grep Trace /var/log/cloud-init.log;
      ssh ubuntu@<ec2-address> "sudo sed -i 's/ $release / $release-proposed /' /etc/apt/sources.list";
      ssh ubuntu@<ec2-address> sudo apt update;
      ssh ubuntu@<ec2-address> sudo apt install cloud-init;
      # Show upgrade without restart doesn't break
      ssh ubuntu@<ec2-address> sudo cloud-init init;
      # Show clean install doesn't break
      ssh ubuntu@<ec2-address> 'sudo rm -rf /var/log/cloud-init* /var/lib/cloud; sudo reboot';
      ssh ubuntu@<ec2-address> 'sudo cat /run/cloud-init/result.json';
      ssh ubuntu@<ec2-address> 'sudo grep Trace /var/log/cloud-init*';
      # Assert no intermittent tracebacks from dhcp_discovery and no leaked dhclient processes
      ssh ubuntu@<ec2-address> "sudo python3 -c 'from cloudinit.net.dhcp import maybe_perform_dhcp_discovery; maybe_perform_dhcp_discovery()'";
      ssh ubuntu@<ec2-address> 'sudo ps -afe | grep dhclient';
    done
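
The `grep Trace` checks in the loop above can also be done on a copy of the log in Python, which makes it easier to assert on the result. This is a rough sketch only; the function name `find_tracebacks` is hypothetical and the log text is assumed to have already been fetched from the instance.

```python
def find_tracebacks(log_text):
    """Return (line_number, line) pairs for log lines containing 'Trace'."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if "Trace" in line
    ]
```

An empty list corresponds to `grep Trace` printing nothing, which is the expected outcome for a fixed package.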

  [Regression Potential]
  A regression would still result in tracebacks in DataSourceEc2Local, which would cause cloud-init to fall back to DataSourceEc2 in the init-network stage.

  [Other Info]
  Upstream commit at
    https://git.launchpad.net/cloud-init/commit/?id=7acc9e68f
  === End SRU Template ===

  === Original Description ===

  Saw an issue once on an EC2 zesty image with 17.1.41 during SRU testing.

  Looks like we hit an inability to create the pid file (from syslog):

  #### syslog
  Nov 30 04:20:35 ip-10-0-20-176 cloud-init[440]: Cloud-init v. 17.1 running 'init-local' at Thu, 30 Nov 2017 04:20:32 +0000. Up 7.16 seconds.
  Nov 30 04:20:35 ip-10-0-20-176 cloud-init[440]: 2017-11-30 04:20:32,768 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceEc2.DataSourceEc2Local'> failed
  Nov 30 04:20:35 ip-10-0-20-176 dhclient[669]: Can't create /var/tmp/cloud-init/cloud-init-dhcp-hnatdvwi/dhclient.pid: No such file or directory

  #### end syslog

  We then get a traceback when trying to read the temporary pid file that
  our dhclient run was supposed to create during Ec2Local setup. Maybe we
  exited out of the dhcp run before we could read the pid file?

  ...
  2017-11-30 04:20:32,738 - util.py[DEBUG]: Running command ['ip', 'link', 'set', 'dev', 'eth0', 'up'] with allowed return codes [0] (shell=False, capture=True)
  2017-11-30 04:20:32,744 - util.py[DEBUG]: Running command ['/var/tmp/cloud-init/cloud-init-dhcp-hnatdvwi/dhclient', '-1', '-v', '-lf', '/var/tmp/cloud-init/cloud-init-dhcp-hnatdvwi/dhcp.leases', '-pf', '/var/tmp/cloud-init/cloud-init-dhcp-hnatdvwi/dhclient.pid', 'eth0', '-sf', '/bin/true'] with allowed return codes [0] (shell=False, capture=True)
  2017-11-30 04:20:32,768 - util.py[DEBUG]: Reading from /var/tmp/cloud-init/cloud-init-dhcp-hnatdvwi/dhclient.pid (quiet=False)
  2017-11-30 04:20:32,768 - handlers.py[DEBUG]: finish: init-local/search-Ec2Local: FAIL: no local data found from DataSourceEc2Local
  2017-11-30 04:20:32,768 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceEc2.DataSourceEc2Local'> failed
  2017-11-30 04:20:32,768 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceEc2.DataSourceEc2Local'> failed
  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 332, in find_source
      if s.get_data():
    File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceEc2.py", line 378, in get_data
      return super(DataSourceEc2Local, self).get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceEc2.py", line 100, in get_data
      self.fallback_interface)
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in maybe_perform_dhcp_discovery
      return dhcp_discovery(dhclient_path, nic, tdir)
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 124, in dhcp_discovery
      pid = int(util.load_file(pid_file).strip())
    File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1257, in load_file
      with open(fname, 'rb') as ifh:
  FileNotFoundError: [Errno 2] No such file or directory: '/var/tmp/cloud-init/cloud-init-dhcp-hnatdvwi/dhclient.pid'

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1735331/+subscriptions

