← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1911067] [NEW] Several containers stuck in Pending with cloud-init failing to start

 

Public bug reported:

Run here:
https://solutions.qa.canonical.com/testruns/testRun/4a343cb3-2b6e-44b7-8aa0-a7bf569514f0

Logs/artifacts here:
https://oil-jenkins.canonical.com/artifacts/4a343cb3-2b6e-44b7-8aa0-a7bf569514f0/index.html

---

- We (solutions-qa) hit this in a handful of runs over the weekend. A
few of the containers get stuck in "Pending".

- It doesn't appear to be a Juju application issue as there is no single
consistent application being deployed to the containers that share the
"Pending".

- In the crashdump at I'm seeing the msg:

/var/log/lxd/$JUJU_INSTANCE_NAME/console.log for the baremetal logs of the machine that has "Pending" containers
[FAILED] Failed to start Initial cloud-init job (metadata service crawler)

- We're using a Level 2 CIS Hardened image. It could make sense that
something was cutting off its ability to go and make a network call near
the beginning of its run. But if that was the case, it seems like all of
the containers would fail to come up.

---

I'm going to work on reproducing this manually and will update this bug
with any new info I find.

** Affects: cloud-init
     Importance: Undecided
         Status: New


** Tags: cdo-qa foundations-engine

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1911067

Title:
  Several containers stuck in Pending with cloud-init failing to start

Status in cloud-init:
  New

Bug description:
  Run here:
  https://solutions.qa.canonical.com/testruns/testRun/4a343cb3-2b6e-44b7-8aa0-a7bf569514f0

  Logs/artifacts here:
  https://oil-jenkins.canonical.com/artifacts/4a343cb3-2b6e-44b7-8aa0-a7bf569514f0/index.html

  ---

  - We (solutions-qa) hit this in a handful of runs over the weekend. A
  few of the containers get stuck in "Pending".

  - It doesn't appear to be a Juju application issue as there is no
  single consistent application being deployed to the containers that
  share the "Pending".

  - In the crashdump at I'm seeing the msg:

  /var/log/lxd/$JUJU_INSTANCE_NAME/console.log for the baremetal logs of the machine that has "Pending" containers
  [FAILED] Failed to start Initial cloud-init job (metadata service crawler)

  - We're using a Level 2 CIS Hardened image. It could make sense that
  something was cutting off its ability to go and make a network call
  near the beginning of its run. But if that was the case, it seems like
  all of the containers would fail to come up.

  ---

  I'm going to work on reproducing this manually and will update this
  bug with any new info I find.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1911067/+subscriptions


Follow ups