yahoo-eng-team team mailing list archive
Message #13351
[Bug 1303925] Re: commissioning fails silently if a node can't reach the region controller
cloud-init is executing code that maas told it to execute,
so maas needs to tell it to execute code that has some "last ditch catch".
To be clear, cloud-init got data from maas (via kernel cmdline) that
told it to get some code from the metadata server to execute. It
then executed it. That code failed. *That* is the code that needs to
be more resilient. cloud-init is, by design, very much doing exactly
what maas tells it to do.
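
For illustration only, a "last ditch catch" in the code that maas hands out could look roughly like the sketch below. This is not the actual MAAS commissioning code; the names run_with_last_ditch_catch, task and report are hypothetical, and a real script would have to pick its own way of surfacing the failure (console, log, or a signal back to maas).

```python
import traceback


def run_with_last_ditch_catch(task, report=print):
    """Run task(); if it raises, report the failure instead of dying silently.

    Hypothetical helper: `task` stands in for the commissioning work and
    `report` for whatever channel makes the error visible to an operator.
    """
    try:
        task()
    except Exception as exc:  # deliberately broad: this is the last line of defence
        report("commissioning script failed: %r" % (exc,))
        report(traceback.format_exc())
        return 1
    return 0
```

The point is only that the outermost layer of the fetched code never exits without saying why; the non-zero return value can be used as the process exit status.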
** No longer affects: cloud-init
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1303925
Title:
commissioning fails silently if a node can't reach the region
controller
Status in MAAS:
Triaged
Bug description:
We recently had a node which completely refused to commission in MAAS.
After (literally) several man days of debugging, we figured out that
it was because the node couldn't talk to the region controller over
HTTP.
Obviously, that's ultimately our mistake/problem, but MAAS could have
been a lot better at helping us to help ourselves; currently, there's
absolutely no indication from the boot process that the HTTP
connection to the region controller is the problem.
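
As a rough sketch of the kind of indication that would have helped here, a script running on the node could probe the region controller over HTTP and print an explicit message on failure. This is an assumption about what a check might look like, not something MAAS does; check_region_http and the opener parameter are hypothetical names, and the URL passed in is whatever the node was told to contact.

```python
import urllib.error
import urllib.request


def check_region_http(url, timeout=10, opener=urllib.request.urlopen):
    """Return (ok, message) saying whether `url` answered over HTTP.

    Hypothetical helper; `opener` is injectable so the check can be
    exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return True, "reached %s (HTTP %s)" % (url, response.status)
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if the status is an error.
        return True, "reached %s (HTTP %s)" % (url, exc.code)
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, timeout, no route, DNS failure and similar.
        return False, "cannot reach %s over HTTP: %s" % (url, exc)
```

Printing the returned message to the console during boot would put the HTTP failure directly into the serial console output.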
Attached is the serial console output (from the point of boot) for the
node that was failing to commission. 91.189.94.35 is the MAAS region
controller and 91.189.88.20 is the MAAS cluster controller.
To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1303925/+subscriptions