maas-devel team mailing list archive
Message #01825
Re: 1.6.0 beta 4 out
On Friday 11 Jul 2014 19:27:09 Andreas Hasenack wrote:
> > I upgraded another MAAS 1.5.2 to this beta4 and here are my observations:
Thank you for helping with the testing Andreas.
> > - apparently one needs to import boot images again. Machines were just not
> > booting correctly, getting iscsi errors, images not found, etc. I clicked
> > on the import images button, waited for a while (had to use a network
> > monitor to know something was going on) and eventually had the magic
> > number
> > for boot images available (129). Then booting worked again on my nodes.
You should not have to re-import; this is a bug [1], and a known one that I
thought had already been fixed.
[1] https://bugs.launchpad.net/maas/+bug/1338690
Raphael, you QAed this; did you get as far as booting nodes during the QA?
> > - some nodes that were already in the ready state before the upgrade were
> > still booting with the old IP after the upgrade, instead of the new static
> > range. Do we need to clear out the leases file or something? Or trigger a
> > DNS rewrite by changing the hostname temporarily? No nodes were in use
> > when
> > I did the upgrade: all were just "ready".
> >
> > I'm still debugging this last issue in a node. It boots on a 10.96.1.x
> > address (old) but DNS already points to the new 10.96.5.x one (from the
> > static range). I recommissioned it, and now it doesn't even have a DNS
> > entry anymore.
Commissioning nodes don't get DNS entries any more.
> I gave up and nuked the leases file. That seems to have solved the
> problems: all machines now get an IP from the static range. I was having
> issues before in 1.5.2 with CNAMEs disappearing, and other symptoms that
> were pointing at that bug about DHCP IP conflicts; that's what prompted the
> upgrade to MAAS 1.6. Maybe upgrading while in that state is what caused
> this, or rather, didn't fix it.
This is caused by the DHCP server still having an old active lease for the
node (it would be useful if you could confirm this). It can happen if a node
is just powered off rather than shut down cleanly; a clean shutdown releases
the lease.
It's a bit of a tricky problem to solve because, although re-writing the leases
is possible by sending a signal to the DHCP server, that won't necessarily
help. As you noticed, the only immediate way around it is to clear out the
leases file. Otherwise, I think it would just start working once the lease
times out.
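For anyone wanting to check for a stale lease like this, something along the
following lines could list the leases that are still active. It is only a
minimal sketch, assuming the leases file uses the usual ISC dhcpd format
(`lease <ip> { ... }` blocks with `binding state`, `ends` and
`hardware ethernet` statements, times in UTC). The function name and the
sample data are made up for illustration, and the location of the leases
file on a cluster is not covered here.

```python
import re
from datetime import datetime

# One "lease <ip> { ... }" block; the closing brace sits at the start of a line.
LEASE_RE = re.compile(r"^lease\s+(\S+)\s*\{(.*?)^\}", re.DOTALL | re.MULTILINE)
STATE_RE = re.compile(r"^\s*binding state (\w+);", re.MULTILINE)
ENDS_RE = re.compile(
    r"^\s*ends \d+ (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2});", re.MULTILINE)
MAC_RE = re.compile(r"hardware ethernet ([0-9a-fA-F:]+);")


def active_leases(leases_text, now):
    """Return sorted (ip, mac) pairs whose lease is active and unexpired.

    The leases file is treated as an append-only log: a later block for
    the same IP supersedes an earlier one, so only the last block per IP
    is considered. Blocks without a dated "ends" line (for example
    "ends never;") are skipped in this sketch.
    """
    latest = {}
    for ip, body in LEASE_RE.findall(leases_text):
        latest[ip] = body

    found = []
    for ip, body in latest.items():
        state = STATE_RE.search(body)
        ends = ENDS_RE.search(body)
        mac = MAC_RE.search(body)
        if not (state and ends and mac):
            continue
        expiry = datetime.strptime(ends.group(1), "%Y/%m/%d %H:%M:%S")
        if state.group(1) == "active" and expiry > now:
            found.append((ip, mac.group(1).lower()))
    return sorted(found)


# Hypothetical sample data, not taken from a real cluster.
SAMPLE = """
lease 10.96.1.23 {
  starts 5 2014/07/11 18:00:00;
  ends 6 2014/07/12 06:00:00;
  binding state active;
  next binding state free;
  hardware ethernet 00:16:3e:aa:bb:cc;
}
lease 10.96.1.24 {
  starts 5 2014/07/11 18:00:00;
  ends 5 2014/07/11 19:00:00;
  binding state free;
  hardware ethernet 00:16:3e:aa:bb:dd;
}
"""

stale = active_leases(SAMPLE, datetime(2014, 7, 11, 20, 0, 0))
```

If a READY node's MAC shows up here with an address from the old range,
that would support the stale-lease explanation.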
We probably need to add a migration in the cluster to purge active leases for
READY nodes, unless anyone has a better idea?
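As a rough illustration of what such a purge might do to the leases data,
here is a minimal sketch that drops every lease block belonging to a given
set of MAC addresses. It assumes the ISC dhcpd leases format, with each
block closed by a `}` at the start of a line. `purge_leases` and its
arguments are hypothetical names, not existing MAAS code; where the list of
READY MACs comes from, and how the DHCP server is stopped and restarted
around the rewrite, are deliberately left out.

```python
import re

# A whole "lease <ip> { ... }" block, including its trailing newline.
LEASE_BLOCK_RE = re.compile(
    r"^lease\s+\S+\s*\{.*?^\}\n?", re.DOTALL | re.MULTILINE)
MAC_RE = re.compile(r"hardware ethernet ([0-9a-fA-F:]+);")


def purge_leases(leases_text, ready_macs):
    """Return leases_text without the lease blocks for ready_macs.

    MAC comparison is case-insensitive. Anything that is not a lease
    block (comments, server-duid lines, and so on) is left untouched.
    """
    ready = {mac.lower() for mac in ready_macs}

    def keep_or_drop(match):
        block = match.group(0)
        mac = MAC_RE.search(block)
        if mac and mac.group(1).lower() in ready:
            return ""  # drop the block for a READY node
        return block   # keep everything else as it was

    return LEASE_BLOCK_RE.sub(keep_or_drop, leases_text)


# Hypothetical example input.
before = (
    "lease 10.96.1.23 {\n"
    "  binding state active;\n"
    "  hardware ethernet 00:16:3e:aa:bb:cc;\n"
    "}\n"
    "lease 10.96.1.24 {\n"
    "  binding state active;\n"
    "  hardware ethernet 00:16:3e:aa:bb:dd;\n"
    "}\n"
)
after = purge_leases(before, ["00:16:3E:AA:BB:CC"])
```

Filtering by MAC rather than by IP keeps the sketch independent of which
range the stale address came from.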
J