maas-devel team mailing list archive

Re: Node state and pserv

 

Hi jtv,

I am largely in favour of this.  I'd go even further and say that the only ops 
that need to talk to the pserv/cobbler axis are start and stop, because they 
can send every detail necessary in the one call and we make pserv just 
overwrite anything in cobbler.

I am also heavily in favour of making this completely asynchronous so that 
appserver threads are not blocked - we are already running into trouble with 
this old synchronous approach 
(https://bugs.launchpad.net/ubuntu/+source/maas/+bug/989355).

You talk about nodes that "pserv knows about" - this concerns me a bit because 
pserv needs to be stateless so that we can scale it easily.  What do you mean 
exactly by it knowing about nodes?  Right now we can't avoid that cobbler has 
state, but we can overwrite it every time we talk to it so that the MAAS data 
is always authoritative.
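
A minimal sketch of the overwrite-every-time idea, assuming the caller sends
the node's full definition with each start request. NodeDefinition,
FakeCobbler and start_node are hypothetical names for illustration only; they
are not the real pserv or Cobbler APIs.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class NodeDefinition:
    # Hypothetical fields, loosely following the properties named in the
    # thread: networking, profile, power method, netboot setting.
    system_id: str
    hostname: str
    mac_addresses: tuple
    profile: str
    power_type: str
    netboot: bool


class FakeCobbler:
    """Stand-in for Cobbler's stored state; not the real Cobbler API."""

    def __init__(self):
        self.systems = {}

    def overwrite_system(self, definition):
        # Replace the whole record rather than merging, so whatever
        # MAAS sent is authoritative.
        self.systems[definition.system_id] = asdict(definition)


def start_node(cobbler, definition):
    """Stateless handler: everything it needs arrives in the call."""
    cobbler.overwrite_system(definition)
    return {"system_id": definition.system_id, "action": "start"}
```

Because the handler keeps nothing between calls, any pserv instance could
serve any request.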

Bear in mind that we are going to remove cobbler very soon.  Anything we 
change here needs to be compatible with that and I do believe that your 
suggestions leave us better off in the pursuit of that goal.

Cheers.

On Tuesday 24 April 2012 13:49:41 Jeroen Vermeulen wrote:
> Hi all,
> 
> I was thinking the time may have come to streamline how maasserver and
> pserv communicate about nodes.
> 
> Right now we have a function hooked into maasserver's Node.save() that
> updates pserv (and thus, Cobbler) and blocks for that to complete.  We
> always wanted to try the simplest thing that could work, see how we'd
> end up driving Cobbler in practice, and then refactor based on that
> knowledge.  We wanted the resulting protocol to be simple, terse, as
> asynchronous as feasible, untainted by Cobbler details, and light in state.
> 
> Now, looking at the pserv API, I see that we only have two real
> operations on a node: start it, and stop it.  Everything else is state
> manipulation.
> 
> And so my question is: what state manipulations are useful in and of
> themselves, as opposed to as preparation for starting or stopping a
> node?  What does Cobbler really need to know about a node when it's not
> being asked to start or stop the thing?  My imagination stops at its
> DHCP and DNS properties (and as the UI stands, that means only changing
> its hostname).
> 
> If that's more or less correct, I'd like to reform the nodes part of the
> pserv API to the following operations.  It may seem wasteful in terms of
> how much goes over the wire per request, but it's also less chatty.
> Plus it would allow a future pserv implementation to be a borderline
> amnesiac, apart from DHCP and DNS state and what's needed to walk a node
> through procedures like commissioning.
> 
>   * Start nodes.
> 
> Asynchronous.  Takes the nodes' full definitions as arguments:
> networking properties, profile, power method, netboot setting.  Each
> node might be new to pserv, or already known; it'd be entirely up to
> pserv to figure that out.
> 
>   * Stop nodes.
> 
> Likewise asynchronous.  This would again take full node definitions.
> Some of the nodes being stopped may be new to pserv.  The netboot
> setting here applies only to the case where somebody turns on the node
> by hand; the next time MAAS starts the node it will specify the netboot
> behaviour it wants at that point.
> 
>   * Set host's DHCP/DNS properties.
> 
> This looks to me like a completely separate thing from all the others,
> although of course pserv may also have to run through this internally
> when starting or stopping a node.
> 
> It'd be up to pserv to figure out whether it's updating a node it knows
> about, or learning about a new one.  For example, if the host is new,
> pserv could pick an arbitrary profile — the next “start” call will
> provide the correct one when the time comes.
> 
> Since reconfiguring running services could be expensive, we may need to
> treat that in a similar way to node commissioning timeouts: leave some
> time for a burst of requests to pile up, then come back a little while
> later and service all pending changes at once.  Starting and stopping
> nodes are already asynchronous so no worries if they need to wait for a
> DHCP/DNS sync, but we need to keep that sync delay low enough so as not
> to give rise to artificial commissioning timeouts.
> 
>   * Delete host's DHCP/DNS properties.
> 
> Maybe.  Or maybe we consider this an offline garbage-collection process.
>   Meh.
> 
> 
> And that's the list.  Your average node change (status changes in
> particular) would not directly result in pserv interaction.  They might
> trigger the maasserver into making further requests, such as “good, now
> boot this node.”  But those events are inherently limited in frequency.
> 
> 
> …Comments?
> 
> 
> Jeroen
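
The batching of DHCP/DNS changes described in the quoted proposal could be
sketched roughly as below. This is only an illustration under stated
assumptions: BatchedHostUpdates, apply_batch and the 5-second default are all
hypothetical, and the clock is passed in explicitly so the behaviour is easy
to follow. The delay is measured from the first pending change, which keeps
the worst-case sync delay bounded.

```python
class BatchedHostUpdates:
    """Collect DHCP/DNS changes and apply them together after a delay."""

    def __init__(self, apply_batch, delay=5.0):
        self.apply_batch = apply_batch  # callable taking {hostname: properties}
        self.delay = delay              # seconds to let a burst pile up
        self.pending = {}
        self.first_queued_at = None

    def set_host(self, hostname, properties, now):
        # A later request for the same host replaces the earlier one.
        self.pending[hostname] = properties
        if self.first_queued_at is None:
            self.first_queued_at = now

    def poll(self, now):
        """Flush if the oldest pending change has waited long enough."""
        if self.first_queued_at is None:
            return False
        if now - self.first_queued_at < self.delay:
            return False
        batch, self.pending = self.pending, {}
        self.first_queued_at = None
        self.apply_batch(batch)
        return True
```

A caller would invoke poll() periodically; a burst of set_host() calls then
costs one reconfiguration of the running services instead of one per request.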

