
launchpad-dev team mailing list archive

Re: rollout changes to reduce downtime - linked to 'release features when they are ready'

 

On Tue, Jul 20, 2010 at 7:25 AM, Michael Hudson
<michael.hudson@xxxxxxxxxxxxx> wrote:
> On 20/07/10 16:39, Robert Collins wrote:
>>
>> Ok, so the answer may be 'we interrupt those jobs when we're ready'?
>
> Yes, that's probably reasonable for the import case.

If that's fine, then the story for the importds can be:
a) do all the rest of the upgrade
b) nuke em
c) deploy
d) start em

>>> An approach where you installed the new code at a new path and didn't
>>> delete
>>> the old code until all jobs running from that tree finished would work
>>> fine.
>>>  I don't know how you tell all jobs running from a particular tree are
>>> finished though.
>>
>> Can we change the code to make that clear somehow?
>
> I can't think of anything tasteful right now.  Do you have any ideas?

Put the working dir in the command line? Then ps can tell us the dir it
started from.
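
A rough sketch of that idea, assuming each job is launched with its tree path somewhere on its command line (as the script path or as an option value). The function names and the /srv/lp/rev-NNN layout used in the usage note are made up for illustration, not anything in the tree today:

```python
import subprocess


def parse_ps_output(ps_text):
    """Parse `ps -e -o pid= -o args=` output into (pid, command line) pairs."""
    processes = []
    for line in ps_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            processes.append((int(parts[0]), parts[1]))
    return processes


def jobs_running_from(tree, processes):
    """Return the (pid, args) pairs whose command line mentions `tree`."""
    root = tree.rstrip("/")
    prefix = root + "/"
    matches = []
    for pid, args in processes:
        for arg in args.split():
            # Handle both a bare path and an --option=/path form.
            candidate = arg.split("=", 1)[-1]
            # Match on a path boundary so that rev-100 does not also
            # match rev-1000.
            if candidate == root or candidate.startswith(prefix):
                matches.append((pid, args))
                break
    return matches


def list_processes():
    """Ask ps for every process's pid and full command line."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_output(result.stdout)
```

With this, the old tree could be deleted once jobs_running_from("/srv/lp/rev-100", list_processes()) comes back empty. It only works if every job really does carry the tree path in its arguments, and it will not cope with paths that contain spaces.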

> It occurs to me that the codehosting server has a slightly similar issue;
> you want to shut the old server down when its last connection closes.  This
> is probably a bit easier though (the load balancer might be able to tell you,
> or you can change the state of the ssh server through some control socket).

Yes, the ssh connection should be clear enough.
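
For what it's worth, the 'shut down when the last connection closes' logic could look something like the sketch below. This is not the codehosting server's actual code; ConnectionDrain and its method names are hypothetical, and how start_draining() gets triggered (control socket, signal, load balancer hook) is left open:

```python
import threading


class ConnectionDrain:
    """Track open connections and signal when an old server may exit."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self._draining = False
        # Set once draining has begun and no connections remain.
        self.finished = threading.Event()

    def connection_opened(self):
        """Register a new connection; refuse it once draining has begun."""
        with self._lock:
            if self._draining:
                return False
            self._active += 1
            return True

    def connection_closed(self):
        """Record that a previously accepted connection has gone away."""
        with self._lock:
            self._active -= 1
            self._check_finished()

    def start_draining(self):
        """Stop accepting new connections; existing ones run to completion."""
        with self._lock:
            self._draining = True
            self._check_finished()

    def _check_finished(self):
        # Caller must hold self._lock.
        if self._draining and self._active == 0:
            self.finished.set()
```

The old server process would call start_draining() when the new one is up, then wait on finished before exiting.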

>> My understanding from James Troup is that the slaves go boom when the
>> tcp socket closes - I've filed a bug about this though.
>
> I find this a bit tricky to believe in general.  The manager talks xml-rpc
> to the slaves, so there should be no persistent connection in general (even
> if we're using pipelining by some perverse miracle, it shouldn't matter if
> the socket closes).  I can believe that losing the manager at an arbitrary
> time would be bad, but exiting between scans should be fine.

Sure, as I say, it's hearsay. Oh, and it's a feature.

>> Thanks for the feedback, it's excellent to know a bit more about how
>> things are actively deployed. It sounds like there might be a code
>> change needed to make the importds easier to manage
>> transitions-of-code, perhaps you could file that?
>
> Let's have one more round of waffle first ;-)

/me waffles.

_Rob


