← Back to team overview

launchpad-dev team mailing list archive

Re: Proof-of-concept 'forking' lp-serve

 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


...
>>
>> My basic proof of concept seems to be working, but I'm a bit stalled as
>> to how to get from PoC to having it landed and running on
>> Codebrowse. As
> 
> FWIW, in Launchpad jargon, codebrowse == codebounce == the loggerhead
> side.  We generally refer to both the ssh service and the more general
> area as 'codehosting'.

Indeed, I did realize this, and made the mistake here. I certainly knew
the code was under lib/lp/codehosting...

> 
>> for numbers:
>>
>>  900ms 'time (echo hello | bzr serve --inet)'
>> 2400ms 'time (echo hello | bzr lp-serve --inet)'
>>  ~80ms 'time (overly complex stuff to spawn and do the same work)'
>>
>> The last is a bit of an approximation, because my test front end hack
>> uses select.poll() over file descriptors, and doesn't seem to be getting
>> POLLHUP on any of the fifos.
> 
> I don't entirely understand the consequence of this :-)

So the big takeaway is 2.4s => 0.08s. There is some concern that we
won't easily detect when the smart server is finished going this route.
As I at least wasn't seeing it disconnect from the fifo (echo foo >
script, sends a POLLHUP event to select when the input is finished, I
wasn't seeing that when lp-serve on the other side of a fifo finished.)

...

>> 1) Plain socket.socket() service versus a 'twisted' implementation. I'm
>>    currently just creating a simple TCP/IP socket, and listening on it
>>    for a request to spawn a child (or quit), then forking, creating
>>    fifos on disk, reporting the path to the requester. The forked child
>>    then hangs on those handles, and remaps stdin/out/err and running the
>>    lp-serve code.
>>
>>    a) Eventually I would hook up the existing
>>       lib/lp/codehosting/sshserver code to make the request to the
>>       daemon, and hook up a fake ProcessTransport(?) instead of the
>>       self._transport = self.reactor.spawnProcess()
> 
> I'm not sure I entirely understand what you're planning here -- it seems
> from your branch that you could just launch the connect_to_lpservice
> process and wire up its real ProcessTransport process.
> 
> This still leaves us paying Python startup overhead, but that's not so
> bad if you don't import very much.  Then later we can move to doing what
> connect_to_lpservice does inside the ssh server, if we find that it
> would be worth it.
> 
> I don't understand why connect_to_lpservice is in bzrplugins -- it isn't
> a plugin!

Because it was a quick hack to sit next to lpserve.py I wasn't intending
for it to stay live. I was intending to pull that code into session.py

Also, as mentioned above, I'm not seeing the service stop inside of
connect_to_lpservice, which means that the 'connect' script doesn't stop
itself, which means the bzr+ssh may hang as well (I haven't tried
anything at that level).

> 
>>    b) Current service is single threaded, though I thought it would
>>       probably be enough because it only blocks long enough to fork().
> 
> The idea of multiple threads forking somehow scares me rather...
> 
>>    c) If most of the launchpad internals are based around twisted
>>       services, it probably would be best for consistency to also have a
>>       twisted service.  I personally know very very little about
>>       twisted.
> 
> I think it would be required to rewrite connect_to_lpservice using
> twisted apis should that move into the ssh server process.  I don't
> think it would make sense to rewrite the forking process using
> twisted -- I don't think forking once the reactor is running (and not
> promptly calling exec()) is really supported with Twisted.
> 
>> 2) Robert mentioned in the past 'why not just have sshserver fork
>>    itself'. This would be possible, though I'm already seeing some
>>    issues wrt isolation. (logging and bzrlib.ui.ui_factory have some
>>    state wrt sys.std* that I have to manually poke at.)
> 
> See above.
> 
>> 3) Prefork. I think the existing code would lend itself pretty easily to
>>    preforking. The forking code would probably be reworked a \
>>    little, and timing would need to be sorted out. (At the moment the
>>    spawned child doesn't return the path to the caller until the files
>>    have been created, but w/ prefork we would need a way to inform the
>>    daemon that the files exist now, though the daemon itself could
>>    create them and inform the child what path to use.)
> 
> Doesn't seem worth it for this first cut?

On my host with no load, the time to fork seems to be in the tens of
milliseconds (~30ms). It doesn't seem necessary yet to prefork. Though
it is something to keep in mind.

If we *really* wanted to keep exec() level isolation, then prefork would
be necessary. My biggest concern is that it doesn't really reduce the
amount of startup *work* being done, so if the service is busy, you'll
still be slow.

> 
>> 4) Launchpad code review, testing, landing, rollout, etc. I really don't
>>    have much knowledge beyond "submit a merge proposal". I've read
>>    dev.launchpad.net/Getting /Help and /Build. And certainly have things
>>    building locally. /Help basically says to ask here :)
> 
> Well, um.  I can see some trivial style problems in the code, but mostly
> it looks fine.  Tests are certainly needed, as is some actually hooking
> into the ssh server.
> 
>>    I guess I'm basically asking for some mentoring, and not really sure
>>    who to talk to about it. (stuff like ec2-land and running the lp test
>>    suite still scare me a bit)
> 
> Hope this helped a bit.
> 
> Cheers,
> mwh
> 

It was nice to get some feedback from you.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkxp2osACgkQJdeBCYSNAAPOgwCgm8rlf7jJDPmK8o6XeUoua5hK
OCIAoJiTbLpD3WEy1S1ZcW8JpvL44PBK
=CJ/D
-----END PGP SIGNATURE-----



Follow ups

References