launchpad-dev team mailing list archive
-
launchpad-dev team
-
Mailing list archive
-
Message #04314
Re: Proof-of-concept 'forking' lp-serve
On Mon, 16 Aug 2010 19:40:43 -0500, John Arbash Meinel <john@xxxxxxxxxxxxxxxxx> wrote:
> >> for numbers:
> >>
> >> 900ms 'time (echo hello | bzr serve --inet)'
> >> 2400ms 'time (echo hello | bzr lp-serve --inet)'
> >> ~80ms 'time (overly complex stuff to spawn and do the same work)'
> >>
> >> The last is a bit of an approximation, because my test front end hack
> >> uses select.poll() over file descriptors, and doesn't seem to be getting
> >> POLLHUP on any of the fifos.
> >
> > I don't entirely understand the consequence of this :-)
>
> So the big takeaway is 2.4s => 0.08s.
I understood that bit! :-)
> There is some concern that we won't easily detect when the smart
> server is finished going this route. As I at least wasn't seeing it
> disconnect from the fifo (echo foo > script, sends a POLLHUP event to
> select when the input is finished, I wasn't seeing that when lp-serve
> on the other side of a fifo finished.)
Ah, I see.
> >> 1) Plain socket.socket() service versus a 'twisted' implementation. I'm
> >> currently just creating a simple TCP/IP socket, and listening on it
> >> for a request to spawn a child (or quit), then forking, creating
> >> fifos on disk, reporting the path to the requester. The forked child
> >> then hangs on those handles, and remaps stdin/out/err and running the
> >> lp-serve code.
> >>
> >> a) Eventually I would hook up the existing
> >> lib/lp/codehosting/sshserver code to make the request to the
> >> daemon, and hook up a fake ProcessTransport(?) instead of the
> >> self._transport = self.reactor.spawnProcess()
> >
> > I'm not sure I entirely understand what you're planning here -- it seems
> > from your branch that you could just launch the connect_to_lpservice
> > process and wire up its real ProcessTransport process.
> >
> > This still leaves us paying Python startup overhead, but that's not so
> > bad if you don't import very much. Then later we can move to doing what
> > connect_to_lpservice does inside the ssh server, if we find that it
> > would be worth it.
> >
> > I don't understand why connect_to_lpservice is in bzrplugins -- it isn't
> > a plugin!
>
> Because it was a quick hack to sit next to lpserve.py I wasn't intending
> for it to stay live. I was intending to pull that code into session.py
Ah ok. Then you'll be wanting to rewrite it to be twisty, I guess.
Shouldn't be too hard.
> Also, as mentioned above, I'm not seeing the service stop inside of
> connect_to_lpservice, which means that the 'connect' script doesn't stop
> itself, which means the bzr+ssh may hang as well (I haven't tried
> anything at that level).
OK. Nothing helpful leaps to mind for me here, I'm afraid.
> >> b) Current service is single threaded, though I thought it would
> >> probably be enough because it only blocks long enough to fork().
> >
> > The idea of multiple threads forking somehow scares me rather...
> >
> >> c) If most of the launchpad internals are based around twisted
> >> services, it probably would be best for consistency to also have a
> >> twisted service. I personally know very very little about
> >> twisted.
> >
> > I think it would be required to rewrite connect_to_lpservice using
> > twisted apis should that move into the ssh server process. I don't
> > think it would make sense to rewrite the forking process using
> > twisted -- I don't think forking once the reactor is running (and not
> > promptly calling exec()) is really supported with Twisted.
> >
> >> 2) Robert mentioned in the past 'why not just have sshserver fork
> >> itself'. This would be possible, though I'm already seeing some
> >> issues wrt isolation. (logging and bzrlib.ui.ui_factory have some
> >> state wrt sys.std* that I have to manually poke at.)
> >
> > See above.
> >
> >> 3) Prefork. I think the existing code would lend itself pretty easily to
> >> preforking. The forking code would probably be reworked a \
> >> little, and timing would need to be sorted out. (At the moment the
> >> spawned child doesn't return the path to the caller until the files
> >> have been created, but w/ prefork we would need a way to inform the
> >> daemon that the files exist now, though the daemon itself could
> >> create them and inform the child what path to use.)
> >
> > Doesn't seem worth it for this first cut?
>
> On my host with no load, the time to fork seems to be in the tens of
> milliseconds (~30ms). It doesn't seem necessary yet to prefork. Though
> it is something to keep in mind.
OK.
> If we *really* wanted to keep exec() level isolation, then prefork would
> be necessary. My biggest concern is that it doesn't really reduce the
> amount of startup *work* being done, so if the service is busy, you'll
> still be slow.
Right.
> >> 4) Launchpad code review, testing, landing, rollout, etc. I really don't
> >> have much knowledge beyond "submit a merge proposal". I've read
> >> dev.launchpad.net/Getting /Help and /Build. And certainly have things
> >> building locally. /Help basically says to ask here :)
> >
> > Well, um. I can see some trivial style problems in the code, but mostly
> > it looks fine. Tests are certainly needed, as is some actually hooking
> > into the ssh server.
> >
> >> I guess I'm basically asking for some mentoring, and not really sure
> >> who to talk to about it. (stuff like ec2-land and running the lp test
> >> suite still scare me a bit)
> >
> > Hope this helped a bit.
> >
> > Cheers,
> > mwh
> >
>
> It was nice to get some feedback from you.
I'm glad to hear it.
Cheers,
mwh
Follow ups
References