← Back to team overview

dulwich team mailing list archive

Re: Patches: decode and verify packs as they stream

 

Pushed the changes to my github. (I may have also rebased on some newer
upstream changes, but nothing that didn't apply cleanly.)

On Fri, Apr 16, 2010 at 11:44, Jelmer Vernooij <jelmer@xxxxxxxxx> wrote:

> Hi Dave,
>
> On Thu, 2010-04-15 at 10:27 -0700, David Borowitz wrote:
> > I just pushed 6 new patches to my github repo:
> > http://github.com/dborowitz/dulwich
> >
> >
> > 4471b29 Prevent server stack trace when the client request is a no-op.
> > c7a2465 Strip excess whitespace from capabilities lines.
> > 04ad718 Add ReceivableProtocol that supports recv() as well as read().
> > 3c97dff Fix read_zlib_chunks to avoid appending junk data.
> > 65f2c56 Make the server decode a pack as it streams.
> > be1dfac Clean up pack.py.
> Thanks!
>
> Some minor comments, all in protocol.py for some reason:
> [...]
> >         self._recv = recv
> >         self._rbuf = StringIO()
> >         self._rbufsize = 8192
> Can you please put this in a constant somewhere?
>

Done.


> >        #    consumed.
> >        # Copyright (c) 2001-2010 Python Software Foundation; All Rights
> Reserved
>
> Please mention the license (PSF?).
>

Done.


> >         # rbufsize is large compared to the typical return value of
> recv().
> >         buf = self._rbuf
> >         start = buf.tell()
> >         buf.seek(0, 2)  # seek end
> Please use os.SEEK_END rather than just 2, it's a bit clearer to read.
>

Done. (This was not my code, but I don't mind making it clearer :)


> I wonder if PackStreamVerifier should have a base class that is shared
> by PackData.iterobjects(), which seems to provide a lot of the same
> functionality. But I'm happy to leave that refactoring for later.
>

I agree that it should be looked into. This class was written separately and
is intentionally lightweight for basically two reasons:
1. Shawn told me that cgit and jgit both have duplicated pack processing
code, one for iterating/unpacking objects, and one for verifying during
streaming. The root issue IIUC is performance, but I'll admit I haven't run
the numbers myself.
2. The refactoring didn't work cleanly the first time I tried it. That
doesn't mean it necessarily can't, though.


> Cheers,
>
> Jelmer
>

Follow ups

References