dulwich team mailing list archive
-
dulwich team
-
Mailing list archive
-
Message #00017
Re: Patches: decode and verify packs as they stream
Thanks, merged!
I'll have a look at providing a pack reader that generates an index
file.
Cheers,
Jelmer
On Fri, 2010-04-16 at 12:05 -0700, David Borowitz wrote:
> Pushed the changes to my github. (I may have also rebased on some
> newer upstream changes, but nothing that didn't apply cleanly.)
>
>
> On Fri, Apr 16, 2010 at 11:44, Jelmer Vernooij <jelmer@xxxxxxxxx>
> wrote:
> Hi Dave,
>
> On Thu, 2010-04-15 at 10:27 -0700, David Borowitz wrote:
> > I just pushed 6 new patches to my github repo:
> > http://github.com/dborowitz/dulwich
> >
> >
> > 4471b29 Prevent server stack trace when the client request
> is a no-op.
> > c7a2465 Strip excess whitespace from capabilities lines.
> > 04ad718 Add ReceivableProtocol that supports recv() as well
> as read().
> > 3c97dff Fix read_zlib_chunks to avoid appending junk data.
> > 65f2c56 Make the server decode a pack as it streams.
> > be1dfac Clean up pack.py.
>
> Thanks!
>
> Some minor comments, all in protocol.py for some reason:
> [...]
> > self._recv = recv
> > self._rbuf = StringIO()
> > self._rbufsize = 8192
> Can you please put this in a constant somewhere?
>
> Done.
>
> > # consumed.
> > # Copyright (c) 2001-2010 Python Software Foundation;
> All Rights Reserved
>
> Please mention the license (PSF?).
>
>
> Done.
>
> > # rbufsize is large compared to the typical return
> value of recv().
> > buf = self._rbuf
> > start = buf.tell()
> > buf.seek(0, 2) # seek end
> Please use os.SEEK_END rather than just 2, it's a bit clearer
> to read.
>
>
> Done. (This was not my code, but I don't mind making it clearer :)
>
> I wonder if PackStreamVerifier should have a base class that
> is shared
> by PackData.iterobjects(), which seems to provide a lot of the
> same
> functionality. But I'm happy to leave that refactoring for
> later.
>
>
> I agree that it should be looked into. This class was written
> separately and is intentionally lightweight for basically two reasons:
> 1. Shawn told me that cgit and jgit both have duplicated pack
> processing code, one for iterating/unpacking objects, and one for
> verifying during streaming. The root issue IIUC is performance, but
> I'll admit I haven't run the numbers myself.
> 2. The refactoring didn't work cleanly the first time I tried it. That
> doesn't mean it necessarily can't, though.
>
> Cheers,
>
> Jelmer
>
Attachment:
signature.asc
Description: This is a digitally signed message part
References