← Back to team overview

dulwich-users team mailing list archive

Re: [PATCH 0 of 2] small objects cleanups

 

Hi Augie,

On Mon, 2010-11-15 at 22:31 -0600, Augie Fackler wrote:
> On Nov 15, 2010, at 8:53 AM, Jelmer Vernooij wrote:
> > On Sun, 2010-11-14 at 19:19 -0600, Augie Fackler wrote:
> >> First patch is trailing whitespace cleanup courtesy of my .emacs, I
> >> don't feel strongly about it.
> > I'll merge this one.
> > 
> >> Second patch removes an assertion which only gets hit on
> >> non-c-extension builds. I noticed it when I was using a broken install
> >> and ended up using the Python parsing code instead of the native code,
> >> which worked fine on the repo in question. It's been lurking around so
> >> long I've forgotten which repository this broke on, but I can dig it
> >> up again if you want.
> > This assertion is actually correct. It's come up a couple of times
> > recently because GitHub sometimes writes invalid data. Apparently this
> > makes jgit fall over too and cgit's fsck warns about it.
> Huh? I've seen *real* repositories that I can't use with the
> pure-python Dulwich, but that work fine when C speedups are enabled
> because *this exact assertion* is absent in the C code? git fsck
> warning is very different than actually blowing up.
The inconsistency between the C extensions and the Python implementation
is wrong and definitely something we should fix. As I said earlier I'm
not necessarily opposed to removing the assertion (rather than making
the C extensions raise an AssertionError as well), but we need some way
for API users to find that an object is not formatted properly. 

The reason I'm worried about this sort of thing is that it breaks
roundtripping of objects:

a = my_repo[broken_tree_sha1]
b = Tree.from_string(str(Tree()))
assert a.id == b.id          # This will fail

This is the relevant bug on github:

http://support.github.com/discussions/repos/4341-githubs-inline-edit-breaks-git-repositories

> > I am hesitant to remove this assertion without adding some code
> > to .check() to point it out and a test to demonstrate we now handle it
> > correctly.
> Happy to add that, is that what you'd like?
Yes, please. That would address my concerns.

> >> If you'd rather have a pull request on github, that's fine, happy to
> >> resend there. Also, please let me know if this patchbomb gives you any
> >> problems, as I sent it with Mercurial.
> > This seems to work correctly at the moment. I also wouldn't mind if you
> > sent Mercurial bundles though, I still have to implement those in bzr-hg
> > and I could use some test cases. ;-)
> Eh, mailing bundles makes patch review hard.
Sorry, I hadn't realized hg bundles were binary.

Cheers,

Jelmer

Attachment: signature.asc
Description: This is a digitally signed message part


Follow ups

References