← Back to team overview

dulwich-users team mailing list archive

Re: [PATCH 0 of 2] small objects cleanups

 

On Nov 16, 2010, at 9:34 AM, Jelmer Vernooij wrote:

> Hi Augie,
> 
> On Mon, 2010-11-15 at 22:31 -0600, Augie Fackler wrote:
>> On Nov 15, 2010, at 8:53 AM, Jelmer Vernooij wrote:
>>> On Sun, 2010-11-14 at 19:19 -0600, Augie Fackler wrote:
>>>> First patch is trailing whitespace cleanup courtesy of my .emacs, I
>>>> don't feel strongly about it.
>>> I'll merge this one.
>>> 
>>>> Second patch removes an assertion which only gets hit on
>>>> non-c-extension builds. I noticed it when I was using a broken install
>>>> and ended up using the Python parsing code instead of the native code,
>>>> which worked fine on the repo in question. It's been lurking around so
>>>> long I've forgotten which repository this broke on, but I can dig it
>>>> up again if you want.
>>> This assertion is actually correct. It's come up a couple of times
>>> recently because GitHub sometimes writes invalid data. Apparently this
>>> makes jgit fall over too and cgit's fsck warns about it.
>> Huh? I've seen *real* repositories that I can't use with the
>> pure-python Dulwich, but that work fine when C speedups are enabled
>> because *this exact assertion* is absent in the C code? git fsck
>> warning is very different than actually blowing up.
> The inconsistency between the C extensions and the Python implementation
> is wrong and definitely something we should fix. As I said earlier I'm
> not necessarily opposed to removing the assertion (rather than making
> the C extensions raise an AssertionError as well), but we need some way
> for API users to find that an object is not formatted properly. 
> 
> The reason I'm worried about this sort of thing is that it breaks
> roundtripping of objects:
> 
> a = my_repo[broken_tree_sha1]
> b = Tree.from_string(str(Tree()))
> assert a.id == b.id          # This will fail

Per discussion here and in IRC, sent a version that requires .check() to detect such brokenness.

> 
> This is the relevant bug on github:
> 
> http://support.github.com/discussions/repos/4341-githubs-inline-edit-breaks-git-repositories
> 
>>> I am hesitant to remove this assertion without adding some code
>>> to .check() to point it out and a test to demonstrate we now handle it
>>> correctly.
>> Happy to add that, is that what you'd like?
> Yes, please. That would address my concerns.
> 
>>>> If you'd rather have a pull request on github, that's fine, happy to
>>>> resend there. Also, please let me know if this patchbomb gives you any
>>>> problems, as I sent it with Mercurial.
>>> This seems to work correctly at the moment. I also wouldn't mind if you
>>> sent Mercurial bundles though, I still have to implement those in bzr-hg
>>> and I could use some test cases. ;-)
>> Eh, mailing bundles makes patch review hard.
> Sorry, I hadn't realized hg bundles were binary.
> 
> Cheers,
> 
> Jelmer




References