dulwich-users team mailing list archive

Re: My Dulwich roadmap

 

Hi Dave!

On Tue, 2010-10-05 at 13:10 -0700, David Borowitz wrote:
> After a few months of not too much development, I've come up with some
> new Dulwich functionality I want to get started on.

> First, I think we need a good abstraction for walking between sets of
> commits, providing the functionality of 'git log'. In addition to
> providing 'dulwich log', other git implementations use this
> abstraction internally for doing all the commit walking for other
> algorithms (e.g. on the server side). If you look at the git-log(1)
> manpage, there is a *lot* of functionality in there, so I would only
> expect a subset of it to get implemented in the near future.
Cool! As we were discussing with Jeremy today, Repo.revision_history()
is suboptimal, and it'd be great if it could be replaced with something
that scales better and is more flexible.
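
To illustrate the kind of walker being discussed, here is a minimal
sketch of a 'git log'-style walk: newest-first by commit time, with an
optional set of commits to exclude. It does not use Dulwich's API; the
Commit class, the dict-based store and the walk() function are all
hypothetical names made up for this example, and hiding the excluded
history eagerly is a simplification that would not scale to large
repositories.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy stand-in for a commit object (hypothetical, not Dulwich's class)."""
    sha: str
    commit_time: int
    parents: tuple = ()


def walk(store, include, exclude=()):
    """Yield commits reachable from `include` but not from `exclude`,
    newest commit time first."""
    # Mark everything reachable from the excluded heads as hidden.
    hidden = set()
    todo = list(exclude)
    while todo:
        sha = todo.pop()
        if sha in hidden:
            continue
        hidden.add(sha)
        todo.extend(store[sha].parents)

    seen = set()
    heap = []

    def push(sha):
        # Queue each visible commit at most once, keyed by negated time
        # so the newest commit is popped first.
        if sha not in seen and sha not in hidden:
            seen.add(sha)
            heapq.heappush(heap, (-store[sha].commit_time, sha))

    for sha in include:
        push(sha)
    while heap:
        _, sha = heapq.heappop(heap)
        commit = store[sha]
        yield commit
        for parent in commit.parents:
            push(parent)


# A small diamond-shaped history: d merges b and c, which both descend from a.
EXAMPLE_STORE = {
    "a": Commit("a", 1),
    "b": Commit("b", 2, ("a",)),
    "c": Commit("c", 3, ("a",)),
    "d": Commit("d", 4, ("b", "c")),
}
```

Walking from "d" visits the whole history, while also passing
exclude=["b"] leaves only the commits that are not part of b's history,
roughly what 'git log b..d' would show.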

> Part of log walking is rename detection (though I'll probably end up
> implementing it first). Jelmer, I don't know if you've implemented
> some of this already for bzr-git, but if you have, I'd be happy to
> take a look at that.
I haven't looked at this yet. Rename detection is a hard problem for
bzr-git, as the mapping between bzr and git revisions is fixed and bzr
stores renames. Changing the rename detection algorithm would involve
breaking the existing mapping (which describes how to map between bzr
and git semantics). It is this relation:

(git_commit_sha, bzr_git_mapping) <-> bzr_revid

It is possible to make changes in the bzr_git_mapping, but it means
changing the revision ids of the bzr revisions that are created when
importing a revision from git, so I can't do it too often or I will be
lynched by my users. The bzr_git_mapping hasn't actually changed since
I started doing bzr-git releases.
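
A toy sketch of why a mapping change is so disruptive: if the revision
id is derived from both the git commit sha and the mapping, then the
same git commit imported under a different mapping gets a different bzr
revision id. The make_revid() and parse_revid() helpers and the
"mapping-v1"/"mapping-v2" names below are hypothetical and only
illustrate the relation above; they are not bzr-git's actual format.

```python
def make_revid(mapping_name, git_sha):
    """Derive a revision id from (git_commit_sha, mapping). Hypothetical format."""
    return f"{mapping_name}:{git_sha}"


def parse_revid(revid):
    """Recover (mapping_name, git_sha) from a revision id built by make_revid()."""
    mapping_name, git_sha = revid.split(":", 1)
    return mapping_name, git_sha


sha = "0123456789abcdef0123456789abcdef01234567"  # made-up example sha
old_revid = make_revid("mapping-v1", sha)
new_revid = make_revid("mapping-v2", sha)
# Same git commit, different mapping: the two revision ids differ, so
# existing imports would no longer line up with new ones.
```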

Since I don't expect to get the rename detection right the first time,
I've basically ignored this topic for the time being. I am hoping Bazaar
can support "rename inference" as some sort of special mode in the
future, and we could use this when importing revisions from Git.

> The other big thing I'd like to do is get delta generation for
> packfiles working. I've seen the commented-out code in pack.py, so
> hopefully that just needs some cleanup and testing. Of course, C git's
> packing heuristics are pretty sophisticated, so there's likely lots of
> room for improvement on top of that.

> Now, Shawn Pearce has told me from his experience with JGit that
> implementing all the features of log walking and delta generation can
> easily take many person-months to do correctly and efficiently. As you
> may expect, I'm looking to get some subset of the functionality
> working in a non-optimized way in a shorter timeframe. So, as usual,
> expect feature improvements to be small and incremental at first.
Yay for small, easily reviewable changesets. :-) I'm looking forward to
better pack generation.
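
For readers unfamiliar with the idea, a delta stores an object as a
list of "copy this range from a base object" and "insert these literal
bytes" instructions. The sketch below shows that principle in a
non-optimized way using difflib from the standard library. It is not
the code in pack.py and does not produce git's binary delta encoding;
make_delta() and apply_delta() are hypothetical names chosen for this
example.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes):
    """Return a list of ("copy", offset, length) and ("insert", data) ops
    that rebuild `target` from `base`."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # The range is already present in the base: reference it.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # New bytes that only exist in the target: store them literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no op: the bytes are simply never copied.
    return ops


def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild the target object from `base` and the ops from make_delta()."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


base = b"hello world, this is the first version\n"
target = b"hello world, this is the second version\n"
delta = make_delta(base, target)
```

Applying the delta to the base gives back the target. A real packer
also has to decide which base object to delta against, which is where
most of the heuristics come in.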

Cheers,

Jelmer
