← Back to team overview

dulwich-users team mailing list archive

Re: Patches: sorted_tree_items, header cleanup, MemoryRepo, logging

 

I'm a bit torn on this issue. On the one hand, I agree that there are some
things like excludes that we can provide well-defined APIs for rather than
depending on specific filenames. On the other hand, there are two cases that
I think merit keeping get_named_file around:
1. Allow users to read and parse files on their own that Dulwich does not
(yet) have support for. (Presumably we would encourage such users to
contribute their parsing code back to Dulwich and/or transition to the
official API once one exists.)
2. It's necessary for the implementation of dumb HTTP serving.

I'm still slightly for using '/' to be consistent with refnames and URL
representations in the HTTP server. Note that both of these points are
orthogonal to the choice of path separator.

On Wed, Jun 9, 2010 at 17:46, Jelmer Vernooij <jelmer@xxxxxxxxx> wrote:

> On Wed, Jun 09, 2010 at 02:48:18PM -0500, Augie Fackler wrote:
> > On Mon, Jun 7, 2010 at 5:32 PM, Jelmer Vernooij <jelmer@xxxxxxxxx>
> wrote:
> > > On Mon, 2010-06-07 at 15:59 -0500, Augie Fackler wrote:
> > >> On May 30, 2010, at 9:18 AM, Jelmer Vernooij wrote:
> > >> > On Mon, May 24, 2010 at 11:08:56AM -0700, David Borowitz wrote:
> > >> > > This is quite the grab-bag of changes :). Available as usual at my
> > >> > > github
> > >> > > fork:
> > >> > > http://github.com/dborowitz/dulwich
> > >> > >
> > >> > > I think Augie has requested that we do code reviews in some place
> > >> > > public,
> > >> > > like this mailing list. If others agree I'm happy to oblige by
> > >> > > mailing
> > >> > > patches.
> > >> > That'd be great - mailed patches are a bit easier to comment on for
> > >> > me, and this
> > >> > mailing list seems like the most appropriate place.
> > >> >
> > >> > > b9677d0 Add tests for sorted_tree_items and C implementation.
> > >> > > 186fb3a Fix memory leak in C implementation of sorted_tree_items.
> > >> > > 0994d03 Clean up file headers.
> > >> > > 4a42aad Fix copyright in dul-web.
> > >> > I've merged these, thanks.
> > >> >
> > >> > > e146b26 Move named file initilization to BaseRepo.
> > >> > Thanks, this makes sense.
> > >> >
> > >> > We should've really started out by naming BaseRepo Repo and naming
> > >> > Repo
> > >> > DiskRepo or something, but it seems hard to change that now without
> > >> > breaking
> > >> > everybody's existing code. :-(
> > >> >
> > >> > > 9f08973 Use / as separator for named files on all platforms.
> > >> > I'm not sure about this change - other than the use of info/excludes
> > >> > we don't
> > >> > actually appear to use / in calls to {get,put}_named_file so it
> > >> > seems like a
> > >> > silly thing to break the API for.
> > >> Is it an API break then? It also means that you can always request
> > >> 'info/excludes' without respect to the platform you're on.
> > > It changes the behaviour of this particular call on some platforms
> > > (specifying info\excludes on Windows will break), so users will have to
> > > change their callsites.
> > >
> > >> > Alternatively, perhaps we can make get_named_file private/protected?
> > >> > I'd be
> > >> > less hesitant about making this sort of change if we do that.
> > >
> > >> -1, it feels like clients of Repo will need to load files within .git
> > >> and get_named_file feels like the correct call for this.
> > > Do they need to do that directly though, and for what sort of
> > > functionality? I'd rather see higher-level function calls on Repo, in
> > > particular because that makes it easier to e.g. implement the Repo API
> > > on top of other databases with a different on-disk format.
> >
> > It seems reasonable that we'll need access to (for example)
> > info/excludes so we can construct the appropriate matcher for
> > Mercurial, Bazaar or whatever else. It seems kind of odd to my mind to
> > have a function per special file on the repo object.
> I think it's reasonable to have a separate function, the fact that the
> exclude info can be found in info/excludes is a representation issue.
>
> I'd rather have a function that returns a list of exclude patterns for
> which we can put in a stub that e.g. returns [] or uses some other magic.
> The alternative is to have a get_named_file() implementation that returns
> custom file contents (the disk format rather than the content) based on the
> filename that gets passed in. The implementation for a bzr repo would
> look something like this:
>
> def get_named_file(self, name):
>        if name == "info/excludes":
>                return StringIO()
>        elif name == "description":
>                return "A Bazaar repository"
>        elif name =="config":
>                # Construct a ConfigObj, serialize it and return it as a
>                # file-like object
>        elif name.startswith("refs/heads/"):
>                # Some magic involving a RefsContainer and reserializing
>        else:
>                ...
>
> Making the disk format part of the API is something I'd like to avoid as
> much
> as possible as it makes it hard (and slow) to implement the API on top of
> different disk representations (not just bzr or mercurial but also on top
> of
> say a highly scalable database system). I don't think having a function for
> getting at the excludes would make the API that much more complicated,
> there
> aren't much similar files like this.
>
> Cheers,
>
> Jelmer
>

Follow ups

References