← Back to team overview

dulwich-users team mailing list archive

Re: Patches: sorted_tree_items, header cleanup, MemoryRepo, logging

 

On Wed, Jun 09, 2010 at 02:48:18PM -0500, Augie Fackler wrote:
> On Mon, Jun 7, 2010 at 5:32 PM, Jelmer Vernooij <jelmer@xxxxxxxxx> wrote:
> > On Mon, 2010-06-07 at 15:59 -0500, Augie Fackler wrote:
> >> On May 30, 2010, at 9:18 AM, Jelmer Vernooij wrote:
> >> > On Mon, May 24, 2010 at 11:08:56AM -0700, David Borowitz wrote:
> >> > > This is quite the grab-bag of changes :). Available as usual at my
> >> > > github
> >> > > fork:
> >> > > http://github.com/dborowitz/dulwich
> >> > >
> >> > > I think Augie has requested that we do code reviews in some place
> >> > > public,
> >> > > like this mailing list. If others agree I'm happy to oblige by
> >> > > mailing
> >> > > patches.
> >> > That'd be great - mailed patches are a bit easier to comment on for
> >> > me, and this
> >> > mailing list seems like the most appropriate place.
> >> >
> >> > > b9677d0 Add tests for sorted_tree_items and C implementation.
> >> > > 186fb3a Fix memory leak in C implementation of sorted_tree_items.
> >> > > 0994d03 Clean up file headers.
> >> > > 4a42aad Fix copyright in dul-web.
> >> > I've merged these, thanks.
> >> >
> >> > > e146b26 Move named file initilization to BaseRepo.
> >> > Thanks, this makes sense.
> >> >
> >> > We should've really started out by naming BaseRepo Repo and naming
> >> > Repo
> >> > DiskRepo or something, but it seems hard to change that now without
> >> > breaking
> >> > everybody's existing code. :-(
> >> >
> >> > > 9f08973 Use / as separator for named files on all platforms.
> >> > I'm not sure about this change - other than the use of info/excludes
> >> > we don't
> >> > actually appear to use / in calls to {get,put}_named_file so it
> >> > seems like a
> >> > silly thing to break the API for.
> >> Is it an API break then? It also means that you can always request
> >> 'info/excludes' without respect to the platform you're on.
> > It changes the behaviour of this particular call on some platforms
> > (specifying info\excludes on Windows will break), so users will have to
> > change their callsites.
> >
> >> > Alternatively, perhaps we can make get_named_file private/protected?
> >> > I'd be
> >> > less hesitant about making this sort of change if we do that.
> >
> >> -1, it feels like clients of Repo will need to load files within .git
> >> and get_named_file feels like the correct call for this.
> > Do they need to do that directly though, and for what sort of
> > functionality? I'd rather see higher-level function calls on Repo, in
> > particular because that makes it easier to e.g. implement the Repo API
> > on top of other databases with a different on-disk format.
> 
> It seems reasonable that we'll need access to (for example)
> info/excludes so we can construct the appropriate matcher for
> Mercurial, Bazaar or whatever else. It seems kind of odd to my mind to
> have a function per special file on the repo object.
I think it's reasonable to have a separate function, the fact that the 
exclude info can be found in info/excludes is a representation issue.

I'd rather have a function that returns a list of exclude patterns for 
which we can put in a stub that e.g. returns [] or uses some other magic. 
The alternative is to have a get_named_file() implementation that returns 
custom file contents (the disk format rather than the content) based on the 
filename that gets passed in. The implementation for a bzr repo would 
look something like this:

def get_named_file(self, name):
	if name == "info/excludes":
		return StringIO()
	elif name == "description":
		return "A Bazaar repository"
	elif name =="config":
		# Construct a ConfigObj, serialize it and return it as a 
		# file-like object
	elif name.startswith("refs/heads/"):
		# Some magic involving a RefsContainer and reserializing
	else:
		...

Making the disk format part of the API is something I'd like to avoid as much 
as possible as it makes it hard (and slow) to implement the API on top of 
different disk representations (not just bzr or mercurial but also on top of 
say a highly scalable database system). I don't think having a function for
getting at the excludes would make the API that much more complicated, there 
aren't much similar files like this.

Cheers,

Jelmer



Follow ups

References