
launchpad-dev team mailing list archive

plan for incremental code imports

 

We want to make code imports, or at least the ones done with a foreign
branch plugin, import incrementally.  This will work around some
resource leaks somewhere in the import plugin or bzr and allow us to
import really large repos like linux or firefox, but also will make
scheduling fairer and reduce the damage done by a network blip.

This requires some infrastructure work to support an import status of
"partially successful" and so on, but I know how to do that.  The part
I'm a bit less sure of is how to do the "only import $N revisions" bit.

One way would be to not try too hard, and import only $N _mainline_
revisions each time.  I think code like this could do that:

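local_branch = ...
foreign_branch = ...
local_revno = local_branch.revno()
foreign_revno = foreign_branch.revno()
# Advance by at most N mainline revisions (the $N above), but never
# past the foreign tip: this needs min(), not max().
target_revno = min(local_revno + N, foreign_revno)
target_revid = foreign_branch.get_revid(target_revno)
local_branch.pull(foreign_branch, stop_revision=target_revid)
if target_revno == foreign_revno:
    # Caught up with the foreign branch.
    return SUCCESS
else:
    # More mainline revisions remain for the next run.
    return PARTIAL_SUCCESS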

What I don't know is whether this will be at all efficient: does
get_revid() on a mercurial or svn or git branch perform acceptably?
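To check the control flow of the sketch above without touching a real
foreign branch, here is a minimal self-contained simulation.  FakeBranch,
import_step and import_all are hypothetical names made up for this
example; they only imitate the revno()/get_revid()/pull() calls used in
the sketch and say nothing about how the real plugins perform.  The
target is clamped with min() so it never passes the foreign tip.

```python
SUCCESS = "SUCCESS"
PARTIAL_SUCCESS = "PARTIAL_SUCCESS"


class FakeBranch:
    """Hypothetical stand-in for a branch: an ordered list of mainline revids."""

    def __init__(self, revids=()):
        self.mainline = list(revids)

    def revno(self):
        return len(self.mainline)

    def get_revid(self, revno):
        # Revision numbers are 1-based.
        return self.mainline[revno - 1]

    def pull(self, other, stop_revision):
        # Copy the other branch's mainline up to and including stop_revision.
        stop = other.mainline.index(stop_revision) + 1
        self.mainline = other.mainline[:stop]


def import_step(local_branch, foreign_branch, n):
    """Import at most n more mainline revisions and report the status."""
    local_revno = local_branch.revno()
    foreign_revno = foreign_branch.revno()
    target_revno = min(local_revno + n, foreign_revno)
    if target_revno == 0:
        # Empty foreign branch: nothing to import.
        return SUCCESS
    target_revid = foreign_branch.get_revid(target_revno)
    local_branch.pull(foreign_branch, stop_revision=target_revid)
    if target_revno == foreign_revno:
        return SUCCESS
    return PARTIAL_SUCCESS


def import_all(local_branch, foreign_branch, n):
    """Repeat import_step until it reports SUCCESS; return every status seen."""
    statuses = []
    while True:
        status = import_step(local_branch, foreign_branch, n)
        statuses.append(status)
        if status == SUCCESS:
            return statuses


foreign = FakeBranch("rev-%d" % i for i in range(1, 8))  # 7 mainline revisions
local = FakeBranch()
statuses = import_all(local, foreign, 3)
```

With 7 foreign revisions and n = 3 this takes three runs: two partial
ones followed by a final successful one.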

It's also a bit lame in that it would be better to only import $N
_revisions_ at a time, not mainline revisions.  But I don't know how to
do that.  The above sketch might be good enough in any case.
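One possible way to approximate "only $N revisions" while still stopping
on a mainline revision is to walk the mainline oldest first, count how
many not-yet-imported ancestors each mainline revision would drag in, and
stop before the budget is exceeded.  The following is only a sketch under
the assumption that the parent graph is cheaply available as a plain
dict; missing_ancestors and pick_stop_revision are hypothetical helpers,
not bzrlib or plugin API.  It always accepts at least one mainline
revision so that a single huge merge cannot stall the import forever.

```python
def missing_ancestors(parents, revid, have):
    """Return the revids reachable from revid that are not in `have`."""
    found = set()
    todo = [revid]
    while todo:
        rev = todo.pop()
        if rev in have or rev in found:
            continue
        found.add(rev)
        todo.extend(parents.get(rev, ()))
    return found


def pick_stop_revision(parents, mainline, have, budget):
    """Pick the newest mainline revid that keeps the import within budget.

    parents:  dict mapping revid -> list of parent revids (assumed available)
    mainline: mainline revids, oldest first
    have:     revids already imported
    budget:   rough maximum number of revisions to import in this run

    Returns (stop_revid, number_of_new_revisions), or (None, 0) if there
    is nothing left to import.
    """
    have = set(have)
    stop, count = None, 0
    for revid in mainline:
        if revid in have:
            continue
        new = missing_ancestors(parents, revid, have)
        # Always take the first candidate, even if it alone exceeds the
        # budget, so that every run makes some progress.
        if stop is not None and count + len(new) > budget:
            break
        have |= new
        count += len(new)
        stop = revid
    return stop, count


# Small example history: m2 merges a side branch (a1, a2), m4 merges b1.
parents = {
    "m1": [],
    "a1": ["m1"],
    "a2": ["a1"],
    "m2": ["m1", "a2"],
    "m3": ["m2"],
    "b1": ["m3"],
    "m4": ["m3", "b1"],
}
mainline = ["m1", "m2", "m3", "m4"]
result = pick_stop_revision(parents, mainline, have={"m1"}, budget=4)
```

The returned revid could then be passed as stop_revision to pull() in
place of the one computed from the revno.  Whether the parent graph of a
foreign branch can be obtained cheaply enough for this is an open
question, much like the get_revid() one.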

The other thing that should be done is changing our bzr-git importer to
preserve the git pack files between partial imports, by changing bzr-git
to put them in a predictable location and then doing some work in the
importer to preserve them.  I think I'd rather Jelmer look at this part,
or at least provide me with very detailed instructions ...

Cheers,
mwh


