dulwich-users team mailing list archive
Message #00557
Re: [PATCH 0/33] Rewrite and speed up pack inflation
I forgot to mention, my patches are also available on my personal Dulwich
clone on code.google.com:
http://code.google.com/r/dborowitz-dulwich/source/list
On Tue, Jul 26, 2011 at 22:02, <dborowitz@xxxxxxxxxx> wrote:
> The first of several patchbombs of code in use by the servers at
> code.google.com :)
>
> This one accomplishes two big things:
> 1. A big rewrite of the pack inflation and indexing code, which organizes
>    reads around chains of deltas. This guarantees that each object is read
>    and inflated exactly once. Overall, it improves pack indexing performance
>    on large packs like linux-2.6.git by at least 4x, and is "only" 2-3x
>    slower than the optimized C git implementation.
> 2. Adding an UnpackedGitObject that encapsulates some of the data formerly
>    passed around as tuples from the various functions in pack.py. Rather
>    than constantly packing and unpacking multiple return values, just pass
>    around single objects and mutate their state.
>
> Various additional cleanups on top of these.
>
> 4c6e275 pack: Standardize on single quotes.
> 8ec2987 pack: Clean up unpack_object.
> 9b3fc71 pack: Compute CRC32 during object unpacking.
> de42ceb pack: Inline PackObjectIterator.
> 764ec04 object_store: Fix return type of MemoryObjectStore.get_raw.
> 3c4ec6a pack: Add a DeltaChainIterator for faster iteration of PackData.
> 5cd5f24 Make the server thread raise errors in compat tests.
> 8dceaac server: Fix short-circuit behavior for no-op fetches.
> 5651324 pack: Add a PackIndexer to index packs more quickly.
> 5690a9b pack: PackStreamReader SHA calculation and docstring cleanup.
> 7caefca pack: Expose which refs were external in DeltaChainIterator.
> 280f4b0 pack: Allow write_pack_object to compute a SHA.
> f2000f2 server: Make PackStreamCopier optionally record delta chains.
> 3ded386 tests: Move write_pack_data to utils.build_pack.
> b18c613 misc: Add SEEK_CUR.
> 2e0ffd3 pack: Include offset in PackStreamReader results.
> d8eb15a tests/utils: Pass a file object into build_pack.
> 0d9deaa Move PackStreamReader from server to pack.
> 1e8d184 pack: use SEEK_END for PackData.get_stored_checksum().
> b032b1e test_pack: Test checksum and length mismatch conditions.
> 9286dd6 pack: Extract a method to check pack length and SHA.
> fe363ec pack: Extract a function to compute the SHA of a file.
> 17e1378 Rewrite add_thin_pack to use the fast PackIndexer.
> 9facb62 pack: Pass a zlib buffer size through to read_zlib_chunks.
> 870e006 pack: Fix a buffering issue with PackStreamReader; add tests.
> 40b145c pack: Add PackInflater to quickly inflate pack objects.
> 3f7b7dc pack: Nuke ThinPackData.
> a2a6078 _compat: Use namedtuple recipe rather than hard-coding.
> 911076c _compat: Inline specific namedtuple instances.
> 23095f8 pack: Create an _UnpackedObject for better encapsulation.
> 00e2b20 pack: Add option to include compressed data in _UnpackedObjects.
> 91aba9d pack: Remove comp_len from _UnpackedObject.
> d0174c1 pack: Extract a function to write a packed object header.
>
> dulwich/_compat.py | 203 ++++++----
> dulwich/diff_tree.py | 6 +-
> dulwich/object_store.py | 116 ++++--
> dulwich/objects.py | 6 +-
> dulwich/pack.py | 766 ++++++++++++++++++++++++-----------
> dulwich/repo.py | 2 -
> dulwich/server.py | 80 ++--
> dulwich/tests/compat/test_server.py | 3 +-
> dulwich/tests/compat/test_web.py | 11 +-
> dulwich/tests/test_object_store.py | 29 ++-
> dulwich/tests/test_pack.py | 435 ++++++++++++++++++--
> dulwich/tests/test_server.py | 2 +-
> dulwich/tests/utils.py | 83 ++++
> 13 files changed, 1287 insertions(+), 455 deletions(-)
>
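
The two ideas in the quoted summary (walking delta chains so each object is inflated exactly once, and passing a single mutable object around instead of tuples) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch series: UnpackedEntry and resolve_all are hypothetical names, and the "delta" here is a toy format (a suffix appended to the base) rather than git's real copy/insert delta encoding.

```python
import zlib
from collections import defaultdict


class UnpackedEntry:
    """Hypothetical mutable record used in place of a tuple of return values."""

    def __init__(self, offset, base_offset, compressed):
        self.offset = offset            # position of the entry in the toy pack
        self.base_offset = base_offset  # None for a full (non-delta) object
        self.compressed = compressed    # zlib-compressed payload
        self.data = None                # filled in once the entry is resolved


def resolve_all(entries):
    """Resolve every entry, inflating each compressed payload exactly once.

    Toy delta format (an assumption, NOT git's): a delta payload is simply
    a suffix that is appended to the resolved data of its base.
    """
    # Index deltas by the offset of the base they depend on.
    children = defaultdict(list)
    for entry in entries:
        if entry.base_offset is not None:
            children[entry.base_offset].append(entry)

    inflations = 0
    # Start from the full objects and walk down each chain with an explicit
    # stack, so a base is already resolved when its deltas are visited.
    stack = [(e, None) for e in entries if e.base_offset is None]
    while stack:
        entry, base_data = stack.pop()
        payload = zlib.decompress(entry.compressed)
        inflations += 1
        entry.data = payload if base_data is None else base_data + payload
        for child in children[entry.offset]:
            stack.append((child, entry.data))
    return inflations


# Usage: one base object with two chains hanging off it.
entries = [
    UnpackedEntry(0, None, zlib.compress(b"hello")),
    UnpackedEntry(10, 0, zlib.compress(b" world")),
    UnpackedEntry(20, 10, zlib.compress(b"!")),
    UnpackedEntry(30, 0, zlib.compress(b" there")),
]
inflation_count = resolve_all(entries)
```

Because the walk starts at the base objects and pushes each delta only after its base has been resolved, no entry is decompressed a second time to rebuild a base, and the results are written back onto the entry objects instead of being returned as tuples.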