dulwich-users team mailing list archive

Re: [PATCH 0/33] Rewrite and speed up pack inflation

I forgot to mention, my patches are also available on my personal Dulwich
clone on code.google.com:
http://code.google.com/r/dborowitz-dulwich/source/list

On Tue, Jul 26, 2011 at 22:02, <dborowitz@xxxxxxxxxx> wrote:

> The first of several patchbombs of code in use by the servers at
> code.google.com :)
>
> This one accomplishes two big things:
> 1. A big rewrite of the pack inflation and indexing code, which organizes
>    reads around chains of deltas. This guarantees that each object is read
>    and inflated exactly once. Overall, it improves pack indexing performance
>    on large packs like linux-2.6.git by at least 4x, and is "only" 2-3x
>    slower than the optimized C git implementation.
> 2. Adding an UnpackedGitObject that encapsulates some of the data formerly
>    passed around as tuples from the various functions in pack.py. Rather
>    than constantly packing and unpacking multiple return values, just pass
>    around single objects and mutate their state.
>
> Various additional cleanups on top of these.
>
> 4c6e275 pack: Standardize on single quotes.
> 8ec2987 pack: Clean up unpack_object.
> 9b3fc71 pack: Compute CRC32 during object unpacking.
> de42ceb pack: Inline PackObjectIterator.
> 764ec04 object_store: Fix return type of MemoryObjectStore.get_raw.
> 3c4ec6a pack: Add a DeltaChainIterator for faster iteration of PackData.
> 5cd5f24 Make the server thread raise errors in compat tests.
> 8dceaac server: Fix short-circuit behavior for no-op fetches.
> 5651324 pack: Add a PackIndexer to index packs more quickly.
> 5690a9b pack: PackStreamReader SHA calculation and docstring cleanup.
> 7caefca pack: Expose which refs were external in DeltaChainIterator.
> 280f4b0 pack: Allow write_pack_object to compute a SHA.
> f2000f2 server: Make PackStreamCopier optionally record delta chains.
> 3ded386 tests: Move write_pack_data to utils.build_pack.
> b18c613 misc: Add SEEK_CUR.
> 2e0ffd3 pack: Include offset in PackStreamReader results.
> d8eb15a tests/utils: Pass a file object into build_pack.
> 0d9deaa Move PackStreamReader from server to pack.
> 1e8d184 pack: use SEEK_END for PackData.get_stored_checksum().
> b032b1e test_pack: Test checksum and length mismatch conditions.
> 9286dd6 pack: Extract a method to check pack length and SHA.
> fe363ec pack: Extract a function to compute the SHA of a file.
> 17e1378 Rewrite add_thin_pack to use the fast PackIndexer.
> 9facb62 pack: Pass a zlib buffer size through to read_zlib_chunks.
> 870e006 pack: Fix a buffering issue with PackStreamReader; add tests.
> 40b145c pack: Add PackInflater to quickly inflate pack objects.
> 3f7b7dc pack: Nuke ThinPackData.
> a2a6078 _compat: Use namedtuple recipe rather than hard-coding.
> 911076c _compat: Inline specific namedtuple instances.
> 23095f8 pack: Create an _UnpackedObject for better encapsulation.
> 00e2b20 pack: Add option to include compressed data in _UnpackedObjects.
> 91aba9d pack: Remove comp_len from _UnpackedObject.
> d0174c1 pack: Extract a function to write a packed object header.
>
>  dulwich/_compat.py                  |  203 ++++++----
>  dulwich/diff_tree.py                |    6 +-
>  dulwich/object_store.py             |  116 ++++--
>  dulwich/objects.py                  |    6 +-
>  dulwich/pack.py                     |  766 ++++++++++++++++++++++++-----------
>  dulwich/repo.py                     |    2 -
>  dulwich/server.py                   |   80 ++--
>  dulwich/tests/compat/test_server.py |    3 +-
>  dulwich/tests/compat/test_web.py    |   11 +-
>  dulwich/tests/test_object_store.py  |   29 ++-
>  dulwich/tests/test_pack.py          |  435 ++++++++++++++++++--
>  dulwich/tests/test_server.py        |    2 +-
>  dulwich/tests/utils.py              |   83 ++++
>  13 files changed, 1287 insertions(+), 455 deletions(-)
>
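For readers skimming the archive, here is a rough sketch of the idea in item 1 of the quoted message: group deltas under their bases, then walk each chain from its base so that every object is resolved exactly once. This is not the code from the patches. The function name `walk_delta_chains`, the `entries` layout, and the toy "append" delta are all assumptions made for illustration; real git deltas use a copy/insert instruction format.

```python
from collections import defaultdict


def walk_delta_chains(entries, apply_delta):
    """Yield (offset, full_data) for every entry, resolving each one once.

    entries: dict mapping offset -> (base_offset or None, payload).
             Hypothetical layout, not Dulwich's on-disk representation.
    apply_delta: callable(base_data, payload) -> full_data.
    """
    # Index deltas by the offset of the object they are based on.
    children = defaultdict(list)
    roots = []
    for offset, (base, _payload) in entries.items():
        if base is None:
            roots.append(offset)
        else:
            children[base].append(offset)

    # Depth-first walk with an explicit stack, so long chains do not hit
    # the recursion limit. Each entry is pushed (and resolved) exactly once.
    stack = [(offset, entries[offset][1]) for offset in reversed(roots)]
    while stack:
        offset, data = stack.pop()
        yield offset, data
        for child in reversed(children.get(offset, ())):
            stack.append((child, apply_delta(data, entries[child][1])))


# Toy usage: a "delta" here just appends bytes to its base.
toy_entries = {
    0: (None, b"hello"),
    10: (0, b" world"),
    20: (10, b"!"),
    30: (None, b"x"),
}
for offset, data in walk_delta_chains(toy_entries, lambda b, p: b + p):
    print(offset, data)
```

The point of walking from the base outward is that the resolved base is already in hand when each delta is applied, so nothing has to be re-read or re-inflated to resolve a later object in the same chain.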

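Item 2 of the quoted message (one mutable object instead of tuples of return values) and the "Compute CRC32 during object unpacking" commit can be illustrated together. The sketch below is only an assumption about the general shape: `UnpackedSketch` and `inflate_into` are hypothetical names, not the `_UnpackedObject` from the patches, and the CRC here covers only the zlib stream, whereas a real pack index CRC also covers the object header.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class UnpackedSketch:
    """Hypothetical record that unpacking code fills in as it goes."""
    offset: int
    type_num: int
    decomp_chunks: list = field(default_factory=list)
    crc32: int = 0


def inflate_into(unpacked, compressed, chunk_size=4096):
    """Inflate one zlib stream into `unpacked`, updating its CRC32.

    Returns any bytes left over after the end of the zlib stream.
    """
    decomp = zlib.decompressobj()
    pos = 0
    while not decomp.eof and pos < len(compressed):
        chunk = compressed[pos:pos + chunk_size]
        pos += len(chunk)
        unpacked.decomp_chunks.append(decomp.decompress(chunk))
        # Only bytes that belong to this stream count toward the CRC;
        # unused_data holds whatever followed the end of the stream.
        consumed = chunk[:len(chunk) - len(decomp.unused_data)]
        unpacked.crc32 = zlib.crc32(consumed, unpacked.crc32)
    return decomp.unused_data + compressed[pos:]


# Usage: the caller passes one object through and reads results off it.
obj = UnpackedSketch(offset=12, type_num=3)
leftover = inflate_into(obj, zlib.compress(b"blob contents") + b"next object")
data = b"".join(obj.decomp_chunks)
```

Because the checksum is updated while the bytes are being inflated, the caller does not need a second pass over the compressed data, and adding another field later does not change any function's return signature.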