openstack team mailing list archive
Message #09972
Swift block-level deduplication
Folks,
From previous posts on the ML, it seems there are a couple of
efforts in train to add distributed content deduping to Swift.
My question is whether either or both of these approaches involve
active client participation in enabling duplicate chunk
detection?
One could see a spectrum ranging between:
1. Client actively breaks the object into chunks, selects the
hashing algorithm, calculates each chunk's fingerprint, and then
uploads a chunk only if Swift reports that its fingerprint is unknown.
2. Client determines which objects are worth deduping, maybe has
some influence on chunk size and/or hashing, but fingerprint
calculation is all handled internally by Swift.
3. Client is entirely uninvolved; deduplication is handled
transparently in the object storage layer and enabled either
globally or per-container.
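
To make option 1 concrete, here is a rough sketch of what the client side might look like. It is only an illustration under stated assumptions: FakeChunkStore is a hypothetical in-memory stand-in rather than any Swift API, the fixed 4-byte chunk size is chosen purely to keep the example small, and SHA-256 is just one possible fingerprint.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical fixed chunk size in bytes, tiny for illustration


class FakeChunkStore:
    """Hypothetical in-memory stand-in for a dedupe-aware store (not a Swift API)."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def has(self, fingerprint):
        return fingerprint in self.chunks

    def put(self, fingerprint, data):
        self.chunks[fingerprint] = data


def upload_deduped(store, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and upload only unknown ones.

    Returns (manifest, uploaded): the ordered list of fingerprints and
    the number of chunks that actually had to be sent.
    """
    manifest = []
    uploaded = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # Ask the store first; send the payload only if it is new.
        if not store.has(fingerprint):
            store.put(fingerprint, chunk)
            uploaded += 1
        manifest.append(fingerprint)
    return manifest, uploaded


def download(store, manifest):
    """Reassemble an object from its ordered list of fingerprints."""
    return b"".join(store.chunks[fp] for fp in manifest)
```

Uploading b"aaaabbbbaaaa" with this sketch sends two chunks but records three manifest entries, since the first and last chunks share a fingerprint. Options 2 and 3 would move the hashing loop, and possibly the chunking, behind the storage API.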
If anyone involved has insight into the above, I'd be interested
in hearing your thoughts (the context is leveraging dedupe in Glance).
Cheers,
Eoghan