novacut-community team mailing list archive

Revisiting the 240-bit digest size

 

Reposted from this bug; please feel free to respond on the bug or on
the mailing list:

https://bugs.launchpad.net/filestore/+bug/1088404
----

We're getting very close to giving Dmedia that "production ready"
rubber stamp, and so it's a good time to revisit any nagging design
issues before we make long-term compatibility commitments.

The hashing protocol is the longest-term commitment, and the most
disruptive to change should we need to. It is also the piece that
we'd like to see most widely adopted, totally independent of whether
Dmedia is being used.

Please chime in if you feel otherwise, but personally I think there
are 3 options on the table:

1) Keep the current 240-bit digest size

2) Increase to a more conservative 280-bit digest size

3) Increase to an ultra conservative (and base64 friendly) 360-bit digest size

Note that regardless of the digest size, the state size is 512 bits in
all cases, because we're using Skein-512. And all three digest sizes
are a multiple of 40 bits, which means they can be base32-encoded
without requiring padding. The base32-encoded root-hash has proven
itself an excellent choice, something I think we should certainly
stick with.
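
To make the "multiple of 40 bits" point concrete, here is a minimal
sketch using only Python's standard base64 module. The function name
base32_summary is made up for illustration, and the all-zero bytes are
just a placeholder of the right length, not a real Skein-512 digest:

```python
import base64

def base32_summary(bits):
    """Return (byte length, base32 length, needs padding) for a digest size."""
    raw = bytes(bits // 8)  # all-zero placeholder, NOT a real Skein digest
    encoded = base64.b32encode(raw).decode("ascii")
    return len(raw), len(encoded), encoded.endswith("=")

for bits in (240, 280, 360):
    print(bits, base32_summary(bits))
```

Base32 packs 5 bits per character and works in 40-bit groups, so any
digest size that is a multiple of 40 bits encodes to exactly bits / 5
characters with no trailing "=".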

Here are my pros and cons for each, to lay out some of the tradeoffs:

== 240-bit ==

30 bytes; 48 characters when base32-encoded; 120-bit security (birthday bound).

Pros:

 * Big enough - just a touch smaller than the generally recommended
256-bit digest size/128-bit security

 * Not too big - very readable in JSON schema, doesn't look too unwieldy in URLs

 * Can be cleanly base64-encoded (no padding)

 * Most space-efficient in the database, least file-system overhead

Cons:

 * Bad marketing - it's still below the recommended digest size, even
if only slightly

== 280-bit ==

35 bytes; 56 characters when base32-encoded; 140-bit security (birthday bound).

Pros:

 * Good marketing - larger than recommended digest size

 * Still fairly readable, still not too unwieldy

Cons:

 * Can't be cleanly base64-encoded (requires padding)

== 360-bit ==

45 bytes; 72 characters when base32-encoded; 180-bit security (birthday bound).

Pros:

 * Can be cleanly base64-encoded (if we feel that 240 bits is too
small, but that clean base64 encoding is a must, then 360 bits is the
next rung up the ladder)

 * Great marketing - easy to argue that the protocol will have an
extremely long useful lifetime, even if Skein became as compromised as
SHA-1 is today

Cons:

 * Very long and unwieldy, poor readability - could face a lot of
adoption friction because it just "looks too big"

 * Least space-efficient in the database, most file-system overhead
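
The base64 and birthday-bound figures for the three options can be
checked with a short sketch. It again uses only Python's standard
base64 module; the function name digest_tradeoffs is hypothetical, and
the zero-filled bytes only stand in for a digest of the right length:

```python
import base64

def digest_tradeoffs(bits):
    """Summarize base64 encoding and collision resistance for a digest size."""
    raw = bytes(bits // 8)  # all-zero placeholder, NOT a real Skein digest
    b64 = base64.b64encode(raw).decode("ascii")
    return {
        "bytes": len(raw),
        "base64_chars": len(b64),          # includes any "=" padding
        "base64_padding": b64.count("="),  # 0 means it encodes cleanly
        "collision_bits": bits // 2,       # birthday bound: half the digest size
    }

for bits in (240, 280, 360):
    print(bits, digest_tradeoffs(bits))
```

Base64 works in 24-bit groups, so only sizes that are multiples of
both 40 and 24 bits (that is, multiples of 120 bits) encode cleanly in
both base32 and base64. That is why 360 bits is the next clean step
after 240, and why 280 bits picks up one "=" of base64 padding.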