nova team mailing list archive

Re: Instance IDs and models in the ORM world


It's not about the size of the collision space.  It's about the
scale-killing impact of enforcing sequential integers (auto-increment
keys) across multiple nodes.  A random 32-bit integer would be fine as
long as the generating algorithm produced acceptable collision rates.
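
To put rough numbers on "acceptable collision rates", here is a minimal
sketch using the standard birthday-problem approximation.  It assumes IDs
are drawn uniformly at random and independently of each other; the function
names are made up for illustration and are not taken from the nova code.

```python
import math


def collision_probability(n_ids: int, bits: int) -> float:
    """Approximate P(at least one collision) among n_ids random IDs.

    Birthday approximation: 1 - exp(-n(n-1) / (2N)), where N = 2**bits.
    Assumes every ID is uniform over the keyspace and independent.
    """
    keyspace = 2 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * keyspace))


def ids_for_probability(p: float, bits: int) -> float:
    """Approximate number of IDs at which P(collision) reaches p."""
    keyspace = 2 ** bits
    return math.sqrt(2 * keyspace * math.log(1 / (1 - p)))


if __name__ == "__main__":
    for bits in (32, 64, 128):
        print(bits, collision_probability(1_000_000, bits))
    print("50% point for 32 bits:", round(ids_for_probability(0.5, 32)))
```

Under those assumptions, a million purely random 32-bit IDs would almost
certainly contain at least one collision (the 50% point is around 77,000
IDs), while for 64-bit IDs the chance is a few in a hundred million.  So a
32-bit scheme would presumably need the generator to check for an existing
ID and retry, rather than rely on randomness alone.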


On Tue, Oct 5, 2010 at 2:39 PM, Michael Gundlach
<michael.gundlach@xxxxxxxxxxxxx> wrote:
> On Tue, Oct 5, 2010 at 2:31 PM, Ewan Mellor <ewan.mellor@xxxxxxxxxxxxx>
> wrote:
>> Thanks Michael.  It would be great to see this in some doc comments in the
>> code somewhere.
> Yeah, I agree -- I first touched this code to rename ec2_id to internal_id,
> and it's been a bit of a pain because there's a dearth of comments (and some
> ambiguous naming, like "instance_id" referring to id in some places and to
> ec2_id in other places.)  I plan to clean this code up as one of my post-FF
> tasks.
>> I agree with Jay's comments elsewhere in this thread -- it seems a better
>> idea to use a UUID for your internal_id, rather than a long int.  That way,
>> the ID is 64 bits longer, so you can generate them randomly on
>> independent nodes without worrying about collisions.
> I think we discussed this in IRC -- 128 bits turns into a 26-byte EC2 ID vs
> 14 for a 64-bit int, and someone (Soren?) had a strong negative preference.
> Actually, I just checked in code only using a 32-bit integer, and I'd like
> some convincing that this isn't sufficient for the foreseeable future.  We
> want to support 1 million instances, right?  Which is about 1/4000 of the
> keyspace, so each new ID has something like a 1 in 4000 chance of colliding
> once we hit a million instances.  I know I'm not doing the statistics
> properly, since the proper question is "what is the chance of at least one
> collision when successively generating one million random 32-bit integers?",
> but it feels like we can punt on larger values at least until Bexar.
> Thoughts?
> Michael