openstack team mailing list archive
Message #00176
Re: Some insight into the number of instances Nova needs to spin up...
I suggest we consider the limits of a single nova deployment, not across
all regions. To Pete's point, at a certain scale, people will break into
parallel, independent nova deployments. A single, global, deployment
becomes untenable at scale.
I agree with Pete that 1M hosts (and 45M VMs) is a bit out of whack for a
single nova deployment. As a frame of reference, here are a couple of
links that estimate total server count by the big boys:
http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/
http://www.intac.net/a-comparison-of-dedicated-servers-by-company_2010-04-13/
Google is the largest and they are estimated to run ~1M+ servers.
Microsoft is ~500K+. Facebook is around 60K. This link
(http://royal.pingdom.com/2009/08/24/google-may-own-more-than-2-of-all-servers-in-the-world/)
is a few years old and puts the total worldwide server count at 44M.
I submit that setting the nova limit to match Google's total server count
and 1/44th of the total worldwide server count is overkill.
The limits I suggest below are not per AZ, but per nova deployment (there
could be multiple AZs inside of a deployment). I think we may need to
clarify nomenclature (although it may just be me since I haven't been too
engaged in these discussions to date). I know at the last design summit
it was decided to call everything a "zone".
Erik
On 12/30/10 11:42 AM, "Rick Clark" <rick@xxxxxxxxxxxxx> wrote:
>Actually it was 1M hosts and I think 45 million VMs. It was meant to
>be across all regions. Jason Seats set the number arbitrarily, but it
>is a good target to not let us forget about scaling while we design.
>
>I think eventually all loads will be more ephemeral. So, I think I
>agree with your numbers, if you are talking about a single availability
>zone.
>
>On 12/30/2010 11:25 AM, Erik Carlin wrote:
>> You are right. The 1M number was VMs, not hosts. At least, that was from
>> one scale discussion we had within Rackspace. I'm not sure what the
>> "official" nova target limits are and I can't find anything on launchpad
>> that defines it. If there is something, could someone please send me a
>> link.
>>
>> I am certain that Google can manage more than 10K physical servers per
>> DC. Rackspace does this today.
>>
>> If I combine what I know about EC2 and Cloud Servers, I would set the ROM
>> scale targets as:
>>
>> ABSOLUTE
>> 1M VMs
>> 50K hosts
>>
>> RATE
>> 500K transactions/day (create/delete server, list servers, resize server,
>> etc. - granted, some are more expensive than others)
>> That works out to ~21K/hr, but it won't be evenly distributed. To allow
>> for peak, I would say something like 75K/hour or ~21/sec.
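[Editor's note: the rate figures above are simple unit conversions from the stated 500K/day average and 75K/hour peak targets. A minimal Python sketch, using only the numbers given in the message, confirms the arithmetic:]

```python
# Back-of-envelope check of the proposed API rate targets.
# Both input figures come from the message above; nothing else is assumed.
TXNS_PER_DAY = 500_000          # create/delete/list/resize calls per day (average)
PEAK_TXNS_PER_HOUR = 75_000     # allowance for uneven, peaky load

avg_per_hour = TXNS_PER_DAY / 24          # average hourly rate
peak_per_sec = PEAK_TXNS_PER_HOUR / 3600  # peak per-second rate

# ~20,833/hr average ("~21K/hr") and ~20.8/sec at peak ("~21/sec")
print(f"average: ~{avg_per_hour:,.0f}/hr, peak: ~{peak_per_sec:.1f}/sec")
```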
>>
>> Thoughts?
>>
>> Erik
>>
>>
>> On 12/30/10 9:20 AM, "Pete Zaitcev" <zaitcev@xxxxxxxxxx> wrote:
>>
>>> On Wed, 29 Dec 2010 19:27:09 +0000
>>> Erik Carlin <erik.carlin@xxxxxxxxxxxxx> wrote:
>>>
>>>> The 1M host limit still seems reasonable to me. []
>>>
>>> In my opinion, such numbers are completely out of whack. Google's Chubby
>>> article says that the busiest Chubby has 90,000 clients (not hosts!) and
>>> the biggest datacenter has 10,000 systems. They found such numbers
>>> pushing the border of unmanageable. Granted they did not use
>>> virtualization, but we're talking the number of boxes in both cases.
>>>
>>> So to reach 1M hosts in a Nova instance you have to have it manage
>>> 100 datacenters. There are going to be calls for federation of Novas
>>> long before this number is reached.
>>>
>>> Sustaining a high flap rate is a worthy goal and will have an
>>> important practical impact. And having realistic sizing ideas is
>>> going to help it.
>>>
>>> -- Pete
>>
>>
>>
>> Confidentiality Notice: This e-mail message (including any attached or
>> embedded documents) is intended for the exclusive and confidential use of the
>> individual or entity to which this message is addressed, and unless otherwise
>> expressly indicated, is confidential and privileged information of Rackspace.
>> Any dissemination, distribution or copying of the enclosed material is
>> prohibited. If you receive this transmission in error, please notify us
>> immediately by e-mail at abuse@xxxxxxxxxxxxx, and delete the original message.
>> Your cooperation is appreciated.
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~openstack
>> Post to : openstack@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~openstack
>> More help : https://help.launchpad.net/ListHelp
>
>