openstack team mailing list archive

Re: [Scaling][Orchestration] Zone changes. WAS: [Question #185840]: Multi-Zone finally working on ESSEX but cant "nova list" (KeyError: 'uuid') + doubts

To: openstack@xxxxxxxxxxxxxxxxxxx
From: Alejandro Comisario <alejandro.comisario@xxxxxxxxxxxxxxxx>
Date: Thu, 26 Jan 2012 14:49:57 -0300
In-reply-to: <60A3427EF882A54BA0A1971AE6EF03881E15817E@ORD1EXD02.RACKSPACE.CORP>
User-agent: Mozilla/5.0 (X11; Linux i686; rv:9.0) Gecko/20111222 Thunderbird/9.0.1

This is getting really interesting !!

I really hope to see the new Zones code merged into Essex, since we are planning a production implementation on Essex as soon as it is marked as a release (nova, keystone, glance & swift, which we also got working in a big lab environment with Milestone 2). As you would expect, because we mainly use Cactus in production and because of the network layer we inherited, today we are using 1 zone per VLAN (that's about 16 hosts of 96GB of RAM each, enough to fill the VLAN with the flavors we use), so yes, the limitation here is the networking.

That's why we are testing Essex with Quantum: we really want to increase the capacity of a zone (+50 hosts) by assigning one or several networks to a project on any zone, and to use the new MultiZone code to spread the instances across datacenters (and inside each datacenter at the same time). We are also thinking (maybe this is out of the scope of the subject) that the parent zone might be an instance with no compute nodes but with all the zones loaded into the db, and many nova-api processes spawned on different ports and load balanced, just to handle the requests for cloud management.


Just to add a little of our metrics if it helps.

PS: Is the plan to commit the new Zones code into Milestone 3? That would be fantastic news!


Cheers !

On 01/26/2012 01:40 PM, Sandy Walsh wrote:

Thanks Blake ... all very valid points.
Based on our discussions yesterday (the ink is still wet on the whiteboard) we've been kicking around numbers in the following ranges:
500-1000 hosts per zone (zone = single nova deployment. 1 db, 1 rabbit)
25-100 instances per host (minimum flavor)
3s api response time fully loaded (over that would be considered a failure). 'nova list' being the command that can bring down the house. But also 'nova boot' is another concern. We're always trying to get more async operations in there.
Hosts per zone is a tricky one because we run into so many issues around network architecture, so your mileage may vary. Network is the limiting factor in this regard.
All of our design decisions are being made with these metrics in mind.
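
For a rough sense of what those ranges imply, the short sketch below multiplies them out. It uses only the figures quoted above; the function name is made up for illustration, and the result is back-of-the-envelope arithmetic, not an official scale target.

```python
# Rough capacity envelope for one zone, using only the ranges quoted above.
HOSTS_PER_ZONE = (500, 1000)      # zone = single nova deployment (1 db, 1 rabbit)
INSTANCES_PER_HOST = (25, 100)    # at the minimum flavor


def zone_instance_range(hosts, per_host):
    """Return (low, high) instance counts for a single zone."""
    return hosts[0] * per_host[0], hosts[1] * per_host[1]


low, high = zone_instance_range(HOSTS_PER_ZONE, INSTANCES_PER_HOST)
print(f"instances per zone: {low} to {high}")
```

So a single zone would hold somewhere between 12,500 and 100,000 instances, which is the population a fully loaded 'nova list' would have to cope with inside the 3s budget.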
That said, we'd love to get more feedback on realistic metric expectations to ensure we're in the right church.
Hope this is what you're looking for?

-S


------------------------------------------------------------------------
*From:* Blake Yeager [blake.yeager@xxxxxxxxx]
*Sent:* Thursday, January 26, 2012 12:13 PM
*To:* Sandy Walsh
*Cc:* openstack@xxxxxxxxxxxxxxxxxxx
*Subject:* Re: [Openstack] [Scaling][Orchestration] Zone changes. WAS: [Question #185840]: Multi-Zone finally working on ESSEX but cant "nova list" (KeyError: 'uuid') + doubts
Sandy,
I am excited to hear about the work that is going on around communication between trusted zones and look forward to seeing what you have created.
In general, the scalability of Nova is an area where I think we need to put additional emphasis. Rackspace has done a lot of work on zones, but they don't seem to be receiving a lot of support from the rest of the community.
The OpenStack mission statement indicates the mission of the project is: "To produce the ubiquitous Open Source cloud computing platform that will meet the needs of public and private cloud providers regardless of size, by being simple to implement and massively scalable."
I would challenge the community to ensure that scale is being given the appropriate focus in upcoming releases, especially Nova. Perhaps we need to start by setting very specific scale targets for a single Nova zone in terms of nodes, instances, volumes, etc. I did a quick search of the wiki but I didn't find anything about scale targets. Does anyone know if something exists and I am just missing it? Obviously scale will depend a lot on your specific hardware and configuration but we could start by saying with this minimum hardware spec and this configuration we want to be able to hit this scale. Likewise it would be nice to publish some statistics about the scale that we believe a given release can operate at safely. This would tie into some of the QA/Testing work that Jay & team are working on.
Does anyone have other thoughts about how we ensure we are all working toward building a massively scalable system?
-Blake
On Thu, Jan 26, 2012 at 9:20 AM, Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx> wrote:
    Zones is going through some radical changes currently.

    Specifically, we're planning to use direct Rabbit-to-Rabbit
    communication between trusted Zones to avoid the complication of
    changes to OS API, Keystone and novaclient.

    To the user deploying Nova not much will change, there may be a
    new service to deploy (a Zones service), but that would be all. To
    a developer, the code in OS API will greatly simplify and the
    Distributed Scheduler will be able to focus on single zone
    scheduling (vs doing both zone and host scheduling as it does today).

    We'll have more details soon, but we aren't planning on
    introducing the new stuff until we have a working replacement in
    place. The default Essex Scheduler now will largely be the same
    and the filters/weight functions will still carry forward, so any
    investments there won't be lost.

    Stay tuned, we're hoping to get all this in a new blueprint soon.

    Hope it helps,
    Sandy

    ________________________________________
    From: bounces@xxxxxxxxxxxxx [bounces@xxxxxxxxxxxxx] on behalf
    of Alejandro Comisario [question185840@xxxxxxxxxxxxxxxxxxxxx]
    Sent: Thursday, January 26, 2012 8:50 AM
    To: Sandy Walsh
    Subject: Re: [Question #185840]: Multi-Zone finally working on
    ESSEX but cant "nova list" (KeyError: 'uuid') + doubts

    Question #185840 on OpenStack Compute (nova) changed:
    https://answers.launchpad.net/nova/+question/185840

       Status: Answered => Open

    Alejandro Comisario is still having a problem:
    Sandy, Vish !

    Thanks for the replies ! let me get to the relevant points.

    #1 I totally agree with you guys, the policy for spawning instances
    may be very specific to each company's strategy. But since you can go
    from "Fill First" to "Spread First" just by adding "reverse=True" in
    "nova.scheduler.least_cost.weighted_sum" and
    "nova.scheduler.distributed_scheduler._schedule", maybe it's a harmless
    option to expose (since we are going to have a lot of zones across
    datacenters, and many different departments are going to create many
    instances to load-balance their applications, we really prefer
    Spread First to ensure high availability of the pools).
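
The Fill First / Spread First point in #1 can be illustrated with a minimal sketch. This is not the code in nova.scheduler.least_cost or distributed_scheduler; the function name, the single free-RAM cost, and the host values are all assumptions made up for illustration. It only shows how flipping reverse on a sort switches between the two policies.

```python
from typing import Dict, List


def rank_hosts(free_ram_mb: Dict[str, int], spread_first: bool = False) -> List[str]:
    """Order candidate hosts by one hypothetical cost: free RAM in MB.

    Fill First: the host with the least free RAM comes first, so hosts
    are packed tightly before a new one is used.
    Spread First: the same sort with reverse=True, so the emptiest host
    comes first and instances are spread out.
    """
    return sorted(free_ram_mb, key=lambda host: free_ram_mb[host], reverse=spread_first)


# Made-up hosts and values, for illustration only.
hosts = {"host-a": 2048, "host-b": 65536, "host-c": 16384}

print(rank_hosts(hosts))                     # Fill First ordering
print(rank_hosts(hosts, spread_first=True))  # Spread First ordering
```

With a single cost the two orderings are exact mirrors of each other; with several weighted costs the same flip would apply to the combined score.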

    #2 As we are going to test essex-3, I would like to know whether
    the zones code from Chris Behrens is going to be added in Final Essex /
    Milestone 4, so we can keep testing other features, or whether you
    prefer us to file this as a bug to be fixed, since the code that broke
    may not see major changes.

    Kindest regards !

    --
    You received this question notification because you are a member
    of Nova
    Core, which is an answer contact for OpenStack Compute (nova).

    _______________________________________________
    Mailing list: https://launchpad.net/~openstack
    Post to     : openstack@xxxxxxxxxxxxxxxxxxx
    Unsubscribe : https://launchpad.net/~openstack
    More help   : https://help.launchpad.net/ListHelp



