
nova-orchestration team mailing list archive

Re: Tactical Scheduler/Scalability plan ...

 

Let's see ...

1. Deployment models of schedulers

For the distributed scheduler, yes. Chance and Simple don't really care since they don't have state per se.
I think the "current workload" fudge will help with multiple schedulers, but we may need to look at some other locking scheme for multiple schedulers (like our ZooKeeper discussion).
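To make the multiple-scheduler concern concrete, here is a rough sketch of one way two schedulers could avoid double-booking a host without an external lock: a conditional UPDATE that only succeeds while the capacity is still there. This is not what the wiki page specifies. The capacity_cache table, its columns, and the claim_capacity helper are all made up for illustration, and sqlite3 stands in for the real database.

```python
import sqlite3


def make_db():
    """Build an in-memory stand-in for a capacity cache (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE capacity_cache ("
        " host TEXT PRIMARY KEY,"
        " free_ram_mb INTEGER NOT NULL,"
        " current_workload INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO capacity_cache VALUES (?, ?, ?)",
        [("compute1", 4096, 0), ("compute2", 2048, 0)],
    )
    conn.commit()
    return conn


def claim_capacity(conn, host, ram_mb):
    """Try to reserve ram_mb on host; return True only if the row still had room.

    The WHERE clause makes check-and-decrement a single statement, so a second
    scheduler racing for the same host simply updates zero rows.
    """
    cur = conn.execute(
        "UPDATE capacity_cache"
        " SET free_ram_mb = free_ram_mb - ?,"
        "     current_workload = current_workload + 1"
        " WHERE host = ? AND free_ram_mb >= ?",
        (ram_mb, host, ram_mb),
    )
    conn.commit()
    return cur.rowcount == 1
```

A scheduler that gets False back would pick another host and try again. Whether this is enough, or whether something like ZooKeeper is still needed, depends on how stale the cached capacity is allowed to get.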

2. The list operation itself isn't so bad; the expense is in the follow-up calls for the details (like the network info I mentioned). Also, when we get into child zones we need to aggregate the child zone instances in the top-level zone, so we have to forward the 'nova list' calls to the children each time (where the total # of instances would be many times more than 100k).

The scheduler would create the new row when the instance is created, but the row updates would be done by the compute nodes themselves.
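Roughly, the row lifecycle described above might look like the sketch below. The InstanceCache class, its method names, and the field names are invented for illustration, and a dict stands in for the real table. The point is only that the list call reads one pre-built row per instance and makes no follow-up calls.

```python
class InstanceCache:
    """In-memory stand-in for a denormalized instance table (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def create_row(self, uuid, zone, host):
        # Called by the scheduler when it places a new instance.
        self._rows[uuid] = {
            "uuid": uuid,
            "zone": zone,
            "host": host,
            "state": "scheduling",
            "network_info": None,
        }

    def update_row(self, uuid, **fields):
        # Called by the compute node as the instance changes state.
        self._rows[uuid].update(fields)

    def list_instances(self):
        # Everything 'nova list' needs is already in the row.
        return [dict(row) for _, row in sorted(self._rows.items())]
```

For example, the scheduler calls create_row("a", "zone1", "compute1"), the compute node later calls update_row("a", state="active", network_info={"ip": "10.0.0.2"}), and list_instances() returns the finished row without touching the network tables.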

3. Normally, things like compute node host names don't get leaked out of a zone. But in a trusted environment we can get some optimizations from letting this information out. For example, we can route directly to a specific child zone and not have to ask each of them for a build plan. All of this would be stripped off when the public API returns info to the caller.
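As a small illustration of the trusted-zone idea, the sketch below keeps the zone and host on the internal record, uses them to route directly, and drops them before anything goes back to the caller. INTERNAL_KEYS, route_to_zone, and public_view are hypothetical names, not existing nova functions.

```python
# Fields that only a trusted parent zone should see (assumed names).
INTERNAL_KEYS = {"zone", "host"}


def route_to_zone(instance, zones):
    """Pick the child zone directly instead of asking every child for a build plan."""
    return zones[instance["zone"]]


def public_view(instance):
    """Return a copy of the record with the internal routing fields stripped."""
    return {k: v for k, v in instance.items() if k not in INTERNAL_KEYS}
```

With a record like {"uuid": "a", "state": "active", "zone": "zone1", "host": "compute1"}, route_to_zone uses the "zone" key internally, while public_view returns only the uuid and state.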

Hopefully this helps and didn't confuse matters :)

-S

________________________________________
From: Yun Mao [yunmao@xxxxxxxxx]
Sent: Thursday, December 01, 2011 1:37 PM
To: Sandy Walsh
Cc: nova-orchestration@xxxxxxxxxxxxxxxxxxx; nova-scaling@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Nova-orchestration] Tactical Scheduler/Scalability plan ...

Hi Sandy,

Thanks for the writeup. I have a few clarification questions first:

* deployment model of schedulers: should I assume the Diablo release
only supports a single scheduler instance? In the next release we can
deploy multiple of them in a stateless way, and the way they avoid
concurrency issues is to use the CapacityCache table with DB row
locks?

* nova list performance: do we have some numbers to support that the
performance is an issue? In my experience, query performance at 100k
rows with the right indexes built is not that bad. Also, my first
impression is that you are maintaining a materialized view
(a de-normalized, pre-computed table). That means that its information
is available elsewhere in the db. But later you mentioned that you
need a notifier to update state when new instances are created. That
sounds like new information that didn't exist in the db. Did I
misunderstand this?

* trusted zone: what is the information you want to leak from a child
zone to a parent zone? Who will be on the receiving end at the parent?

Thanks,

Yun

On Thu, Dec 1, 2011 at 6:53 AM, Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx> wrote:
> Hi y'all,
>
> Attached are some notes from the whiteboard noodling myself, comstud and anotherjesse did last week:
> http://wiki.openstack.org/EssexSchedulerImprovements
>
> It addresses some tactical problems we're facing and, hopefully, sets up a pattern for further improvements and hooks for things like orchestration.
>
> Look forward to your feedback!
>
> Cheers,
> Sandy
>
> --
> Mailing list: https://launchpad.net/~nova-orchestration
> Post to     : nova-orchestration@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~nova-orchestration
> More help   : https://help.launchpad.net/ListHelp

