openstack team mailing list archive

Re: Multi-Cluster/Zone - Devil in the Details ...

To: Eric Day <eday@xxxxxxxxxxxx>
From: Jay Pipes <jaypipes@xxxxxxxxx>
Date: Wed, 16 Feb 2011 15:59:10 -0500
Cc: "openstack@xxxxxxxxxxxxxxxxxxx" <openstack@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20110216191706.GB1167@oddments.org>

On Wed, Feb 16, 2011 at 2:17 PM, Eric Day <eday@xxxxxxxxxxxx> wrote:
> On Wed, Feb 16, 2011 at 01:02:22PM -0500, Jay Pipes wrote:
>> >> [Sorry, yes the instance name is passed in on the request, but the instance ID is what's needed (assuming of course instance ID is unique across zones.)]
>> >
>> >        The ID is determined early in the process; well before the request to create an instance is cast onto the queue, and is returned immediately.
>>
>> The instance ID is constructed in nova.compute.api.API.create(), which
>> is called by nova.api.openstack.servers.Controller.create().
>>
>> In other words, Sandy needs to find an appropriate zone to place an
>> instance in. Clearly, this logic must happen before the instance is
>> created in the database.
>
> On top of this, the instance is created in the DB before being passed
> to the scheduler too, which is obviously a problem since the scheduler
> may proxy this to another zone, and this old instance row in the
> DB is abandoned. We need to not touch the DB until we get to the
> compute node, which was what I was working on as a prerequisite for
> this blueprint during Bexar. This, along with some other fundamental
> changes, is required before we can move too far along with multi-zone.
>
> We never want to generate the ID any other place besides the final
> zone. We should be using a zone-unique ID + zone name for instance
> (and other object) naming.

++, and your URI naming scheme may work out here...
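To make the "zone-unique ID + zone name" idea concrete, here is a minimal sketch. Everything in it is hypothetical (the `InstanceName` and `NamingZone` classes, the `zone/id` string form); it is not Nova code and not Eric's actual URI scheme. It only shows that an ID generated in the final zone becomes globally unique once it is paired with that zone's name.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class InstanceName:
    """Hypothetical global name: owning zone plus a zone-local ID."""
    zone: str       # name of the zone that finally hosts the instance
    local_id: int   # unique only within that zone

    def __str__(self) -> str:
        return f"{self.zone}/{self.local_id}"

    @classmethod
    def parse(cls, text: str) -> "InstanceName":
        # Split on the last "/" so nested zone names such as "us/east" survive.
        zone, _, local_id = text.rpartition("/")
        return cls(zone, int(local_id))


class NamingZone:
    """Hypothetical zone that hands out IDs only for its own instances."""

    def __init__(self, name: str):
        self.name = name
        self._ids = count(1)  # stand-in for a per-zone database sequence

    def create_instance(self) -> InstanceName:
        # The ID is generated here, in the final zone, and nowhere else.
        return InstanceName(self.name, next(self._ids))


east, west = NamingZone("east"), NamingZone("west")
a, b = east.create_instance(), west.create_instance()
```

Both zones hand out local ID 1, yet `a` and `b` differ because the zone name is part of the identity.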

But, as I mentioned to Sandy on IRC, caching and performance should be
a secondary concern. The primary concern, right now, is just making
this all work. In other words, getting multiple zones up and running,
each knowing about its little slice of the pie, and communicating up
and down the scheduler levels properly.

Optimization can come later.

>> >        I know that. I'm just stating that this is a natural consequence of the decision not to use a centralized db.
>>
>> The data set queried in the database used for a zone only contains
>> information necessary for the scheduler workers in that zone to make a
>> decision, and nothing more.
>
> ++
>
>> >>>> One alternative is to make Host-Best-Match/Zone-Best-Match stand-alone query operations.
>> >>>
>> >>>        I don't really like this approach. It requires the requester to know too much about the implementation of the service: e.g., that there are zones, and that an instance will be placed in a particular zone. I would prefer something more along the lines of:
>> >>>
>> >>> a. User issues a create-instance request, supplying the name of the instance to be created.
>> >>> b. The top-level zone that receives the request does a zone-best-match and/or host-best-match call to determine where the instance will be created.
>> >>> c. The top-level zone then passes the create-instance request to the selected zone/host.
>
> ++
>
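Steps a through c above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Nova scheduler: the `Host` and `SchedulingZone` classes, the `best_match` method, and the "most free RAM wins" rule are all made up for the example. The point is only that the caller talks to the top-level zone and never needs to know which child zone is chosen.

```python
class Host:
    """Hypothetical compute host with a single capacity figure."""

    def __init__(self, name, free_ram_mb):
        self.name = name
        self.free_ram_mb = free_ram_mb
        self.instances = []


class SchedulingZone:
    """Hypothetical zone that may hold hosts and/or child zones."""

    def __init__(self, name, hosts=(), children=()):
        self.name = name
        self.hosts = list(hosts)
        self.children = list(children)

    def best_match(self, ram_mb):
        """Return (zone, host) with the most free RAM that fits, or None."""
        candidates = [(self, h) for h in self.hosts if h.free_ram_mb >= ram_mb]
        for child in self.children:
            match = child.best_match(ram_mb)  # each child answers for its subtree
            if match is not None:
                candidates.append(match)
        if not candidates:
            return None
        return max(candidates, key=lambda zone_host: zone_host[1].free_ram_mb)

    def create_instance(self, name, ram_mb):
        """Steps b and c: pick a zone/host, then place the instance there."""
        match = self.best_match(ram_mb)
        if match is None:
            raise RuntimeError("no host can fit the request")
        zone, host = match
        host.free_ram_mb -= ram_mb
        host.instances.append(name)
        return zone.name, host.name


east = SchedulingZone("east", hosts=[Host("e1", 2048)])
west = SchedulingZone("west", hosts=[Host("w1", 8192)])
top = SchedulingZone("top", children=[east, west])
placement = top.create_instance("web1", 4096)  # step a: user supplies a name
```

A real deployment would forward the request over the zone's API or queue rather than mutate the host object directly; the direct call here just keeps the sketch short.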
>> Why are we assuming a requester doesn't know much about the
>> implementation of the service? I mean, the requester is going to be an
>> application like the Cloud Servers console, not just some random user.
>
> But it can be some random user, many folks script against the public API.

I wasn't saying it *couldn't* be a random user, just that in many
cases, it is a "user" like a Control Panel, that can have a better
understanding of the underlying environment setup...

>>  Of course the requester knows something about the implementation of
>> the service, and if they don't, the work Sandy did in the first phase
>> of this blueprint allows the requester to query the admin API for
>> information about the zones...
>
> Pushing the zone list out to the client just punts on the whole routing
> issue. That means the client needs to do the work instead, and needs to
> either scan or track the zone for each instance they create. Some folks
> have said they don't want to expose any of their topology and would
> most likely want everything routing through a top-level API endpoint.
>
> For ease of use for the API user, and to accommodate deployments that
> don't expose topology, we need to support routing of all requests
> inside the parent zones.

Sure, I don't disagree here. I was just suggesting to Sandy that some
*future* optimization might entail a set of clients that had a bit
more knowledge about the underlying environment, that's all.

>> >> [But what about subsequent actions ... the same zone-search would have to be performed for each of them, no?]
>> >
>> >        This was one of the issues we discussed during the sprint planning. I believe (check with cyn) that the consensus was to use a caching strategy akin to DNS: e.g., if zone A got a request for instance ID=12345, it would check to see if it had id 12345 in its cache. If not, it would ask all of its child nodes if they knew about that instance. That would repeat until the instance was found, at which point every upstream server would now know about where to reach 12345.
>>
>> Agreed. Each "level" or zone in the overall architecture would cache
>> (in our case, cache means a record in the zone's database) information
>> about its subordinate nodes (nodes being instances or other zones,
>> depending on the "level" of the zone in the overall architecture).
>
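The DNS-like lookup described above might look something like this. It is a sketch only: `CachingZone`, `find`, and the in-memory `route_cache` dict are hypothetical names (the thread says the real "cache" would be a record in the zone's database), and stale entries, for example after an instance is deleted or moved, are not handled.

```python
class CachingZone:
    """Hypothetical zone that remembers which child leads to an instance."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.local_instances = set()
        self.route_cache = {}  # instance_id -> child zone that leads to it
        self.queries = 0       # counts lookups, only to show the cache working

    def find(self, instance_id):
        """Return the name of the zone that owns instance_id, or None."""
        self.queries += 1
        if instance_id in self.local_instances:
            return self.name
        cached = self.route_cache.get(instance_id)
        if cached is not None:
            return cached.find(instance_id)  # follow the known route only
        for child in self.children:          # otherwise ask every child
            owner = child.find(instance_id)
            if owner is not None:
                self.route_cache[instance_id] = child  # remember for next time
                return owner
        return None


zone_a = CachingZone("a")
zone_b = CachingZone("b")
zone_b.local_instances.add(12345)
mid = CachingZone("mid", [zone_a, zone_b])
top = CachingZone("top", [mid])

first = top.find(12345)          # walks the tree, asking zone "a" on the way
asked_a_once = zone_a.queries
second = top.find(12345)         # follows cached routes; zone "a" is skipped
```

After the first lookup every zone on the path knows which child to ask, so the second lookup never touches zone "a".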
> This doesn't help the 'list all instances' search. This would be
> very expensive when dealing with a large number of accounts and
> instances. We need a more active caching policy, which ends up being
> more of a replication subset than a cache. Initially we can just
> fan out the query to make it work, but to be usable in any large
> capacity, we need a much smarter data model underneath. These are
> all things we discussed at the last design summit, if folks remember
> those discussions. :)
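
The "just fan out the query" starting point mentioned above can be sketched as a plain recursive walk. The `FanoutZone` class and its `(account, instance_id)` tuples are assumptions made for illustration; the sketch also shows why this gets expensive, since every zone in the tree is visited on every listing.

```python
class FanoutZone:
    """Hypothetical zone holding (account, instance_id) pairs."""

    def __init__(self, name, instances=(), children=()):
        self.name = name
        self.instances = list(instances)
        self.children = list(children)

    def list_instances(self, account):
        """Collect (zone_name, instance_id) for one account across the subtree."""
        found = [(self.name, iid) for acct, iid in self.instances if acct == account]
        for child in self.children:
            # No cache, no replication: every child is asked every time.
            found.extend(child.list_instances(account))
        return found


east = FanoutZone("east", instances=[("alice", 1), ("bob", 2)])
west = FanoutZone("west", instances=[("alice", 1)])
top = FanoutZone("top", children=[east, west])
alice_instances = top.list_instances("alice")
```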

Agreed, but this is IMHO an optimization. Sandy should first focus on
getting the zone stuff working, then we can all optimize away ;)

-jay
