openstack team mailing list archive

Re: Multi-Cluster/Zone - Devil in the Details ...

To: Eric Day <eday@xxxxxxxxxxxx>, Soren Hansen <soren@xxxxxxxxxx>
From: Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx>
Date: Thu, 17 Feb 2011 02:51:53 +0000
Accept-language: en-US
Cc: "openstack@xxxxxxxxxxxxxxxxxxx" <openstack@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20110216222132.GA7332@oddments.org>
Thread-topic: [Openstack] Multi-Cluster/Zone - Devil in the Details ...

Thanks for the feedback Soren.

I agree that caching can be error-prone. When I first read your email I started to panic that we were taking the wrong tack. But the more I think about it, the less bad it looks.

For basic routing operations, I *think* caching should be fine. Questions like:

"Do you have any instances for customer X?"
"Do you have GPU-based hosts?"
"Are you in North America?"
"Do you have Ubuntu 10.10 images?"

These capabilities are largely static, so cached answers should stay accurate for long enough to route on.
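
As a rough sketch of the cached side, a parent zone could keep each child's advertised capabilities with a time-to-live and answer routing questions from that. The names `CapabilityCache` and `zones_matching`, the `fetch` callback and the capability keys are all made up for illustration; none of them come from the existing code.

```python
import time


class CapabilityCache:
    """Hypothetical parent-side cache of child-zone capabilities."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # assumed callback: zone name -> dict of capabilities
        self._ttl = ttl_seconds    # how long a cached answer is trusted
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # zone name -> (expiry time, capabilities)

    def get(self, zone):
        """Return the zone's capabilities, refetching only when the entry has expired."""
        now = self._clock()
        entry = self._entries.get(zone)
        if entry is None or entry[0] <= now:
            entry = (now + self._ttl, self._fetch(zone))
            self._entries[zone] = entry
        return entry[1]


def zones_matching(cache, zones, **wanted):
    """Keep the zones whose cached capabilities equal every requested key/value."""
    return [
        zone for zone in zones
        if all(cache.get(zone).get(key) == value for key, value in wanted.items())
    ]
```

With something like this, "Do you have GPU-based hosts?" becomes `zones_matching(cache, child_zones, gpu=True)`, and a child is only contacted again once its entry expires.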

But at provisioning time, we need to bypass the cache and go directly to the zone, particularly for the create_volume() and create_instance() operations. For those, we need to go to each child zone and ask things like:

"Do you have 512M ram, 20G disk and 300GB bandwidth available now?"

Or, am I over-simplifying this scenario?
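
To make the provisioning-time check concrete, here is a minimal sketch, assuming each child zone can be asked directly for its currently free resources. `Request`, `pick_zone` and the `ask_zone` callback are hypothetical names, and the dict keys are assumptions rather than an existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Resources needed by one create_instance()-style call (illustrative fields)."""
    ram_mb: int
    disk_gb: int
    bandwidth_gb: int


def pick_zone(candidates, request, ask_zone):
    """Return the first candidate zone with enough free capacity right now.

    ask_zone(zone) is assumed to query the zone directly, bypassing any cache,
    and to return a dict with free "ram_mb", "disk_gb" and "bandwidth_gb".
    Returns None when no candidate can satisfy the request.
    """
    for zone in candidates:
        free = ask_zone(zone)
        if (
            free["ram_mb"] >= request.ram_mb
            and free["disk_gb"] >= request.disk_gb
            and free["bandwidth_gb"] >= request.bandwidth_gb
        ):
            return zone
    return None
```

The candidate list would come from the cached capability answers, so only zones that already pass the static checks get the live query. Even a live answer can go stale between the check and the create call, so a real scheduler would presumably still need some form of retry or reservation.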

-S

________________________________________
From: openstack-bounces+sandy.walsh=rackspace.com@xxxxxxxxxxxxxxxxxxx [openstack-bounces+sandy.walsh=rackspace.com@xxxxxxxxxxxxxxxxxxx] on behalf of Eric Day [eday@xxxxxxxxxxxx]
Sent: Wednesday, February 16, 2011 6:21 PM
To: Soren Hansen
Cc: openstack@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Openstack] Multi-Cluster/Zone - Devil in the Details ...

Good points Soren, and this is why I was suggesting we not solve this
problem with a cache, but instead an eventually consistent replication
stream of the aggregate data.

-Eric

On Wed, Feb 16, 2011 at 11:12:50PM +0100, Soren Hansen wrote:
> 2011/2/16 Ed Leafe <ed@xxxxxxxxx>:
> > This was one of the issues we discussed during the sprint planning. I believe (check with cyn) that the consensus was to use a caching strategy akin to DNS: e.g., if zone A got a request for instance ID=12345, it would check to see if it had ID 12345 in its cache. If not, it would ask all of its child nodes if they knew about that instance. That would repeat until the instance was found, at which point every upstream server would know where to reach 12345.
>
> Has any formal analysis been done as to how this would scale?
>
> I have a couple of problems with this approach:
>
>  * Whenever I ask something for information and I get out-of-date,
> cached data back I feel like I'm back in 2003. And 2003 sucked, I
> might add.
>  * Doesn't this caching strategy only help if people are asking for
> the same stuff over and over? It doesn't sound very awesome if 100
> requests for new stuff coming in at roughly the same time cause a
> request to be sent to every single compute node (or wherever the data
> actually resides). I'm assuming breadth-first search here, of course.
>
>
> --
> Soren Hansen
> Ubuntu Developer    http://www.ubuntu.com/
> OpenStack Developer http://www.openstack.org/
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
