
openstack team mailing list archive

Re: A single cross-zone database?


Good point that pagination makes this harder.  However, thankfully the limit
is implemented using a token (the last ID seen), not an absolute offset, so
I believe we can still do pagination even in loosely coordinated DBs.  Good
job whoever dodged that bullet (Jorge?)
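
A minimal sketch of why a last-ID token survives separate per-zone DBs
(this is not code from Nova). Zone, list_instances and list_page are
hypothetical names, and it assumes instance IDs are unique and comparable
across zones. Each zone returns its next `limit` IDs after the marker, and
the merged, truncated result is the page. Any ID that belongs on the page
must be within its own zone's first `limit` IDs after the marker, so no
zone needs to know about the others.

```python
import heapq
from itertools import islice


class Zone:
    """Hypothetical stand-in for one zone with its own database."""

    def __init__(self, name, instance_ids):
        self.name = name
        self._ids = sorted(instance_ids)

    def list_instances(self, marker=None, limit=10):
        """Return up to `limit` IDs strictly greater than `marker`, in ID order."""
        after = [i for i in self._ids if marker is None or i > marker]
        return after[:limit]


def list_page(zones, marker=None, limit=10):
    """Build one page by asking every zone for its next `limit` IDs after `marker`."""
    per_zone = [zone.list_instances(marker, limit) for zone in zones]
    # heapq.merge lazily merges inputs that are already sorted.
    page = list(islice(heapq.merge(*per_zone), limit))
    # The token for the next request is simply the last ID on this page.
    next_marker = page[-1] if page else None
    return page, next_marker


if __name__ == "__main__":
    zones = [
        Zone("east", [1, 4, 7, 10]),
        Zone("west", [2, 3, 8]),
        Zone("south", [5, 6, 9]),
    ]
    marker = None
    while True:
        page, marker = list_page(zones, marker, limit=4)
        if not page:
            break
        print(page)
```

Note the cost: every page still fans out to every zone, so this keeps
pagination correct without making it cheap.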

(Aside #1: Sorting by uptime - now _that's_ expensive, and has to be
done for every page.)
(Aside #2: What is the point of pagination really?  I mean how many API
users that are getting a list actually stop before fetching them all?)


On Wed, Mar 16, 2011 at 9:23 AM, Paul Voccio <paul.voccio@xxxxxxxxxxxxx> wrote:

>  Sandy,
>  Not only is this expensive, but there is no way I can see at the moment
> to do pagination, which is what makes this really expensive. If someone
> asked for an entire list of all their instances and it was more than 10,000 then I
> would think they're ok with waiting while that response is gathered and
> returned. However, since the API spec says we should be able to do
> pagination, this is where asking each zone for all its children every time
> gets untenable.
>  Looking forward to the discussion. More below.
>   From: Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx>
> Date: Wed, 16 Mar 2011 14:53:37 +0000
> To: "openstack@xxxxxxxxxxxxxxxxxxx" <openstack@xxxxxxxxxxxxxxxxxxx>
> Subject: [Openstack] A single cross-zone database?
>   Hi y'all, getting any sleep before Feature Freeze?
>  As you know, one of the main design tenets of OpenStack is Share Nothing
> (where possible). http://wiki.openstack.org/BasicDesignTenets
>  That's the mantra we've been chanting with Zones. But it does cause a
> problem with a particular Use Case:
>  *"Show me all Customer X Instances, across all Zones."*
>  This is an expensive request. We have to poll all zones and ask them to
> return a list of matching instances.
>  There has been some water cooler chat about some things we could do to
> make this more efficient in the near term. One proposal has been to assume a
> single database, replicated across zones. I'll call it SDB for short. With
> SDB we can have a join table that links Zone to Instance ... keeping a
> record of all instances across zones. Maybe it's a completely separate set
> of tables? Maybe it's a separate replicated db? The intention is to let us
> talk to the appropriate zone directly.
>  Sure, there are a ton more optimizations we could make if we go further
> with SDB. We could store all the Zone capabilities in the db to make Zone
> selection faster. We could store all the customers in the db to make
> multi-tenant easier. But that's not what we're talking about here. We're
> talking about the *bare minimum* required to make the get_instances query
> fast.
>  Conversely, there are issues with a single DB. The largest being the
> implication it has on Bursting (Hybrid Private/Public clouds) ... a pretty
> funky feature imho.
>  Personally, I think the same query gains can be obtained by creating a
> separate db using off-the-shelf ETL tools to create cache/read-only db's.
> http://en.wikipedia.org/wiki/Extract,_transform,_load
>  Isn't the hard part keeping this in sync with what the zones have?
>  I was considering SDB for Zones (phase 4), but for now, I'm going to
> stick with the original plan of separate databases (1 per zone) and see what
> the performance implications are.
>  What are your thoughts on this issue?
>  ... let the games begin!
>  -S
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp