openstack team mailing list archive
Message #01358
Re: A single cross-zone database?
Inline...
> Can someone explain _why_ we need caching?
>
> We don't *need* caching - it is simply the most direct way to avoid
> multiple expensive calls.
So if we don't need it...? You cite avoiding expensive calls, but I think
it's entirely unproven that those calls are too expensive.
If it makes you happier, think of the child zones as a cache that just
happens to not have any inconsistency issues. I see Eric's point that it
would be better to integrate it up-front, but let's get ten zones working
first, before we worry about how we can scale to ten thousand zones.
I think this should be Sandy's call. My interpretation is that he'd like to
implement a correct algorithm first, to understand where the bottlenecks
are. I think that's the right way. Equally, if Sandy says he wants to do
caching, then that's fine with me also.
> > With our approach to pagination, without caching, the answer is always
> correct: each query always returns the next {limit} values whose ID is >=
> {start-id}.
>
> But for this example, you have to traverse *all* the zones in order
> to get the list of instances for a customer. How else would you define
> "instances 101-199 of 823 total instances"? How would you know where #101
> is?
>
You do need to traverse all zones, correct (though note that you don't
need to know where #101 is). I think you may have missed the point about
token-based pagination: the query is not "instances at indexes 101-199 of
823"; it is, e.g., "the next 100 instances whose id > 'i-198273'".
But how is this a problem? How many zones are we talking about? If this is
bigger than a few dozen, why?
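To make the token-based query concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the actual zone code: the ZONES dict, query_zone() and list_instances() are hypothetical names, the zones are faked in memory, and IDs are assumed to be fixed-width strings so that plain string comparison gives the ordering. Each zone is asked for its next {limit} IDs after the marker, and the per-zone answers are merged; no zone needs to know where "#101" is.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-in for the child zones. A real deployment
# would issue one API call per zone instead of reading a dict.
ZONES = {
    "zone-a": ["i-100", "i-250", "i-400"],
    "zone-b": ["i-150", "i-300"],
    "zone-c": ["i-050", "i-350", "i-500"],
}


def query_zone(zone, marker, limit):
    """Return up to `limit` IDs in `zone` with id > marker, in sorted order."""
    ids = sorted(i for i in ZONES[zone] if marker is None or i > marker)
    return ids[:limit]


def list_instances(marker=None, limit=100):
    """Return the next `limit` IDs across all zones with id > marker."""
    # Every zone is traversed, but each returns at most `limit` rows.
    per_zone = [query_zone(zone, marker, limit) for zone in ZONES]
    # Merging the sorted per-zone lists and keeping the first `limit`
    # entries yields the globally smallest IDs after the marker.
    return list(islice(heapq.merge(*per_zone), limit))


def iter_pages(limit):
    """Walk the whole listing page by page, using the last ID as the token."""
    marker = None
    while True:
        page = list_instances(marker, limit)
        if not page:
            return
        yield page
        marker = page[-1]
```

The client only ever passes back the last ID it saw, so each page is computed from the current state of the zones rather than from a stored offset. The cost is one query per zone per page, which is the "how many zones?" question above.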
>
> > I agree that in practice this means that there's no way to guarantee you
> get all values while they're changing behind the scenes, but this is a
> shortcoming of pagination, not caching. Caching doesn't solve this, it just
> creates thornier edge cases. The solution here is a more sensible ordering
> than 'last modified', and I question the value of pagination (other than for
> compatibility)
>
> The question is performance in this particular use case. As pvo
> said, you can accept a certain level of inconsistency as a trade-off for
> scaling. We would then allow the TTL to be controlled so that different
> OpenStack deployments can adjust this to their needs.
>
Except every binding to the CloudServers API has a hacky way to bypass
caching. We have direct evidence that people don't want to accept these
caching trade-offs / edge cases.
It sounds to me like you're saying: "But caching is web-scale" :-)
Justin