openstack team mailing list archive

Re: Caching strategies in Nova ...


On 3/23/12 8:56 AM, "Sandy Walsh" <sandy.walsh@xxxxxxxxxxxxx> wrote:

>On 03/23/2012 09:44 AM, Gabe Westmaas wrote:
>> I'd prefer to just set a different expectation for the user.  Rather
>>than worrying about state change and invalidation, let's just set the
>>expectation that the system as a whole is eventually consistent.  I
>>would love to prevent any cache busting strategies or expectations as
>>well as anything that requires something other than time based data
>>refreshing.  We can all agree, I hope, that there is some level of
>>eventual consistency even without caching in our current system.  The
>>fact is that db updates are not instantaneous with other changes in the
>>system; see snapshotting, instance creation, etc.
>I think that's completely valid. The in-process caching schemes are
>really just implementation techniques. The end-result (of view tables vs
>key/value in-memory dicts vs whatever) is the same.
Agreed! As long as the interface doesn't imply one implementation over
another (see below).

>> What I'd like to see is additional fields included in the API response
>>that say how old this particular piece of data is.  This way the consumer
>>can decide if they need to be concerned about the fact that this state
>>hasn't changed, and it allows operators to tune their system to whatever
>>their deployments can handle.  If we are exploring caching, I think that
>>gives us the advantage of not a lot of extra code that worries about
>>invalidation, allowing deployers to not use caching at all if it's
>>unneeded, and paves the way for view tables in large deployments which I
>>think is important when we are thinking about this on a large scale.
>My fear is clients will simply start to poll the system until new data
>magically appears. An alternative might be, rather than say how old the
>data is, how long until the cache expires?
Definitely a valid concern.  However, I kind of expect that many users
will still poll even if they know they won't get new data until X time.
In addition, I think if we say how old the data is, it still implies too
much knowledge unless we go with a strict caching system.  I'd love for us
to leave the ability for us to update that data asynchronously, and
hopefully really quickly, except in the cases where the system is under
unexpected load.  Basically, if we give them that information, and we miss
it, that's a call in to support, not to say they won't call in if it takes
too long to update, of course.

Also, if it's hitting a cache or something optimized for GETs, hopefully we
can handle lots of polling by adding more API nodes.
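
To make the time-based refresh idea above a bit more concrete, here is a rough Python sketch of a cache that refreshes purely on a timer and hands back both "how old is this" and "how long until it refreshes" alongside the data. Everything in it (TimedCache, CachedView, the loader callback, the field names) is made up for illustration and is not existing Nova code or a proposed API.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class CachedView:
    """Hypothetical response wrapper: the data plus freshness metadata."""
    value: Any
    age_seconds: float         # how old this piece of data is
    expires_in_seconds: float  # how long until the next time-based refresh


class TimedCache:
    """Time-based refresh only: no invalidation hooks, no cache busting.

    A ttl_seconds of 0 reloads on every call, which effectively turns
    caching off for deployments that do not need it.
    """

    def __init__(self, ttl_seconds: float, loader: Callable[[str], Any],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._loader = loader  # e.g. a function that reads from the db
        self._clock = clock    # injectable so the behavior can be tested
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> CachedView:
        now = self._clock()
        entry = self._entries.get(key)
        # Reload when there is no entry or the entry has outlived its TTL.
        if entry is None or now - entry[1] >= self._ttl:
            entry = (self._loader(key), now)
            self._entries[key] = entry
        value, fetched_at = entry
        age = now - fetched_at
        return CachedView(value, age, max(self._ttl - age, 0.0))
```

An API layer could expose either field (or neither) from CachedView. Which one to show the user is exactly the open question in this thread, so the sketch carries both rather than picking a side.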


>> Gabe
>>> -----Original Message-----
>>> From: openstack-bounces+gabe.westmaas=rackspace.com@xxxxxxxxxxxxxxxxxxx
>>> [mailto:openstack-bounces+gabe.westmaas=rackspace.com@xxxxxxxxxxxxxxxxxxx]
>>> On Behalf Of Sandy Walsh
>>> Sent: Friday, March 23, 2012 7:58 AM
>>> To: Joshua Harlow
>>> Cc: openstack
>>> Subject: Re: [Openstack] Caching strategies in Nova ...
>>> Was reading up some more on cache invalidation schemes last night. The
>>> best practice approach seems to be using a sequence ID in the key. When
>>> you want to invalidate a large set of keys, just bump the sequence id.
>>> This could easily be handled with a notifier that listens to instance
>>> changes.
>>> Thoughts?
>>> On 03/22/2012 09:28 PM, Joshua Harlow wrote:
>>>> Just from experience.
>>>> They do a great job. But the killer thing about caching is how u do
>>>> the cache invalidation.
>>>> Just caching stuff is easy-peasy, making sure it is invalidated on all
>>>> servers in all conditions, not so easy...
>>>> On 3/22/12 4:26 PM, "Sandy Walsh" <sandy.walsh@xxxxxxxxxxxxx> wrote:
>>>>     We're doing tests to find out where the bottlenecks are. Caching is
>>>>     the most obvious solution, but there may be others. Tools like
>>>>     memcache do a really good job of sharing memory across servers, so
>>>>     we don't have to reinvent the wheel or hit the db at all.
>>>>     In addition to looking into caching technologies/approaches, we're
>>>>     putting together some tools for finding those bottlenecks. Our first
>>>>     step will be finding them, then squashing them ... however.
>>>>     -S
>>>>     On 03/22/2012 06:25 PM, Mark Washenberger wrote:
>>>>     > What problems are caching strategies supposed to solve?
>>>>     >
>>>>     > On the nova compute side, it seems like streamlining db access and
>>>>     > api-view tables would solve any performance problems caching could
>>>>     > address, while keeping the stale data management problem small.
>>>>     >
>>>>     _______________________________________________
>>>>     Mailing list: https://launchpad.net/~openstack
>>>>     Post to     : openstack@xxxxxxxxxxxxxxxxxxx
>>>>     Unsubscribe : https://launchpad.net/~openstack
>>>>     More help   : https://help.launchpad.net/ListHelp
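
For reference, the sequence-ID scheme Sandy mentions in the quoted thread (put a sequence ID in the cache key, bump it to invalidate a whole set of keys) can be sketched roughly as below. This is only an illustration under assumptions: a plain dict stands in for memcache, and the names GenerationalCache and on_instance_change and the "instances:<tenant>" namespace layout are hypothetical, not anything that exists in Nova.

```python
from typing import Any, Dict, Optional


class GenerationalCache:
    """Cache whose keys embed a per-namespace sequence ID.

    Bumping the sequence ID makes every key written under the old ID
    unreachable, which invalidates the whole set in one step.
    """

    def __init__(self, store: Optional[Dict[str, Any]] = None) -> None:
        # A dict stands in for a shared store such as memcache.
        self._store: Dict[str, Any] = {} if store is None else store

    def _seq(self, namespace: str) -> int:
        # The current sequence ID lives in the store next to the data.
        return self._store.setdefault("seq:" + namespace, 0)

    def _key(self, namespace: str, key: str) -> str:
        return "%s:%d:%s" % (namespace, self._seq(namespace), key)

    def get(self, namespace: str, key: str) -> Any:
        return self._store.get(self._key(namespace, key))

    def set(self, namespace: str, key: str, value: Any) -> None:
        self._store[self._key(namespace, key)] = value

    def invalidate(self, namespace: str) -> None:
        # Old entries are not deleted, just never looked up again.
        self._store["seq:" + namespace] = self._seq(namespace) + 1


def on_instance_change(cache: GenerationalCache, tenant_id: str) -> None:
    """Hypothetical notifier callback: any instance change for a tenant
    invalidates that tenant's cached instance data."""
    cache.invalidate("instances:" + tenant_id)
```

With a real memcache the orphaned entries would age out through normal eviction or expiry. The cost is one extra lookup for the sequence ID on every read, and every server has to read that ID from the shared store rather than from local memory.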
