launchpad-dev team mailing list archive

Re: memcache, responsiveness and load {short story, lets turn memcache off}

 

On Wed, Aug 4, 2010 at 5:28 PM, Robert Collins
<robert.collins@xxxxxxxxxxxxx> wrote:
> On Wed, Aug 4, 2010 at 7:33 PM, Stuart Bishop
> <stuart.bishop@xxxxxxxxxxxxx> wrote:
>> On Wed, Aug 4, 2010 at 9:46 AM, Robert Collins
>> <robert.collins@xxxxxxxxxxxxx> wrote:
>>
>>> Doing this will immediately close a half-dozen bugs, and focus our
>>> timeout and performance efforts closer to the actual source of our
>>> problems.
>>
>> Which bugs? The only open ones I'm aware of are feature requests which
>> I don't think are on anyone's radar to address.
>
> As Martin says, milestone and bugs pages are caching inappropriately.
> https://bugs.edge.launchpad.net/launchpad/+bug/601051 was originally
> filed in a pretty general way, got refreshed today.
> The 'hot bugs' list in a bug context is cached and is jarring (because
> a common thing to do with a hot bug is to triage it).
> https://bugs.edge.launchpad.net/malone/+bug/602936

So we need to either flag these as wontfix or, if we consider them
bugs, turn off caching for those pages. Turning off caching for these
pages is trivial - remove the cache: attributes in the TAL.

=== modified file 'lib/lp/bugs/templates/bugtarget-bugs.pt'
--- lib/lp/bugs/templates/bugtarget-bugs.pt	2010-06-15 19:37:37 +0000
+++ lib/lp/bugs/templates/bugtarget-bugs.pt	2010-08-04 10:59:03 +0000
@@ -107,9 +107,7 @@
         --></script>
       </tal:has_bugtasks>

-      <tal:has_hot_bugs condition="view/hot_bugs_info/bugtasks"
-        content="cache:private, 10 minute">
-        <!-- hot_bugs_info/bugtasks cache:private 10 minute -->
+      <tal:has_hot_bugs condition="view/hot_bugs_info/bugtasks">
         <h2>Hot bugs</h2>

         <table class="listing" id="hot-bugs"

https://code.edge.launchpad.net/~stub/launchpad/bug-602936-hotbug-caching/+merge/31735

The milestones issue would likely be just as trivial, although I'm
hesitant to turn it off without Curtis weighing in - this was
certainly a problem page and we shouldn't turn it back into a problem
page until someone has time to tackle it.


> This bug proposes using memcached to address cold cache problems by
> preloading a lot of batch sizes into it (not a great strategy IMO).
> https://bugs.edge.launchpad.net/rosetta/+bug/534203

memcached is listed as one of many options to help tackle the larger
problem. This isn't a memcached bug.


>> I'm not sure why turning the facility off will help peoples focus -
>> memcached was never about stopping timeouts and this has been
>> repeatedly stressed.
>
> That message and developer actions are mismatched.

They aren't. I think it has been used in two cases to reduce the
occurrence of timeouts and to buy people breathing room, but I think
everyone is aware that caching cannot stop timeouts. I've been the
person who has added almost all of the memcached usage into the
system, and all of it has been to improve throughput and reduce
overall load. Why spend seconds rerendering a bug's comments when they
never change?
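
As a rough illustration of that last point (not Launchpad's actual
code), the render-once-and-reuse idea can be sketched in Python.
TTLCache is a hypothetical in-process stand-in for a memcached client,
render_comments is a placeholder for the expensive rendering step, and
keying on the comment count is an assumption made for this sketch.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for a memcached client."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries behave like misses.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def render_comments(comments):
    """Placeholder for the expensive template rendering step."""
    return "\n".join("<p>%s</p>" % text for text in comments)


def cached_render(cache, bug_id, comments, ttl_seconds=600):
    # Assumption: existing comments never change, so the comment count
    # is enough to make a new comment miss the cache.
    key = "bug-comments:%d:%d" % (bug_id, len(comments))
    html = cache.get(key)
    if html is None:
        html = render_comments(comments)
        cache.set(key, html, ttl_seconds)
    return html
```

The first request still pays the full rendering cost, so a cache like
this helps throughput and load but does nothing for a request that
would time out on a cold cache.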


>> Using the facility will slow down initial loads,
>> and may improve the median page load time. It won't fix timeouts. It
>> may reduce the frequency of timeouts. The memcached infrastructure is
>> about improving scalability and overall performance, and throwing away
>> our 24% hit rate seems rather pointless (better rate than I was
>> expecting actually, but I guess the places it is being used have been
>> specifically targeted). The same rationales for turning off would be
>> applied to turning off Squid, which fills pretty much the same role
>> but more so by caching 100% of our unauthorized access.
>
> Caching anonymous stuff is rather easier, because anonymous queries
> can't create a situation with stale data - if the same user has a
> logged in and an unlogged in account they'll detect the difference,
> but not otherwise.

Sure. It is a facility developers can't control, so they can't create
bugs with it.

>
>> Turning it off is also rather problematic for foundations scalability
>> and performance work - it is the most likely replacement for OAuth
>> nonce and Session storage.
>
> So, turning it off may be over-broad; removing it where it is being
> abused is a more accurate way of stating my intention; I'd be happy if
> we turned it off before the rollout, landed a patch to rollback the
> inappropriate uses, and then turned it back on for appropriate uses. I
> certainly don't want to impede scalability and performance work. Using
> memcache for that does raise some concerns though: wouldn't a system
> outage of memcache (like say, a kernel vulnerability forcing a reboot)
> cause all users to have to log in again? We can take this to a
> different thread.

I think it is fine to roll back inappropriate uses. It still lets
people use it where it is appropriate and keeps the existing wins in
place. I also think it would be appropriate not to even consider using
it for something until any related timeouts have been fixed through
other means - it might be useful as a band-aid, but that isn't good for
the long term.

Yes, using memcached for OAuth and Session raises issues. For Session,
it is because we need to treat it as volatile storage, and memcached
outages or glitches would log people out. Perhaps this is fine,
perhaps we need a hybrid system - it's on foundations' radar and has
been discussed, but there are no commitments on direction. OAuth nonces
would have been on memcached already, as volatility isn't the issue
there - the issue that turned this into something non-trivial was
transactional integrity.
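
To make the "hybrid system" option concrete, here is a minimal Python
sketch of one possible shape. It is not a design anyone has committed
to. HybridSessionStore is a hypothetical name, and both backends are
plain dicts standing in for memcached (volatile) and a database table
(durable).

```python
class HybridSessionStore:
    """Write-through session store: volatile cache in front of durable storage."""

    def __init__(self, cache, durable):
        self.cache = cache      # dict-like; may lose entries at any time
        self.durable = durable  # dict-like; survives cache outages

    def save(self, session_id, data):
        # Write to durable storage first so a cache outage never loses a login.
        self.durable[session_id] = dict(data)
        self.cache[session_id] = dict(data)

    def load(self, session_id):
        data = self.cache.get(session_id)
        if data is None:
            # Cache miss or outage: fall back to durable storage and repopulate.
            data = self.durable.get(session_id)
            if data is not None:
                self.cache[session_id] = dict(data)
        return data
```

With this shape a memcached restart costs one slower read per session
rather than a forced login, at the price of keeping a durable write on
every session change.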


-- 
Stuart Bishop <stuart@xxxxxxxxxxxxxxxx>
http://www.stuartbishop.net/


