
launchpad-dev team mailing list archive

Re: performance dashboard?

 

On May 7, 2010, at 1:48 AM, Robert Collins wrote:

> I was at devopsdownunder last weekend and saw a demo of a very
> interesting tool. Have a look at
> http://rpm.newrelic.com/v2/accounts/12842/applications/113766 -
> ignoring the bling, it's a tool for individual and aggregated
> statistics on *every single request* going through an application
> stack.
> 
> Kind of what we get with oops reports (database time, python time) but
> pervasive rather than only-on-the-broken-requests.
> 
> I think one of the challenges with performance work at the moment -
> and please, correct me if I'm wrong - is that individual developers
> can't easily, routinely see where things are at. Right now, when
> someone asks 'why is xxx slow', the best we can do is:
> - add ++oops++ to the url to trigger an oops
> - wait 3 +- 3 minutes for it to sync
> - look it up on the oops website
> 
> This has two issues:
> - we can't see if it's *usual* for that page to be slow, or if it's
> unusually slow for one individual.
> - it's slow and cumbersome.
> 
> For instance, if we want 100ms page generation, it would be terribly
> useful to be able to see that right now, on average, we're spending
> (say) 60ms in the database.
> 
> Now, I'm not suggesting we go out and invent such a dashboard itself -
> there's going to be a tonne of investment needed to do that, but
> perhaps there is an open source version of this out there already for
> zope apps? Or perhaps we could look at providing a zope plugin to talk
> to newrelic?

(I've been meaning to reply to this for a long time.  I happened to see the email again, so...)

zc.zservertracelog generates the data for some of this, and is a framework for collecting more of it.

It creates somewhat rudimentary but still useful reports.  We generate the data, and we've used the reports occasionally, but we have not made the reports easy to use nor significantly publicized them.

This is the description of a pertinent Foundation task currently in progress by Stuart.  My hope is that this will quickly produce an ugly but very useful view on this kind of data.  My additional hope is that we will find it so useful that we will want to make it better.

---
Broad goal: make a tool that makes it really obvious and easy to discover what pages we ought to improve with memcached (or whatever tool), and track our improvements.

We will be responsible for using this tool to actually show improvement, so we should think of this as building a tool for ourselves.

The following points are my ideas towards that goal.  If you have others, please propose them.

- Make the report aggregate by pageid rather than by URI

- Generate the report automatically and make it available on devpad (or rookery or wherever the right place for this sort of thing is these days)

- Make it clear how much time is spent doing work in Zope, and how much is spent waiting for a worker thread.

- Nice to have: separate out webservice requests in report (or into a separate report?).

- Nice to have: include average DB ticks along with the existing info (number of requests, average length of requests, "impact")
---
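
To make the points above concrete, here is a rough sketch of the kind of per-pageid summary I have in mind.  It is not the actual report code and it does not read the real zc.zservertracelog output: the Request record, its field names, and the sample pageid are all made up for illustration, and "impact" is assumed here to mean total time spent in Zope for a pageid, which may not match what the existing report calls impact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical, already-parsed record; not the zc.zservertracelog format.
    pageid: str          # e.g. "BugTask:+index" (illustrative only)
    wait_seconds: float  # time spent waiting for a worker thread
    app_seconds: float   # time spent doing work in Zope
    db_ticks: int        # database ticks recorded for the request


def summarize(requests):
    """Group requests by pageid and return one summary row per pageid."""
    groups = defaultdict(list)
    for request in requests:
        groups[request.pageid].append(request)

    rows = []
    for pageid, reqs in groups.items():
        count = len(reqs)
        total_app = sum(r.app_seconds for r in reqs)
        rows.append({
            "pageid": pageid,
            "requests": count,
            "avg_wait": sum(r.wait_seconds for r in reqs) / count,
            "avg_app": total_app / count,
            "avg_db_ticks": sum(r.db_ticks for r in reqs) / count,
            # Assumption: impact is the total Zope time for this pageid.
            "impact": total_app,
        })

    # Highest impact first, so the pages most worth improving lead the report.
    rows.sort(key=lambda row: row["impact"], reverse=True)
    return rows
```

Keeping the wait time and the Zope time in separate columns is the point of the third item above: a pageid that looks slow only because its requests queue for a thread is a different problem from one that is slow to render.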

Gary

