launchpad-dev team mailing list archive

Re: CodeBrowse: The Path Forward

 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


...

> I did another round of concrete timings on my machine using 'launchpad'
> as the data source. This isn't as big as emacs, but it is slow enough to
> show the results I'm trying to highlight.
> 
> lp-pqm  trunk   action
> 6.721   19.622  First view of launchpad/devel/changes
> 0.156    0.072  Reload launchpad/devel/changes
> 0.720    0.090  Restart loggerhead, reload launchpad/devel/changes
> 
> 6.345    2.208  Load launchpad/db-devel/changes (shared cache)
> 0.171    0.036  Reload launchpad/db-devel/changes
> 0.506    0.029  Restart loggerhead, reload launchpad/db-devel/changes
> 
> 		pull 1 new revision to launchpad/devel/changes
> 8.349	 0.190	Reload launchpad/devel/changes
> 
> 
> So concretely, building the cache from scratch is slower. It takes 20s
> to build whereas it used to take 6.7s to build. All other requests are
> faster, often *significantly* so.

https://code.launchpad.net/~jameinel/loggerhead/incremental_import/+merge/47722

Changes the time for the first view from:
  6.721   19.622  First view of launchpad/devel/changes
to
  6.721   11.760  First view of launchpad/devel/changes
Or down to 1.7:1 instead of 2.9:1.
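The ratios can be checked directly from the timings quoted above. This is just a small sketch of the arithmetic; the variable and function names are made up for illustration.

```python
# Timings in seconds, copied from the "First view" rows quoted above.
LP_PQM = 6.721
TRUNK_BEFORE = 19.622
TRUNK_AFTER = 11.760


def slowdown(trunk_seconds, baseline_seconds):
    """Return how many times slower trunk is than the lp-pqm baseline."""
    return trunk_seconds / baseline_seconds


if __name__ == "__main__":
    print("before the change: %.2f:1" % slowdown(TRUNK_BEFORE, LP_PQM))
    print("after the change:  %.2f:1" % slowdown(TRUNK_AFTER, LP_PQM))
```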

Quick summary of the change:

  if the historydb.sql file is being created, unset the incremental
  flag. This switches the code to using get_known_graph_ancestry()
  which has been tuned in bzrlib to both grab the whole ancestry
  faster (special lookahead tricks), and use a Pyrex KnownGraph object
  for computing things like gdfo.
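A minimal sketch of that decision, assuming the cache is a single file on disk. The helper name choose_incremental and its arguments are hypothetical and are not taken from the loggerhead code; the sketch only shows the "fresh cache means full import" rule, not the import itself.

```python
import os


def choose_incremental(cache_path, incremental=True):
    """Decide whether to import history incrementally.

    Hypothetical helper: if the cache file does not exist yet, it is
    about to be created, so the incremental flag is unset and the
    caller should load the whole ancestry in one go instead.
    """
    creating_cache = not os.path.exists(cache_path)
    if creating_cache:
        return False
    return incremental
```

With a rule like this, only the very first request against a new cache pays for the full-ancestry path; later requests keep whatever the caller asked for.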

If we go with trunk loggerhead this will:

 a) work if we use one cache directory for each branch (as it is done
    today)
 b) work if we use one cache directory for each project (which is the
    ideal case)
 c) not work if we use one big cache file, which is non-ideal for lots
    of other reasons.


Other things to mention are that I did a fair amount of code cleanup in
trunk. Specifically, the 'changes' view was pulling in the whole
ancestry just to do pagination. (It needed the revision id of a revision
X revisions ago, so that it could create a link to it.) I changed the
code so that it would request history from current tip to tip - N,
rather than all ancestry.

That change is why the 'reload current page' times in the table above
dropped from 156ms to 72ms.
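The pagination idea can be sketched roughly as follows. This is not the loggerhead code: it assumes a plain dict mapping each revision id to a tuple of its parents (left-hand parent first), and the function names are made up for illustration. The point is that only page_size + 1 revisions are walked, rather than the whole ancestry.

```python
from itertools import islice


def iter_mainline(parent_map, tip):
    """Yield revision ids from tip backwards along left-hand parents."""
    rev = tip
    while rev is not None:
        yield rev
        parents = parent_map.get(rev, ())
        rev = parents[0] if parents else None


def page_of_history(parent_map, tip, page_size):
    """Return (revisions on this page, revid that starts the next page).

    Only page_size + 1 revisions are requested; the extra one is the
    target of the "older" link, or None if history ends on this page.
    """
    window = list(islice(iter_mainline(parent_map, tip), page_size + 1))
    page = window[:page_size]
    next_start = window[page_size] if len(window) > page_size else None
    return page, next_start
```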

So there is a *lot* of goodness in trunk that we are missing out on in
the pqm branch.



Another page that I poked at, and had previously spent time on, shows
similar results:
lp-pqm  trunk   action
6.858   13.441  launchpad/devel/files
0.401    0.366  reload launchpad/devel/files
0.819    0.435  restart and reload launchpad/devel/files



As for security and privacy concerns: I'm pretty confident about the
historydb cache, because I know what is stored there. There is also
another SQL cache, the 'filechanges.sql' cache. This was written because
computing inventory deltas pre-2a was very slow. I think we could
disable it completely now, and then we don't have to worry about a stale
cache, etc.

The only pages that use it are "revision_ui.py" and "revlog_ui.py".
Looking at revision_ui, the times are (reloading N times, best-of-N):

cache miss	cache hit	no cache
851ms		810ms		828ms

Note that the cache hit rarely went above 900ms, while the cache miss
and no cache results did get into 1000ms or so. So the cache does help,
but not a lot, and adds a fair amount of complexity.

The one place where the file cache helps a lot is old format branches.
But 2a has been out for a long time. And while Launchpad's pqm branch of
loggerhead is still 1.9, for any branch which is large enough that we
care about performance, I'm pretty sure it has been migrated to 2a for a
while now.


John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1B3XoACgkQJdeBCYSNAANM/gCcC8GDtMhydk89SzTqcUNKEvnK
Zp4AoMA9Sw+LFlr4FE0vfZh8MtlRuNCo
=qjrs
-----END PGP SIGNATURE-----


