launchpad-dev team mailing list archive
Message #06971
performance tuesday - changing timeouts, new server progress
Some -very- good news.
With the combination of performance work and the reconfigured
appservers we've now reduced our 99th percentile to 1.59 seconds
(measured on Sunday). When I started as TA 10 months back our 99th
percentile was 2.24 seconds. This is slow and tedious work, but
immensely valuable to our users. I'm really happy with the progress
we've made - we're 50% of the way to having a 1 second 99th
percentile.
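As a quick check on the "50% of the way" figure, here is a minimal sketch of the arithmetic using only the three numbers quoted above. The function and constant names are illustrative, not anything from the Launchpad codebase.

```python
# Figures quoted in the message; names are hypothetical, for illustration only.
START_P99 = 2.24    # seconds, 99th percentile 10 months ago
CURRENT_P99 = 1.59  # seconds, 99th percentile measured on Sunday
TARGET_P99 = 1.0    # seconds, the stated goal


def progress_fraction(start, current, target):
    """Return the fraction of the start-to-target gap closed so far."""
    return (start - current) / (start - target)


progress = progress_fraction(START_P99, CURRENT_P99, TARGET_P99)
print(f"{progress:.0%} of the way to the target")  # prints: 52% of the way to the target
```

So 0.65 of the 1.24 seconds between the starting point and the goal has been removed, which is where the roughly 50% comes from.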
We had some glitches with OOPS reports on the new servers - I've
updated https://dev.launchpad.net/Foundations/QA/OopsToolsSetup to
cover what we learnt (basically, when a new server is initiated, click
on the load-prefixes button *and* add it to the report). William is
landing a config change to give us more room in the prefixes used for
reports too - we overlapped the edge server prefixes again. Sorry
about the redundant copies of the oops report that were sent out while
we figured this out!
We're down to 316 timeouts - 0.004% of requests - so, with an exception
for Question:+index (which could spike by 200 timeouts a day), the hard
timeout has been dropped to 10 seconds.
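For a sense of scale, the two figures above imply a total request count. This is only a rough sketch: the 0.004% is presumably rounded, the message does not state the measurement period, and the names below are made up for illustration. It works out to about 7.9 million requests.

```python
# Figures quoted in the message; names are hypothetical, for illustration only.
TIMEOUTS = 316
TIMEOUT_SHARE_PERCENT = 0.004  # percent of all requests (likely rounded)


def implied_requests(timeouts, share_percent):
    """Estimate total requests from a timeout count and its percentage share."""
    return timeouts / (share_percent / 100)


total = implied_requests(TIMEOUTS, TIMEOUT_SHARE_PERCENT)
print(f"roughly {total:,.0f} requests in the measured period")
```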
If it looks good - and it may - we'll drop to 9 seconds later this
week, meeting Francis's challenge for the Epic and leaving only 4
seconds to go to reach my long-term timeout target of 5 seconds.
Of course, we're going to need better and better code and schemas to
continue improving things, but we're in pretty good shape!
-Rob