
launchpad-dev team mailing list archive

Improving testsuite performance through use of a ramdisk (was Re: Test suite taking 4.5 hours on EC2)


I'm revisiting an old thread on the internal mailing list (posted before we went open source). I've recently purchased a new desktop development machine with an Intel i7-920 (quad core, 8 threads), 12G of ram and a 7200rpm 1.5T drive.

The discussion started out talking about why the test suite takes so long to run and what can be done about it. One of the ideas was to run PostgreSQL on a ramdisk. Our story continues...

On Jun 11, 2009, at 6:19 AM, Julian Edwards wrote:

A little related, something that Tim suggested ages ago was to run postgres in
a ramdisk.  The runes to do that are:

/etc/init.d/postgresql-8.3 stop
cp -aR /var/lib/postgresql /var/lib/postgresql-disk
mount -t tmpfs none /var/lib/postgresql
cp -a /var/lib/postgresql-disk/* /var/lib/postgresql
/etc/init.d/postgresql-8.3 start

The last time I used it was way back when the test suite took only an hour and
it cut my run time to 50 minutes.

I've just tried this on the above new machine, but I was similarly underwhelmed with the results. Well, the new machine itself is awesome because on my old machine I couldn't even complete the test suite. Now, a vanilla 'bin/test' on a straight up Karmic install runs in about 129 minutes (with 14 failures and 2 errors, but we'll ignore that for now ;).

So I put /var/lib/postgresql on a ramdisk and re-ran the test suite. This time it took only 124m. The machine was otherwise idle during both runs, though I don't think it would make a difference. The 4% savings doesn't seem worth the effort.
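For anyone who wants to compare their own timings the same way, here is a minimal sketch of the arithmetic behind that 4% figure. The helper name pct_saved is made up for illustration and is not part of the Launchpad tree.

```shell
# Percentage of wall-clock time saved going from a baseline run to a new run.
# pct_saved is a hypothetical helper name; arguments are run times in minutes.
pct_saved() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}
```

With the numbers above, `pct_saved 129 124` prints 3.9.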

A few observations. I really didn't collect much additional data, but I did watch System Monitor and top during parts of these runs. I never saw the ram usage get above maybe 2.4G during the ramdisk run, but it was only slightly less during the harddisk run. Occasionally one or more of the cpus would hit 100% simultaneously, but most stayed effectively idle (I guess that makes sense), and none were pegged at 100% for anything more than 10 seconds at a time. System Monitor in Karmic doesn't seem to log disk i/o so I can't say anything about that. Load average never went above 1 while I was watching it.

Anyway, it was interesting and a bit disappointing. I'll try another run perhaps over the weekend trying to capture some i/o metrics. If anybody has interesting ideas for tests and data capture, follow up to this thread.
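One possible way to capture the missing i/o numbers, as a rough sketch only: Linux exposes per-device counters in /proc/diskstats, and sampling them during a run would show how much the test suite actually reads and writes. The helper name disk_sectors, the device name sda, and the log file name are all assumptions for illustration.

```shell
# Print "sectors_read sectors_written" for one block device, reading
# /proc/diskstats-formatted text on stdin. In that format field 3 is the
# device name, field 6 is sectors read and field 10 is sectors written.
# disk_sectors is a hypothetical helper name.
disk_sectors() {
    awk -v dev="$1" '$3 == dev { print $6, $10 }'
}

# Hypothetical usage: sample once a minute while bin/test runs in another
# terminal (sda and diskstats.log are assumed names).
# while sleep 60; do
#     echo "$(date +%s) $(disk_sectors sda < /proc/diskstats)"
# done >> diskstats.log
```

Comparing the first and last samples of a harddisk run against a ramdisk run would give a rough idea of whether the disk is being hit hard enough for a ramdisk to matter.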

I do think that increasing the parallelism of our test suite would be a fruitful avenue to explore.

