zeitgeist team mailing list archive

[Bug 639737] Re: Improve insertion times

 

Everyone, be sure to read the details at
http://www.sqlite.org/pragma.html so you can form an informed opinion.
Here's what I gather from a quick read. There may be more pragmas we can
use; I haven't trawled the entire document:

 * journal_mode: I don't think OFF is a good choice since we do use
transactions, although not rollback. I am thinking DELETE, TRUNCATE, or
PERSIST, whichever performs best on our target platform.

 * locking_mode: I think we can safely go with EXCLUSIVE since ZG is
really the only process that should ever access the DB. It comes at the
cost of having to kill zg-daemon if you want to manually inspect the
DB, but I think we can live with that.

 * synchronous: I think it's OK to use NORMAL, especially if we write a
backup tool, which I think we need to do anyway (streaming events as JSON
into a gzip stream should be fast and light).
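
As a rough illustration of the three settings above, here is a minimal
sketch using Python's standard sqlite3 module. The helper name
open_tuned and the choice of TRUNCATE are assumptions made for the
example, not what Zeitgeist actually does; the journal mode should be
whichever one benchmarks best on the target platform.

```python
import sqlite3

def open_tuned(path, journal_mode="TRUNCATE"):
    """Hypothetical helper: open a database with the pragmas discussed above."""
    if journal_mode.upper() not in ("DELETE", "TRUNCATE", "PERSIST"):
        raise ValueError("unexpected journal mode: %r" % journal_mode)
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=%s" % journal_mode).fetchone()[0]
    # EXCLUSIVE: the connection keeps its file lock once it has taken it,
    # so other processes cannot open the database while the daemon runs.
    conn.execute("PRAGMA locking_mode=EXCLUSIVE")
    # NORMAL: sync less often than FULL, trading some durability for speed.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn, mode
```

The returned mode is worth checking, because SQLite reports the mode it
ended up with rather than raising an error when it cannot switch.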

Generally: We should do some *serious* dogfooding and see how likely we
are to mess up our DBs before we release this into the wild... kill -9,
hard poweroffs, and what have you!
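
For that kind of testing, one possible way to check a database after
each kill or power cut is SQLite's own integrity_check pragma. This is
only a sketch; database_is_intact is a made-up helper name, and passing
the check does not prove that no events were lost.

```python
import sqlite3

def database_is_intact(path):
    """Hypothetical helper: True if PRAGMA integrity_check reports 'ok'."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # A badly damaged file may not be readable as a database at all.
        return False
    finally:
        conn.close()
    # A healthy database yields exactly one row containing the text 'ok'.
    return rows == [("ok",)]
```

Running this after every forced shutdown, for each combination of
pragmas, would give a rough count of how often each setting leaves a
damaged file behind.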

-- 
Improve insertion times
https://bugs.launchpad.net/bugs/639737
You received this bug notification because you are a member of Zeitgeist
Framework Team, which is subscribed to Zeitgeist Framework.

Status in Zeitgeist Framework: Triaged

Bug description:
We insert pretty slowly, with an average of 0.15 seconds per event on my Core i5 2.5 GHz beast.

RainCT suggested some possible optimizations:
1) PRAGMA synchronous=OFF
2) PRAGMA journal_mode=OFF

The Chat:
------------------------------------------------------------
<kamstrup> I think we are - but I can't recall... in case of failed transactions - but I don't even know if we use transactions these days...
<seif> <RainCT> try synchronous=OFF
<seif> <RainCT> but it can corrupt your database if your phone dies while ZG is inserting
<seif> <RainCT> and journal_mode=MEMORY
<seif> <RainCT> or OFF since we don't use rollback anyway
<seif> so maybe journal_mode = OFF is a good start?
<kamstrup> okay, he's probably right...
<kamstrup> 'grep -Ri rollback _zeitgeist/' is your friend :-)
<kamstrup> apparently we are not using rollback...

More info can be found here: http://www.sqlite.org/pragma.html
------------------------------------------------------------

In order to get a better picture of what's going on, can you please try to get some more information, like:
1) How many events are in your database?
2) What's the insertion time for one event into an empty db?
3) Out of this 0.15 secs, how much time is spent in our Python code, and how much in the actual SQL action?
4) How much faster is adding 10 events at once compared to adding them one at a time?
5) You think 0.15 secs is slow for inserting one event; what time do you expect, and why?
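
Questions 2 and 4 above could be measured at the SQLite level with
something like the sketch below. The event table and its payload column
are placeholders invented for the example, not the real Zeitgeist
schema, so the numbers only show the cost of committing per event
versus once per batch.

```python
import sqlite3
import time

INSERT_SQL = "INSERT INTO event (payload) VALUES (?)"  # placeholder schema

def time_inserts(conn, events, batched):
    """Return the seconds spent inserting `events` into the event table."""
    start = time.perf_counter()
    if batched:
        # One transaction and one commit for the whole batch.
        conn.executemany(INSERT_SQL, [(e,) for e in events])
        conn.commit()
    else:
        # One transaction and one commit per event.
        for e in events:
            conn.execute(INSERT_SQL, (e,))
            conn.commit()
    return time.perf_counter() - start

def compare(path, n=10):
    """Time n single inserts, then n batched inserts, into the same database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.commit()
    events = ["event %d" % i for i in range(n)]
    single = time_inserts(conn, events, batched=False)
    batch = time_inserts(conn, events, batched=True)
    count = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
    conn.close()
    return single, batch, count
```

Calling compare() on a fresh on-disk file gives the empty-database
figure. Question 3 would additionally need timing around the Python
code that builds each event before it reaches SQLite.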




