zeitgeist team mailing list archive
Message #02336
[Bug 672965] Re: find_event() queries with timerange other than TimeRange.always() are slow
** Also affects: zeitgeist/0.7
Importance: Low
Assignee: Markus Korn (thekorn)
Status: In Progress
--
find_event() queries with timerange other than TimeRange.always() are slow
https://bugs.launchpad.net/bugs/672965
You received this bug notification because you are a member of Zeitgeist
Framework Team, which is subscribed to Zeitgeist Framework.
Status in Zeitgeist Framework: In Progress
Status in Zeitgeist Framework 0.7 series: In Progress
Bug description:
When running the attached script, you can see that giving a timerange that does not start at 0 and end at maxint makes the queries take roughly three times as long (about 0.26s instead of 0.08s).
python sample_timerange_query.py
DEBUG:zeitgeist.sql:Got query:
QUERY:
SELECT DISTINCT id FROM event_view GROUP BY actor ORDER BY COUNT(actor) DESC, timestamp DESC LIMIT 6 ([])
PLAN:
[0, 0, u'TABLE event WITH INDEX event_actor ORDER BY']
DEBUG:zeitgeist.engine:Found 6 event IDs in 0.080176s
---> MostPopularActor: get len(ids)=6 using .find_eventids() in 0.080375s
DEBUG:zeitgeist.sql:Got query:
QUERY:
SELECT DISTINCT id FROM event_view WHERE (timestamp >= ? AND timestamp <= ?) GROUP BY actor ORDER BY COUNT(actor) DESC, timestamp DESC LIMIT 6 ([u'1', u'50000'])
PLAN:
[0, 0, u'TABLE event WITH INDEX event_timestamp']
DEBUG:zeitgeist.engine:Found 6 event IDs in 0.260648s
---> MostPopularActor: get len(ids)=6 using .find_eventids() in 0.260838s
The reason is simply that the second query uses the 'wrong' index: as the two PLAN lines show, it is run via event_timestamp instead of event_actor.
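
To look at the index choice in isolation, here is a rough sketch using Python's sqlite3 module. It is not the attached script and not the Zeitgeist schema: the single "event" table, the sample rows, and the helper names build_sample_db and query_plan are assumptions made up for illustration (the real event_view is a view). It prints the plan for the timerange query as written, and for a variant that prefixes the column with a unary "+", which SQLite documents as a way to keep a WHERE term from being used with an index. Whether that is a suitable fix for this bug is not claimed here, and the plan text differs between SQLite versions.

```python
import sqlite3


def build_sample_db():
    """Build an in-memory table loosely modelled on the log (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event (id INTEGER PRIMARY KEY, actor TEXT, timestamp INTEGER)"
    )
    conn.execute("CREATE INDEX event_actor ON event(actor)")
    conn.execute("CREATE INDEX event_timestamp ON event(timestamp)")
    rows = []
    ts = 0
    # actor0 gets 10 events, actor1 gets 20, ... so every count is distinct.
    for n in range(10):
        for _ in range((n + 1) * 10):
            ts += 1
            rows.append(("actor%d" % n, ts))
    conn.executemany("INSERT INTO event (actor, timestamp) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    # The detail text is the last column in both old and new SQLite versions.
    return [row[-1] for row in cursor]


# Timerange query in the same shape as the one in the log, on the sample table.
PLAIN = (
    "SELECT actor FROM event WHERE timestamp >= ? AND timestamp <= ? "
    "GROUP BY actor ORDER BY COUNT(actor) DESC LIMIT 6"
)
# Same query, but the unary "+" disqualifies the timestamp terms from index use.
HINTED = (
    "SELECT actor FROM event WHERE +timestamp >= ? AND +timestamp <= ? "
    "GROUP BY actor ORDER BY COUNT(actor) DESC LIMIT 6"
)

if __name__ == "__main__":
    conn = build_sample_db()
    for label, sql in (("plain", PLAIN), ("hinted", HINTED)):
        print(label, query_plan(conn, sql, (1, 50000)))
```

Note that the parameters are passed as integers here. The log shows them as strings (u'1', u'50000'), and the unary "+" also removes the column's type affinity from the comparison, so string parameters could compare differently in the hinted variant.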