← Back to team overview

zim-wiki team mailing list archive

Re: caching and progressively slower page loads

 

Thank you very much Nathaniel for tracking the culprit, someone else
reported a similar issue on the mailing list a few months ago but could not
find the root.

I think you should open an issue on the issue tracker, else this will get
lost in the archives of the mailing list.
Best regards,
Stephen L.

Le Jeu. 1 Fév. 2018 22:56, Nathaniel Beaver <nathanielmbeaver@xxxxxxxxx> a
écrit :

> When Zim starts up, switching pages is nice and snappy.
>
> https://www.youtube.com/watch?v=tl4_t6dW1l0
>
> After a while, loading even the smallest page has a noticeable delay.
>
> https://www.youtube.com/watch?v=euyGgEwLaIk
>
> Note the time required to display the new page in addition to the time
> before the cursor becomes active.
>
> This is running a recent Github version (commit
> e46a4b40a90d6903b6fa860bd72a4464dab7d704) with all add-ons disabled,
> including built-in add-ons. However, it was originally reproduced on
> older versions, so it is not a new effect.
>
> A few backtraces with GDB during a page switch shows most of the time is
> spent in this query from the "get_mytreeiter()" function in
> zim/notebook/index/pages.py:
>
>      for i, row in enumerate(self.db.execute('''
>          SELECT * FROM pages WHERE parent=?
>          ORDER BY sortkey, name LIMIT 20 OFFSET ?
>          ''',
>          (parent_id, offset)
>      )):
>
>
> https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/e46a4b40a90d6903b6fa860bd72a4464dab7d704/zim/notebook/index/pages.py#L804
>
> This is being called from the main loop.
>
>      Traceback (most recent call first):
>        File
>
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/notebook/index/pages.py",
> line 805, in get_mytreeiter
>          (parent_id, offset)
>        File
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py",
> line 224, in on_get_iter
>          return self.get_mytreeiter(treepath)
>        File
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py",
> line 238, in on_iter_next
>          return self.on_get_iter(treepath)
>        File
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
> line 703, in _run_main_loop
>          gtk.main()
>        File
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
> line 656, in _run_cmd
>          self._run_main_loop(cmd)
>        File
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
> line 626, in run
>          self._run_cmd(cmd, args) # test seam
>        File
> "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
> line 845, in main
>
> The query itself is fast, so it seems strange that so much time is spent
> here. However, running Zim with "cprofile" for about a day shows there
> are roughly 1.3e6 calls to the sqlite3 execute method, so even though
> the time per call is less than a millisecond, the cumulative time is
> considerable.
>
>      Wed Jan 17 10:33:24 2018    zim.notebook.index.pages_1516168791.prof
>
>               43403639 function calls (42725089 primitive calls) in
> 86784.644 seconds
>
>         Ordered by: internal time
>         List reduced from 2168 to 4 due to restriction <4>
>
>         ncalls  tottime  percall  cumtime  percall
> filename:lineno(function)
>              1 85453.595 85453.595 86765.186 86765.186 {gtk._gtk.main}
>        1347825 1154.685    0.001 1154.686    0.001 {method 'execute' of
> 'sqlite3.Connection' objects}
>        5138231   50.881    0.000 1207.454    0.000
>
> /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py:231(on_iter_next)
>             11   43.953    3.996   48.992    4.454
>
> /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/widgets.py:3152(run)
>
> By setting a breakpoint in GDB:
>
>      break _pysqlite_query_execute
>
> and then using "ignore" and "info break", we can count how many queries
> are run. In my notebook of about 1,400 pages, there are queries
> amounting to roughly twice that on startup. When the pages begin loading
> slowly, it runs correspondingly more queries.
>
> Since "get_mytreeiter()" caches the page on each call, is it possible
> that the cache is being duplicated or otherwise not working properly,
> resulting in an excessive number of queries?
>
> If the cache is working as intended, could it be changed so that it does
> not become progressively slower?
>
> Should I file a bug report on Github, or should I wait until there is
> something more concrete to submit?
>
>
> Nathaniel Beaver
>
> P.S. Upon request, I am happy to provide full data and details on
> collection methods.
>
> Please let me know if there is something I overlooked or a flaw in the
> methodology.
>
> This is a follow-up to my message from December 29, 2017 with subject
> "profiling zim for performance?"
>
> _______________________________________________
> Mailing list: https://launchpad.net/~zim-wiki
> Post to     : zim-wiki@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~zim-wiki
> More help   : https://help.launchpad.net/ListHelp
>

Follow ups

References