← Back to team overview

zim-wiki team mailing list archive

Re: caching and progressively slower page loads

 

Sounds good, I just opened an issue here:

https://github.com/jaap-karssenberg/zim-desktop-wiki/issues/311

Best,

Nathaniel

On 02/02/2018 06:25 AM, Stephen LARROQUE wrote:
Thank you very much Nathaniel for tracking the culprit, someone else reported a similar issue on the mailing list a few months ago but could not find the root.

I think you should open an issue on the issue tracker, else this will get lost in the archives of the mailing list.
Best regards,
Stephen L.


Le Jeu. 1 Fév. 2018 22:56, Nathaniel Beaver <nathanielmbeaver@xxxxxxxxx <mailto:nathanielmbeaver@xxxxxxxxx>> a écrit :

    When Zim starts up, switching pages is nice and snappy.

    https://www.youtube.com/watch?v=tl4_t6dW1l0

    After a while, loading even the smallest page has a noticeable delay.

    https://www.youtube.com/watch?v=euyGgEwLaIk

    Note the time required to display the new page in addition to the time
    before the cursor becomes active.

    This is running a recent Github version (commit
    e46a4b40a90d6903b6fa860bd72a4464dab7d704) with all add-ons disabled,
    including built-in add-ons. However, it was originally reproduced on
    older versions, so it is not a new effect.

    A few backtraces with GDB during a page switch shows most of the time is
    spent in this query from the "get_mytreeiter()" function in
    zim/notebook/index/pages.py:

          for i, row in enumerate(self.db.execute('''
              SELECT * FROM pages WHERE parent=?
              ORDER BY sortkey, name LIMIT 20 OFFSET ?
              ''',
              (parent_id, offset)
          )):

    https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/e46a4b40a90d6903b6fa860bd72a4464dab7d704/zim/notebook/index/pages.py#L804

    This is being called from the main loop.

          Traceback (most recent call first):
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/notebook/index/pages.py",
    line 805, in get_mytreeiter
              (parent_id, offset)
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py",
    line 224, in on_get_iter
              return self.get_mytreeiter(treepath)
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py",
    line 238, in on_iter_next
              return self.on_get_iter(treepath)
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
    line 703, in _run_main_loop
              gtk.main()
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
    line 656, in _run_cmd
              self._run_main_loop(cmd)
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
    line 626, in run
              self._run_cmd(cmd, args) # test seam
            File
    "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py",
    line 845, in main

    The query itself is fast, so it seems strange that so much time is spent
    here. However, running Zim with "cprofile" for about a day shows there
    are roughly 1.3e6 calls to the sqlite3 execute method, so even though
    the time per call is less than a millisecond, the cumulative time is
    considerable.

     Wed Jan 17 10:33:24 2018 zim.notebook.index.pages_1516168791.prof

                   43403639 function calls (42725089 primitive calls) in
    86784.644 seconds

             Ordered by: internal time
             List reduced from 2168 to 4 due to restriction <4>

             ncalls  tottime  percall  cumtime  percall
    filename:lineno(function)
                  1 85453.595 85453.595 86765.186 86765.186 {gtk._gtk.main}
            1347825 1154.685    0.001 1154.686    0.001 {method 'execute' of
    'sqlite3.Connection' objects}
            5138231   50.881    0.000 1207.454    0.000
    /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py:231(on_iter_next)
                 11   43.953    3.996   48.992    4.454
    /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/widgets.py:3152(run)

    By setting a breakpoint in GDB:

          break _pysqlite_query_execute

    and then using "ignore" and "info break", we can count how many queries
    are run. In my notebook of about 1,400 pages, there are queries
    amounting to roughly twice that on startup. When the pages begin loading
    slowly, it runs correspondingly more queries.

    Since "get_mytreeiter()" caches the page on each call, is it possible
    that the cache is being duplicated or otherwise not working properly,
    resulting in an excessive number of queries?

    If the cache is working as intended, could it be changed so that it does
    not become progressively slower?

    Should I file a bug report on Github, or should I wait until there is
    something more concrete to submit?


    Nathaniel Beaver

    P.S. Upon request, I am happy to provide full data and details on
    collection methods.

    Please let me know if there is something I overlooked or a flaw in the
    methodology.

    This is a follow-up to my message from December 29, 2017 with subject
    "profiling zim for performance?"

    _______________________________________________
    Mailing list: https://launchpad.net/~zim-wiki
    <https://launchpad.net/%7Ezim-wiki>
    Post to     : zim-wiki@xxxxxxxxxxxxxxxxxxx
    <mailto:zim-wiki@xxxxxxxxxxxxxxxxxxx>
    Unsubscribe : https://launchpad.net/~zim-wiki
    <https://launchpad.net/%7Ezim-wiki>
    More help   : https://help.launchpad.net/ListHelp



_______________________________________________
Mailing list: https://launchpad.net/~zim-wiki
Post to     : zim-wiki@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~zim-wiki
More help   : https://help.launchpad.net/ListHelp



References