zim-wiki team mailing list archive

Re: Making indexing optional ?

 

I think I submitted my request circa 2014 under the previous bug tracking system - was it hosted by Ubuntu-one? - but yes, the idea is similar.

I just downloaded the development version, extracted it into a temporary folder, and ran it via the ./zim.py command.

Indexing took about 15 minutes. Below is a snapshot of what top reported while it was running.

top - 12:45:28 up 3 days, 16:12,  1 user,  load average: 1.87, 1.92, 2.48
Tasks: 356 total,   3 running, 353 sleeping,   0 stopped,   0 zombie
%Cpu(s): 13.0 us,  5.4 sy,  0.0 ni, 81.6 id,  0.0 wa,  0.0 hi, 0.0 si,  0.0 st
MiB Mem :  31658.1 total,    320.9 free,  19312.0 used,  12025.3 buff/cache
MiB Swap:    976.0 total,      0.0 free,    976.0 used.  10085.6 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 159310 mario_b+  20   0  771220  80184  43420 R 100.0   0.2  14:42.13 zim.py

Please let me know if there is more I can do.

Thank you,
mario

On 4/23/21 11:25 AM, Jaap Karssenberg wrote:
Yes, that explains it: those large files will have a big impact on the indexer.

You are referring to this issue: "Make indexer ignore text files that are not zim pages" (Issue #907, zim-desktop-wiki/zim-desktop-wiki) <https://github.com/zim-desktop-wiki/zim-desktop-wiki/issues/907>, which is fixed in the development branch and will be in the next release.

With that fix the indexer will read the first line of each file to decide whether it is a zim file or not, and if not it will not try to access the contents.
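
The first-line check described above could look roughly like the sketch below. This is not the actual zim indexer code: the function name looks_like_zim_page and the 256-character read limit are hypothetical, and the header prefix "Content-Type: text/x-zim-wiki" is an assumption about how a zim page file begins.

```python
import os
import tempfile

# Assumption: a zim page starts with this header line. The real indexer
# may use a different or stricter test.
ZIM_HEADER_PREFIX = "Content-Type: text/x-zim-wiki"


def looks_like_zim_page(path):
    """Return True if the first line of the file looks like a zim page header.

    Only the start of the file is read, so a very large attachment costs
    about the same as a small one.
    """
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as fh:
            # Read at most 256 characters of the first line, never the whole file.
            first_line = fh.readline(256)
    except OSError:
        # Unreadable files are treated as "not a zim page".
        return False
    return first_line.startswith(ZIM_HEADER_PREFIX)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        page = os.path.join(folder, "Home.txt")
        attachment = os.path.join(folder, "big_log.txt")
        with open(page, "w", encoding="utf-8") as fh:
            fh.write("Content-Type: text/x-zim-wiki\nWiki-Format: zim 0.4\n\nHello\n")
        with open(attachment, "w", encoding="utf-8") as fh:
            fh.write("x" * 1_000_000)  # large plain text file without a header
        for path in (page, attachment):
            print(os.path.basename(path), looks_like_zim_page(path))
```

With a check of this kind, a large ".txt" attachment is skipped after reading only its first few bytes, instead of being parsed in full.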

It would be great if you had a chance to test the development branch and see whether it works in practice for your case!

-- Jaap


On Thu, Apr 22, 2021 at 7:32 PM Mario Bezzi <subscriptions.mario.bezzi@xxxxxxxxx <mailto:subscriptions.mario.bezzi@xxxxxxxxx>> wrote:

    The folder contains 3118 ".txt" files, for a total of 2GB of data.
    Some large txt files are attachments. A long time ago I submitted
    a request to avoid indexing these. Not sure it has been fulfilled
    though.

    Thank you,
    mario

    On 4/8/21 7:32 PM, Jaap Karssenberg wrote:
    Can you indicate how big your notebook folder is? Either this is an
    extreme case, or some bug is making it take much longer than needed.

    On Thu, Apr 8, 2021 at 3:59 PM Mario Bezzi
    <subscriptions.mario.bezzi@xxxxxxxxx
    <mailto:subscriptions.mario.bezzi@xxxxxxxxx>> wrote:

        Thanks Jaap, I was not aware of this.

        To give you an idea, I just restarted Zim, and indexing kept
        a processor 100% busy for 13 minutes before it finished. It
        would be nice if this could be avoided.

        Thank you,
        mario

        On 4/8/21 10:06 AM, Jaap Karssenberg wrote:
        The index is not used for searching alone; it is also
        needed, e.g., to present the page tree in the side pane and to
        track links.

        On Thu, Apr 8, 2021 at 9:34 AM Mario Bezzi
        <subscriptions.mario.bezzi@xxxxxxxxx
        <mailto:subscriptions.mario.bezzi@xxxxxxxxx>> wrote:

            Hello,

            I may be the only one, but with my quite large notebooks
            I find the search function impractical, and for this
            reason I never use it. Still, when it starts, Zim goes
            crazy for a long time indexing, and I came to the
            conclusion that this is normal.

            If this is the case, I would like to file a request to
            add the ability to make indexing optional.

            Thank you,
            mario





