
zim-wiki team mailing list archive

Re: Making indexing optional ?

 

Hi Mario,

That is not the result I hoped for :(   I will need to generate some random
large text files to test & debug on my end.

Regards,

Jaap


On Fri, Apr 23, 2021 at 12:59 PM Mario Bezzi <
subscriptions.mario.bezzi@xxxxxxxxx> wrote:

> I think I submitted my request circa 2014 under the previous bug tracking
> system - was it hosted by Ubuntu-one? - but yes, the idea is similar.
>
> I just downloaded the development version, extracted it into a temporary
> folder, and ran it via the ./zim.py command.
>
> Indexing took some 15 minutes. Below is a snapshot of what top reported
> about the execution.
>
> top - 12:45:28 up 3 days, 16:12,  1 user,  load average: 1.87, 1.92, 2.48
> Tasks: 356 total,   3 running, 353 sleeping,   0 stopped,   0 zombie
> %Cpu(s): 13.0 us,  5.4 sy,  0.0 ni, 81.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> MiB Mem :  31658.1 total,    320.9 free,  19312.0 used,  12025.3 buff/cache
> MiB Swap:    976.0 total,      0.0 free,    976.0 used.  10085.6 avail Mem
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>  159310 mario_b+  20   0  771220  80184  43420 R 100.0   0.2  *14:42.13 zim.py*
>
> Please let me know if there is more I can do.
>
> Thank you,
> mario
>
> On 4/23/21 11:25 AM, Jaap Karssenberg wrote:
>
> Yes, that explains it: those large files will have a big impact on the
> indexer.
>
> You are referring to the issue "Make indexer ignore text files that are
> not zim pages" (Issue #907,
> <https://github.com/zim-desktop-wiki/zim-desktop-wiki/issues/907>), which
> is fixed in the development branch and will be in the next release.
>
> With that fix the indexer will read the first line of each file to decide
> whether it is a zim file or not, and if not it will not try to access the
> contents.
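
The first-line check described above can be sketched roughly as follows. This is only an illustration, not the code from the development branch: the header string `Content-Type: text/x-zim-wiki` is an assumption about what a zim page starts with, and the function names `looks_like_zim_page` and `find_candidate_pages` are hypothetical.

```python
from pathlib import Path

# Assumption: zim page files begin with this header line. The real indexer
# may test for something different.
ZIM_HEADER_PREFIX = "Content-Type: text/x-zim-wiki"


def looks_like_zim_page(path, prefix=ZIM_HEADER_PREFIX):
    """Return True if the first line of `path` starts with the assumed header."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as fh:
            # Read at most 256 characters so that a huge file without
            # newlines is never loaded into memory.
            first_line = fh.readline(256)
    except OSError:
        # Unreadable or missing files are treated as "not a page".
        return False
    return first_line.startswith(prefix)


def find_candidate_pages(notebook_dir):
    """Yield the .txt files under notebook_dir that pass the first-line check."""
    for path in sorted(Path(notebook_dir).rglob("*.txt")):
        if looks_like_zim_page(path):
            yield path
```

With a check like this, a large `.txt` attachment costs one short read instead of a full parse, which is why it should help with a folder holding 2GB of text.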
>
> It would be great if you get a chance to test the development branch and
> see whether it works in practice for your case!
>
> -- Jaap
>
>
> On Thu, Apr 22, 2021 at 7:32 PM Mario Bezzi <
> subscriptions.mario.bezzi@xxxxxxxxx> wrote:
>
>> The folder contains 3118 ".txt" files, for a total of 2GB of data. Some
>> large txt files are attachments. A long time ago I submitted a request to
>> avoid indexing these. Not sure it has been fulfilled though.
>>
>> Thank you,
>> mario
>>
>> On 4/8/21 7:32 PM, Jaap Karssenberg wrote:
>>
>> Can you indicate how big your notebook folder is? Either it is an extreme
>> case, or some bug is making it take much longer than needed.
>>
>> On Thu, Apr 8, 2021 at 3:59 PM Mario Bezzi <
>> subscriptions.mario.bezzi@xxxxxxxxx> wrote:
>>
>>> Thanks Jaap, I was not aware of this.
>>>
>>> To give you an idea, I just restarted Zim, and indexing kept a processor
>>> 100% busy for 13 minutes before it finished. It would be nice if this
>>> could be avoided.
>>>
>>> Thank you,
>>> mario
>>>
>>> On 4/8/21 10:06 AM, Jaap Karssenberg wrote:
>>>
>>> The indexing is not used for searching alone, it is also needed to e.g.
>>> present the page tree in the side pane and to track links
>>>
>>> On Thu, Apr 8, 2021 at 9:34 AM Mario Bezzi <
>>> subscriptions.mario.bezzi@xxxxxxxxx> wrote:
>>>
>>>> Hello,
>>>>
>>>> I may be the only one, but with my quite large notebooks I do find the
>>>> search function impractical, and for this reason I never use it. Still,
>>>> when it starts, Zim goes crazy for a long time indexing, and I came to
>>>> the conclusion that this is normal.
>>>>
>>>> If this is the case, I would like to file a request to add the
>>>> ability to make indexing optional.
>>>>
>>>> Thank you,
>>>> mario
