linuxdcpp-team team mailing list archive
Message #09284
[Bug 2110291] Re: One time small updates in the share may not trigger a Bloom filter update request which makes such updated files unsearchable by TTH for other hub users
The bloom filter does its thing pretty well if you make sure it is updated properly. This is what it is all about.
The old DCBase forum is offline atm (wtf) and archive.org does not seem to archive it properly (wtf2), so all I can see is that there's a topic there on bloom having 'latency', presumably in the availability of hashed files. But this is exactly what is addressed here.
If you understand why it's not updated properly then the fix is rather
trivial. There are multiple ways to resolve this on the client side and I am
already testing one. It works afics, I'd just like to see at least one
more confirmation during a proper testing session with someone else
before committing.
The already committed update for the hub side is rather just a partial
remedy against clients that don't want to fix this issue, but a proper
client side fix works 100% even on hubs that do not include the hubside
fix at all.
With a client side fix applied, there's only up to 1 minute of latency
(worst case) before hashed files become searchable by TTH, and for now this
applies after any type of share change.
Generally, the solution is that you should send an SF value to the hub right after any share refresh and before any new or changed files have started hashing. With the SF value sent at that point, the hub's bloom requester gets a reference that it can later compare subsequent SFs to. I mean the SFs that are sent later, as hashing progresses, with the minutely INF sending.
This approach guarantees that any type of updates in the share will be signaled to the bloom requester.
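The ordering described above can be sketched roughly as follows. This is only an illustrative model, not the actual DC++ or ADCH++ code: the function names `sfSequenceWithFix` and `hubSeesChange` are hypothetical, and the sketch assumes that changed files drop out of the SF count until they are rehashed and that the hub's bloom requester reacts to any SF that differs from the last one it saw.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: the SF values a client with the fix would send after a
// refresh that found `changedFiles` files needing a rehash.
std::vector<std::int64_t> sfSequenceWithFix(std::int64_t filesBefore,
                                            std::int64_t changedFiles) {
    assert(changedFiles >= 0 && changedFiles <= filesBefore);
    std::vector<std::int64_t> sent;

    // Assumption: changed files are not counted until they are rehashed.
    std::int64_t sf = filesBefore - changedFiles;
    sent.push_back(sf);  // reference SF, sent BEFORE the hasher starts

    sf += changedFiles;  // hashing finishes, however quickly
    sent.push_back(sf);  // SF sent with a later (minutely) INF
    return sent;
}

// Assumption: the hub flags a filter update whenever an SF differs from the
// previous SF it received from this client.
bool hubSeesChange(std::int64_t lastKnownSF,
                   const std::vector<std::int64_t>& sent) {
    for (std::int64_t sf : sent) {
        if (sf != lastKnownSF) return true;
        lastKnownSF = sf;
    }
    return false;
}
```

In this model, because the reference SF goes out before hashing can finish, a one-file update on a 100-file share produces the sequence 99, 100, so the hub sees a change no matter how fast the hasher is.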
** Changed in: adchpp
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/2110291
Title:
One time small updates in the share may not trigger a Bloom filter
update request which makes such updated files unsearchable by TTH for
other hub users
Status in ADCH++:
Fix Released
Status in AirDC++:
New
Status in DC++:
Confirmed
Bug description:
There is a possible scenario where other users logged into the same ADCH++ hub with Bloom filter support
may not receive search results (by TTH) for one or more updated files after manually refreshing the share in DC++, until the user updates the share once more or reconnects to the hub.
The problem is consistently reproducible after one or a few files get updated and the share is refreshed,
if the overall size of the changed files is relatively small.
To reproduce this, you need to update already shared file(s) with different content,
or perform a similar number of file removals and additions to the share, then manually refresh the share.
The cause of the issue is that sending INFs — just like any other commands — is not instantaneous.
The function that compiles the INF command is placed into the async task queue of all connected hubs' sockets, to be run when feasible.
If, for example, you update one small file and refresh the share, normally that would result in sending SF = lastSF - 1 with the infoupdate() right after the refresh.
Then, the hashing thread's TTHDone event handler updates the total number of files after the file with the updated content has been hashed.
This change is then sent with the next scheduled infoupdate() (typically minutely).
But... if the small updated file is already hashed by the time the hub's respective infoupdate() is called,
then SF has already been incremented by 1 again and is back at lastSF. Bingo: the value is correct, but since it is unchanged from what the hub last saw, the Bloom plugin won't be signaled to request a filter update.
OTOH if the hasher's queue is empty before the share refresh, it will indeed start working almost instantaneously, so if the total size of the updated file(s) is small enough, it often wins the race, it seems.
The largest total updated file size to reproduce this depends on your hardware.
It is higher with faster CPUs and storage, and also depends on how busy the hub/socket is at the time.
On a system with a 100 MB/s HDD read speed and an i5-6600 CPU, the threshold is about 15 MiB.
Obviously, this could easily be 10 times larger on modern hardware.
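The race described above can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not code from DC++ or the ADCH++ Bloom plugin: `BloomRequester`, `Client` and `simulateRace` are hypothetical names, the share is assumed to hold 100 files, and the hub is assumed to request a filter update only when the SF in an INF differs from the previous one.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical hub-side model: request a filter update only when SF changes.
struct BloomRequester {
    std::optional<std::int64_t> lastSF;
    int updatesRequested = 0;

    void onInf(std::int64_t sf) {
        if (lastSF && *lastSF != sf) ++updatesRequested;
        lastSF = sf;
    }
};

// Hypothetical client-side model: sharedFiles counts hashed files only.
struct Client {
    std::int64_t sharedFiles;

    void refreshFindsChangedFile() { assert(sharedFiles > 0); --sharedFiles; }
    void hashingDone() { ++sharedFiles; }
    void sendInf(BloomRequester& hub) const { hub.onInf(sharedFiles); }
};

// Returns how many filter updates the hub requested after one small file
// was changed and the share was refreshed.
int simulateRace(bool hasherWinsRace) {
    BloomRequester hub;
    Client client{100};
    client.sendInf(hub);               // initial INF, SF = 100

    client.refreshFindsChangedFile();  // SF = 99, INF is only queued
    if (hasherWinsRace) {
        client.hashingDone();          // SF back to 100 before the INF runs
        client.sendInf(hub);           // hub sees 100 again: no change
    } else {
        client.sendInf(hub);           // hub sees 99: change detected
        client.hashingDone();
        client.sendInf(hub);           // hub sees 100: change detected
    }
    return hub.updatesRequested;
}
```

When the hasher wins the race, the hub in this model never observes the intermediate value, so `simulateRace(true)` reports no update requests, while `simulateRace(false)` reports two.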
To manage notifications about this bug go to:
https://bugs.launchpad.net/adchpp/+bug/2110291/+subscriptions