
linuxdcpp-team team mailing list archive

[Bug 2009492] Re: Certain type of changes in the share do not trigger a Bloom filter update which makes such changed files temporarily unsearchable

 

** Description changed:

- <eMTee> So not getting a result for a changed file (same path/different content) in the share after re-hashing is because the hub requesting a new bloom filter only if the number of shared files are changed in the INF coming from the client. In common examples like when you share an updated binary or change a text file and reindex this would not happen at all.
- <eMTee> Bloom request is only triggered by an SF and not SS in the INF. See https://sourceforge.net/p/adchpp/code/ci/default/tree/plugins/Bloom/src/BloomManager.cpp#L98 
- <eMTee> And with adding SS to the check there we're still not completely out of the water, since if the share change is a same path, same size, different content change then it still sucks. Minor editing of a text file or a change of fixed-size metadata, e.g. an MP3 ID3v1 tag, results in exactly this scenario.
- <eMTee> You can change even all of your share in this special way and if you don't change the sizes and number of files then you won't provide hits at all until you do some other kind of share change or reconnect the hub.
+ Update: rephrase and clarify the initial report.
  
- [2023-02-28 09:03] <eMTee> So Blom request is based on an inadequate signal that's not enough for all cases.
- [2023-02-28 09:07] <eMTee> SS also should be hooked on at the very least but a perfect solution would be something that is signalling the share change in general or the number of re-hashes in the current client session. Or the last rehash timestamp. These signals would be adequate for requesting a new Bloom filter in all cases when it is needed to.
- [2023-02-28 09:11] <eMTee> Of course the client could force to send an INF SF after all rehashes in case it supports Blom, but it's pretty ugly to implement in DC++ and, more importantly, it is against the protocol since you send INFs only if some values change and in these special cases we investigate this would mean sending multiple INF SF's with the same value.
- [2023-02-28 09:13] <eMTee> "Each time this is received, it means that the fields specified have been added or updated." in https://adc.sourceforge.io/ADC.html#_inf
- [2023-02-28 09:17] <eMTee> If an extension is allowed to specify new INF fields then a last rehash timestamp field would probably be the cleanest solution for this both protocol and implementation wise...
+ --------------
+ 
+ TTH searches return no results for changed files (same path, different
+ content) in the share after re-hashing, when an ADC client is connected
+ to an ADC hub that uses Bloom filters for TTH searches. Any number of
+ files can be affected.
+ 
+ The cause is that the hub requests a new Bloom filter only when the number of shared files changes, i.e. when the INF coming from the client carries an SF field. In common cases, such as sharing an updated binary or editing a text file and then re-indexing, the file count stays the same, so no request is made. Changing fixed-size metadata, e.g. an MP3 ID3v1 tag, results in exactly this scenario.
+ So the filter request is based on an inadequate signal that does not cover all common use cases.
+ 
+ A solution would be something that signals a share change in general,
+ or that provides the number of re-hashes in the current client session,
+ or maybe the last rehash timestamp. Any of these signals would be
+ adequate for requesting a new Bloom filter in all cases where files
+ changed in a client's share.
+ 
+ Of course a BLOM-supporting client could force sending an INF SF after
+ every re-hash that changes content in the share, but that is against
+ the protocol, since an INF may only be sent when one of the field
+ values changes, and in these special cases it would mean sending
+ multiple INF SFs with the same SF value (see "Each time this is
+ received, it means that the fields specified have been added or
+ updated." in https://adc.sourceforge.io/ADC.html#_inf ).
+ 
+ If an extension is allowed to specify new INF fields, then a new field
+ ("SC"?) could be introduced, optionally with parameters that give the
+ hub more data about the actual share change, such as the last rehash
+ timestamp and the number of changed files. This would probably be the
+ cleanest solution, but it needs a protocol update for the BLOM ADC
+ extension.
  
  Within the currently defined standards another possibility is to do some
- client side trickery, an ugly hack to slightly fake SF or SS (eg. by
- incrementing one of them by 1) in each of this special share change case
- so then that triggers a Bloom update.
+ client side trickery, an ugly hack to slightly fake SF (e.g. by
+ incrementing it by 1) in each of these special share change cases, so
+ that it triggers a BLOM request for an updated filter.

** Summary changed:

- Certain type of changes in the share do not trigger a Bloom filter update which makes such changed files temporarily unsearchable
+ Certain type of changes in the share do not trigger a Bloom filter update which makes such changed files temporarily unsearchable by TTH

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/2009492

Title:
  Certain type of changes in the share do not trigger a Bloom filter
  update which makes such changed files temporarily unsearchable by TTH

Status in ADCH++:
  New
Status in DC++:
  Confirmed

Bug description:
  Update: rephrase and clarify the initial report.

  --------------

  TTH searches return no results for changed files (same path, different
  content) in the share after re-hashing, when an ADC client is connected
  to an ADC hub that uses Bloom filters for TTH searches. Any number of
  files can be affected.

  The cause is that the hub requests a new Bloom filter only when the
  number of shared files changes, i.e. when the INF coming from the
  client carries an SF field. In common cases, such as sharing an updated
  binary or editing a text file and then re-indexing, the file count
  stays the same, so no request is made. Changing fixed-size metadata,
  e.g. an MP3 ID3v1 tag, results in exactly this scenario.
  So the filter request is based on an inadequate signal that does not
  cover all common use cases.
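
  To illustrate the gap, here is a minimal C++ sketch of the behaviour
  described above. It is not the ADCH++ code: ShareSummary, InfUpdate,
  diffInf and hubRequestsFilter are hypothetical names, and the hub
  check is reduced to "SF is present in the INF".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of the share summary a client advertises in its INF.
struct ShareSummary {
    std::uint64_t fileCount;   // advertised as SF
    std::uint64_t totalBytes;  // advertised as SS
};

// Two-letter INF field code -> value, holding only the fields that changed.
using InfUpdate = std::map<std::string, std::string>;

// Client side: an INF carries only fields that were added or updated.
InfUpdate diffInf(const ShareSummary& before, const ShareSummary& after) {
    InfUpdate update;
    if (before.fileCount != after.fileCount)
        update["SF"] = std::to_string(after.fileCount);
    if (before.totalBytes != after.totalBytes)
        update["SS"] = std::to_string(after.totalBytes);
    return update;
}

// Hub side, as described in the report: a new Bloom filter is requested
// only when the INF contains SF. SS alone does not trigger a request.
bool hubRequestsFilter(const InfUpdate& update) {
    return update.count("SF") > 0;
}
```

  With this model, a re-hash that leaves the file count and total size
  unchanged yields an empty update, so no INF is sent and the hub keeps
  matching TTH searches against the old filter.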

  A solution would be something that signals a share change in general,
  or that provides the number of re-hashes in the current client session,
  or maybe the last rehash timestamp. Any of these signals would be
  adequate for requesting a new Bloom filter in all cases where files
  changed in a client's share.
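
  As a rough sketch of how such a signal could be used, assume the
  client tracks a re-hash counter and a last rehash timestamp and the
  hub remembers the values its current filter was built for. All names
  here (RehashSignal, recordRehash, filterIsStale) are hypothetical, and
  how the signal would reach the hub is left open.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical share change signal kept per client session.
struct RehashSignal {
    std::uint32_t rehashCount = 0;    // re-hashes completed in this session
    std::int64_t lastRehashTime = 0;  // Unix time of the last re-hash
};

// Client side: record a completed re-hash that changed the share.
void recordRehash(RehashSignal& signal, std::int64_t now) {
    ++signal.rehashCount;
    signal.lastRehashTime = now;
}

// Hub side: the stored filter is stale whenever the signal differs from
// the one that was current when the filter was received.
bool filterIsStale(const RehashSignal& seenByHub, const RehashSignal& current) {
    return seenByHub.rehashCount != current.rehashCount ||
           seenByHub.lastRehashTime != current.lastRehashTime;
}
```

  Unlike SF, this signal changes on every re-hash that touches the
  share, including same path, same size, different content changes.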

  Of course a BLOM-supporting client could force sending an INF SF after
  every re-hash that changes content in the share, but that is against
  the protocol, since an INF may only be sent when one of the field
  values changes, and in these special cases it would mean sending
  multiple INF SFs with the same SF value (see "Each time this is
  received, it means that the fields specified have been added or
  updated." in https://adc.sourceforge.io/ADC.html#_inf ).

  If an extension is allowed to specify new INF fields, then a new field
  ("SC"?) could be introduced, optionally with parameters that give the
  hub more data about the actual share change, such as the last rehash
  timestamp and the number of changed files. This would probably be the
  cleanest solution, but it needs a protocol update for the BLOM ADC
  extension.
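
  A possible shape for this, purely as an assumption: the client
  broadcasts an INF whose only field is SC with the last rehash time as
  a plain Unix timestamp, and the hub treats SC like SF when deciding
  whether to request a filter. Neither the SC field nor its value format
  is defined anywhere yet, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Build a broadcast INF that carries only the hypothetical SC field.
std::string makeShareChangeInf(const std::string& sid, std::int64_t lastRehashTime) {
    return "BINF " + sid + " SC" + std::to_string(lastRehashTime);
}

// True if the line has a named parameter starting with the given
// two-letter code. The command and the SID (first two tokens) are skipped.
bool hasNamedParam(const std::string& line, const std::string& code) {
    std::istringstream in(line);
    std::string token;
    int index = 0;
    while (in >> token) {
        if (index++ < 2) continue;
        if (token.compare(0, 2, code) == 0) return true;
    }
    return false;
}

// Hub side: request a new filter on SF (current behaviour per the report)
// or on the proposed SC field.
bool shouldRequestFilter(const std::string& infLine) {
    return hasNamedParam(infLine, "SF") || hasNamedParam(infLine, "SC");
}
```

  Because the timestamp differs after every re-hash, each such INF is a
  genuine field update and stays within the rule quoted from the ADC
  specification.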

  Within the currently defined standards another possibility is to do
  some client side trickery, an ugly hack to slightly fake SF (e.g. by
  incrementing it by 1) in each of these special share change cases, so
  that it triggers a BLOM request for an updated filter.
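
  One way the hack could look, as a sketch only: after a re-hash that
  changed content, the client picks an SF value that is guaranteed to
  differ from the one it last advertised. The function name and the
  choice to fall back to the real count when possible are assumptions,
  not DC++ behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Choose the SF value to advertise after a re-hash that changed content.
// The result always differs from lastAdvertised, so the INF SF is a real
// field update and the hub requests a new filter.
std::uint64_t nextAdvertisedSf(std::uint64_t realCount, std::uint64_t lastAdvertised) {
    if (realCount != lastAdvertised)
        return realCount;   // the count changed anyway, no faking needed
    return realCount + 1;   // hack: report one file more than is shared
}
```

  The advertised count is off by at most one, and it returns to the real
  value on the next content-only change.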

To manage notifications about this bug go to:
https://bugs.launchpad.net/adchpp/+bug/2009492/+subscriptions


