linuxdcpp-team team mailing list archive
Message #09229
[Bug 919424] Re: /rebuild does not update HashData
It's been a long time, but it looks like I've finally been able to
figure out this problem.
Regarding the original report, the part of the complaint about
hashdata.dat is invalid. The file is rebuilt correctly, but the initial
size of the binary file is 1 MiB, so it will never shrink below that.
With a hashdata file larger than that, rebuild works as expected.
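To illustrate why a small hashdata.dat never appears to shrink, here is a minimal sketch. It assumes the rebuilt file size is simply the larger of the 1 MiB initial size and the amount of live hash data. The names kInitialSize and hashDataSizeAfterRebuild are hypothetical and are not taken from the DC++ source.

```cpp
#include <algorithm>
#include <cstdint>

// Initial size of the binary hash data file, as stated above (1 MiB).
// Hypothetical constant name, not taken from the DC++ source.
constexpr std::int64_t kInitialSize = 1024 * 1024;

// Simplified model (assumption): after a rebuild the file holds only live
// data, but it is never smaller than its initial size.
std::int64_t hashDataSizeAfterRebuild(std::int64_t liveBytes) {
    return std::max(kInitialSize, liveBytes);
}
```

Under this model, a file holding 500 KB of live data still reports 1 MiB after a rebuild, while a file with 3 MiB of live data can shrink down to 3 MiB.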
Why rebuilding is more effective after a restart has been explained
well by maksis, and restarting is still the best practice: when shared
files are removed, their hash information is not removed from the
in-memory map containing the hash indexes. At restart the memory map is
freshly synced with what is found in the filesystem, so checking which
items are in use and which are obsolete is much more effective at that
point.
Therefore, if you remove files from the share, rehash and then do a
rebuild, the rebuild will not make your hashindex or hashdata file any
slimmer unless obsolete data of items removed in previous sessions is
still present.
Regarding what is removed from hashindex and what isn't, david.son is
right, and so is the logic of his recommended solution. Currently only
items with an unshared TTH are removed.
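The difference between the two pruning rules can be sketched as follows. This is a simplified model, not the DC++ implementation: the index is reduced to a map from file path to TTH, and all names (HashIndex, pruneByTth, pruneByPath) are hypothetical. The second function is only one assumed reading of a stricter rule; the exact logic david.son proposed is not quoted in this message.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified hash index: file path -> TTH root.
using HashIndex = std::map<std::string, std::string>;

// Behaviour described above: an entry is dropped only when its TTH is no
// longer shared by any file, so stale paths with a still-shared TTH survive.
HashIndex pruneByTth(const HashIndex& index,
                     const std::set<std::string>& sharedTths) {
    HashIndex kept;
    for (const auto& entry : index) {
        if (sharedTths.count(entry.second) > 0) {
            kept.insert(entry);
        }
    }
    return kept;
}

// Assumed stricter rule: an entry is dropped whenever the file itself is
// no longer part of the share, regardless of its TTH.
HashIndex pruneByPath(const HashIndex& index,
                      const std::set<std::string>& sharedPaths) {
    HashIndex kept;
    for (const auto& entry : index) {
        if (sharedPaths.count(entry.first) > 0) {
            kept.insert(entry);
        }
    }
    return kept;
}
```

For example, if /a/x.iso is shared and /b/x-copy.iso is an unshared file with the same TTH, pruneByTth keeps both entries while pruneByPath keeps only /a/x.iso.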
The puzzling thing is that the code responsible for this is pretty
logical and would easily allow doing what david.son expects.
The first version of the rebuild code that actually does something with hashindex (and not just with hashdata) was added in https://bazaar.launchpad.net/~dcplusplus-team/dcplusplus/trunk/revision/545
It has been refactored and simplified several times since, but the feature we miss here was never added, even though it would have been a very small and logically fitting change...
Why? I'm not sure. Maybe it was a simple oversight, or rather, back in
the day, the exceptionally talented people who created and shaped DC++
thought it better to have a somewhat larger index file, with items kept
for possible reuse, than to risk re-hashing. This might have been pretty
logical 20 years ago, given the CPU and storage speeds and capacities of
consumer computers of the time. An average share consisted of a few
thousand files back then...
I'll test the fix on larger shares and most probably add it to the next
version of DC++. Testers are welcome if there's still anyone who
cares...
** Summary changed:
- /rebuild does not update HashData
+ /rebuild does not remove all obsolete items from HashIndex
--
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/919424
Title:
/rebuild does not remove all obsolete items from HashIndex
Status in ApexDC++:
New
Status in DC++:
Confirmed
Bug description:
BCDC 0.790a.
Windows 7 SP1 x64.
Summary:
If you add a new share, hash the files, and then remove it, issue a /rebuild and then re-add the same share again, DC++ will not rehash the files. Furthermore, a /rebuild between adding/removing shares does not change the size of hashdata.dat or hashindex.xml.
Repro:
* add a new share. accept the default name. DC++ hashes the files.
* open your filelist to validate share is present.
* check the size of hashdata and hashindex.
* remove the share you just added.
* issue a /refresh and /rebuild.
* note the size of hashdata and hashindex has NOT changed.
* add the same share again, accepting the default name. DC++ does NOT hash the files.
* issue a /refresh. DC++ does not re-index the files from that share.