maria-developers team mailing list archive
Message #05211
Re: my_hash_sort_bin
Hi, Mark!
On Feb 19, MARK CALLAGHAN wrote:
>
> we realized that the hash function used in my_hash_sort_bin is lousy for
> this input: test.sbtest1, test.sbtest2, ..., test.sbtest10. The problem is
> made worse when a small number of hash buckets is used because the hash
> function output doesn't do the right thing for the least significant bits
> so that all 10 inputs map to the same hash bucket. More details are at
> http://bugs.mysql.com/bug.php?id=66473
>
> The InnoDB hash function is much better. Details for that and a test
> program are in the bug report. Does anyone remember why this hash function
> was chosen?
>
> strings/ctype-bin.c doesn't have any comments explaining why this hash
> function was selected. This is another peeve for me. Critical code like
> this should be explained if we expect anyone new to begin working on this
> code.
This is the MySQL hash function from the dawn of time. Before the charset
code was added, it lived in mysys/hash.c, and I found it unchanged as far
back as mysql-3.20.13 (from 1997).
Regards,
Sergei