
u1db-discuss team mailing list archive

Re: Index collation - was: Indexing and lists

 

On 11/18/2011 09:32 AM, John Arbash Meinel wrote:

...

A question - how are the indexes collated? I.e. does "Élèphant"
sort before or after "Elephant"? With the current locale of the
executing user? That could turn out "interesting" if I have two
boxes one in English and one in Danish...

This seems like a tricky problem unless the collation must be
specified at index creation time.

Cheers, Mikkel

ATM, querying an index does *not* sort the returned documents. So
applications are free to sort them however they like.


That's what I mean. An index, unless it's a hash index (which sqlite doesn't even support), has an intrinsic ordering in which results will be returned.
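
To illustrate the point about intrinsic ordering, here is a minimal sketch using Python's sqlite3 module. The table, index, and collation names are made up for the example and are not u1db's actual schema. It shows that the default BINARY collation compares the encoded bytes, so "Élèphant" lands after "Zebra", while a collation chosen explicitly (here a hypothetical accent-insensitive one) gives a different order:

```python
import sqlite3
import unicodedata


def fold(s):
    # Strip combining accents and case so that "É" compares like "e".
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def accent_insensitive(a, b):
    # Compare on the folded form first; fall back to the raw string to break ties.
    ka, kb = (fold(a), a), (fold(b), b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
# Hypothetical collation name, registered for this connection only.
conn.create_collation("accent_insensitive", accent_insensitive)
conn.execute("CREATE TABLE docs (title TEXT)")
conn.execute("CREATE INDEX docs_title ON docs (title)")
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [("Zebra",), ("\u00c9l\u00e8phant",), ("Elephant",)],
)

# Default (BINARY) collation: plain byte comparison.
binary_order = [
    row[0] for row in conn.execute("SELECT title FROM docs ORDER BY title")
]
# Explicit collation: the order now depends on the comparison function.
folded_order = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM docs ORDER BY title COLLATE accent_insensitive"
    )
]

print(binary_order)
print(folded_order)
```

Either ordering is defensible; the point is only that some ordering is always baked in, so it has to be chosen somewhere.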

And if client-side sorting is enforced, there's really no point in not just loading the whole dataset and doing the querying client side as well. You'll need the whole dataset anyway to get the sorting correct: requesting the first 10 docs for term T on index I and then sorting those docs is unlikely to be correct, because it might well be doc11, or any docN with N > 10, that should go in the first position.
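
A small sketch of the doc11 problem, in plain Python with made-up document ids and titles (nothing here is u1db API). The list stands in for the order in which an index query happens to return documents; sorting only the first 10 misses the document that should come first:

```python
# Titles in the (arbitrary) order the index returns them; "apple" is doc11.
titles = [
    "kiwi", "lemon", "mango", "nectarine", "orange",
    "papaya", "quince", "raspberry", "strawberry", "tangerine",
    "apple",
]
docs = [("doc%d" % i, title) for i, title in enumerate(titles, start=1)]


def first_page_then_sort(docs, n):
    # Fetch the first n results, then sort them client side.
    return sorted(docs[:n], key=lambda d: d[1])


def sort_then_first_page(docs, n):
    # Load everything, sort client side, then take the first n.
    return sorted(docs, key=lambda d: d[1])[:n]


wrong = first_page_then_sort(docs, 10)
right = sort_then_first_page(docs, 10)

print(wrong[0])  # first result when only the first 10 docs were fetched
print(right[0])  # first result when the whole dataset was sorted
```

Only the second variant is correct, and it needs every document for the term to be loaded before the first page can be shown.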

Cheers,
Mikkel

