u1db-discuss team mailing list archive

Re: Index collation - was: Indexing and lists

 

On 11/18/2011 09:32 AM, John Arbash Meinel wrote:
...

A question - how are the indexes collated? I.e., does "Élèphant"
sort before or after "Elephant"? With the current locale of the
executing user? That could turn out "interesting" if I have two
boxes, one in English and one in Danish...

This seems like a tricky problem unless the collation must be
specified at index creation time.
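
To make the collation question concrete, here is a minimal sketch using Python's standard sqlite3 module. It is not u1db's actual index code. The collation name "accent_insensitive", the key function, and the table layout are all assumptions made up for illustration. It shows SQLite's default byte-wise ordering next to a collation that is named explicitly when the index is created, so the result does not depend on the locale of the machine running the query.

```python
import sqlite3
import unicodedata


def strip_accents_key(text):
    # Hypothetical locale-independent key: decompose characters,
    # drop combining marks, then casefold.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def accent_insensitive(a, b):
    # Collation callback: negative, zero, or positive, like C's strcmp.
    ka, kb = strip_accents_key(a), strip_accents_key(b)
    if ka != kb:
        return (ka > kb) - (ka < kb)
    # Tie-break on the raw strings so the ordering stays deterministic.
    return (a > b) - (a < b)


conn = sqlite3.connect(":memory:")
conn.create_collation("accent_insensitive", accent_insensitive)
conn.execute("CREATE TABLE words (w TEXT)")
# The collation is fixed when the index is created, not taken from the locale.
conn.execute("CREATE INDEX words_idx ON words (w COLLATE accent_insensitive)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    # "\u00c9l\u00e8phant" is "Élèphant" written with precomposed characters.
    [("Zebra",), ("\u00c9l\u00e8phant",), ("Elephant",)],
)

# Default BINARY collation: plain byte comparison of the UTF-8 text.
binary_order = [r[0] for r in conn.execute("SELECT w FROM words ORDER BY w")]
# Explicit collation: same rows, different order.
custom_order = [
    r[0]
    for r in conn.execute("SELECT w FROM words ORDER BY w COLLATE accent_insensitive")
]

print(binary_order)
print(custom_order)
```

With the default collation the accented word sorts after "Zebra"; with the explicit one it sorts next to "Elephant". Any other box that wants to query that index has to register the same collation function first.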

Cheers, Mikkel

ATM, querying an index does *not* sort the returned documents. So
applications are free to sort them however they like.

That's what I mean. An index, unless it's a hash index (which sqlite
doesn't even support), has an intrinsic ordering in which results will
be returned.
And if client-side sorting is enforced, there's really no point in not
just loading the whole dataset and doing the querying client side as
well. You'll need the whole dataset anyway to get the sorting correct:
requesting the first 10 docs for term T on index I and then sorting
those docs is unlikely to be correct, because it might well be doc11,
or any docN with N > 10, that should go in the first position.
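
The paging problem can be sketched in a few lines of plain Python. The document names and the arrival order below are invented for illustration and stand in for whatever unsorted order a query on index I happens to return.

```python
# Hypothetical result of querying term T on index I, in arbitrary
# (unsorted) order. The doc that should sort first arrives 11th.
numbers = (7, 3, 12, 9, 2, 15, 4, 11, 8, 6, 1, 14, 10, 5, 13)
docs = ["doc%02d" % n for n in numbers]


def first_page_then_sort(results, page_size):
    # Fetch only the first page, then sort it client side.
    return sorted(results[:page_size])


def sort_then_first_page(results, page_size):
    # Sort the whole result set, then take the first page.
    return sorted(results)[:page_size]


naive = first_page_then_sort(docs, 10)
correct = sort_then_first_page(docs, 10)
# Docs that belong on the first page but never reached the client.
missing = sorted(set(correct) - set(naive))

print(naive[0], correct[0], missing)
```

Here the naive first page starts with doc02 and never sees doc01, doc05 or doc10, so the only way to get a correct first page client side is to fetch every matching document.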
Cheers,
Mikkel

