launchpad-dev team mailing list archive

Re: Collation orders in Launchpad

 

On 3 June 2011 16:11, Jeroen Vermeulen <jtv@xxxxxxxxxxxxx> wrote:
> Hi all,
>
> We've had a merge proposal to make sorting of subscriber names
> case-insensitive:
>
> https://code.launchpad.net/~nigelbabu/launchpad/spec-sub-sort/+merge/63315
>
> This harks back to a problem we've run into before: IIRC we run Launchpad in
> the C locale, which sorts unpleasantly.  It may seem like a small thing if
> you're used to Latin script but it could actually be quite bad with other
> languages.  Any human-readable locale such as en_US.UTF-8 would fix the
> problem.
>
> As I recall, technical uncertainty stopped us from fixing this at the time.
> It may have been because we were still running dapper back then, and its
> locale support differed too much from lucid's.
>
> Could we try again, or perhaps find some narrower way to improve
> user-visible sort ordering?

A few things:

Nigelb has already landed some other changes.  I am so happy that he is
fixing these and apparently finding the process of contributing to
Launchpad worthwhile.  If he's planning to fix other similar bugs, that
would be great, but it'd be good to do it systematically.

As a simple place to start, maybe there should just be a "compare human
strings" function that can be passed to sort(cmp=), so that at least
the .lower() is not repeated at every call site.
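
For illustration, here is a minimal sketch of what such a helper might
look like.  The names compare_human_strings and human_sort_key are
hypothetical, not existing Launchpad code.  It is written for Python 3,
where sort(cmp=) no longer exists, so the comparison function is wrapped
with functools.cmp_to_key; on Python 2 the function could be passed to
sort(cmp=) directly.

```python
import functools


def compare_human_strings(a, b):
    """cmp-style comparison that ignores case.

    Hypothetical helper: compares the lowered forms first and falls
    back to the raw strings so that the ordering is deterministic.
    """
    key_a = (a.lower(), a)
    key_b = (b.lower(), b)
    # Returns -1, 0 or 1, as the old built-in cmp() did.
    return (key_a > key_b) - (key_a < key_b)


# Adapter for sorted()/list.sort(), which only accept key= in Python 3.
human_sort_key = functools.cmp_to_key(compare_human_strings)

names = ["bob", "Alice", "alice", "Carol"]
print(sorted(names, key=human_sort_key))
```

Keeping the comparison in one place means that a later switch to
locale-aware collation would only need to touch this one function.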

Ideally the sort would be consistent with whatever PostgreSQL does.
(Or is that maybe case sensitive?)

Maybe we should actually use locale.strcoll, rather than comparing the
lowered forms?  <http://docs.python.org/library/locale.html>  ISTR
this is rather better on non-English names.  In en_AU.UTF-8 it is
case insensitive, though it is case sensitive in the C locale.
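
A rough sketch of the locale.strcoll approach, assuming Python 3.  The
helper name sorted_in_locale is made up for this example.  It switches
LC_COLLATE temporarily, which is process-wide and not thread-safe, so
treat it as a way to experiment rather than something to drop into a
web application as is.

```python
import functools
import locale


def sorted_in_locale(names, locale_name):
    """Return names sorted by the collation rules of locale_name.

    Hypothetical helper.  Raises locale.Error if the requested locale
    is not installed on this system.
    """
    # Calling setlocale() without a new value only queries the setting.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(names, key=functools.cmp_to_key(locale.strcoll))
    finally:
        # Restore the previous collation order for the rest of the process.
        locale.setlocale(locale.LC_COLLATE, previous)


names = ["b", "A", "a", "B"]

# The C locale is always available and compares by code point,
# so every upper-case letter sorts before every lower-case one.
print(sorted_in_locale(names, "C"))

# A human-readable locale may not be installed everywhere.
try:
    print(sorted_in_locale(names, "en_AU.UTF-8"))
except locale.Error:
    print("en_AU.UTF-8 is not available on this system")
```

Running both calls side by side on a machine that has the locale
installed is a quick way to see how the two orderings differ.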

Martin

