unity-design team mailing list archive

Search algorithm in Ubuntu Software Center

 

Anup Verma wrote on 03/10/11 12:26:
> Friends, don't you find any problem with the algorithm used to search
> in the software center. Giving you an example:
>
> Let us search for MultiGet. When I write "Mult" in the search MultiGet
> appears at the 6th position. As soon as I add 'i', I see that
> surprisingly, MultiGet now appears at 10th position with some rather
> irrelevant options before it.
>
> Not understandable...
> ...

These are the items that appear for "multi" before MultiGet:
*   Qtractor, "a multi-track sequencer"
*   qutIM, "a multi-protocol IM client"
*   Teeworlds, "an online multi-player platform 2D shooter"
*   BasKet, "a multi-purpose note-taking application for KDE"
*   ROXTerm, "Multi-tabbed GTK/VTE terminal emulator"
*   Babiloo, "dictionary viewer with multi-languages support"
*   Jokosher, "Simply and easily create multi-track audio"
*   Empathy, "GNOME multi-protocol chat and call client"
*   ACM, "a multi-player aerial combat simulation".

Though USC does take into account the possibility that the last word you typed is only part of a word, it gives more weight to results where that last word is a complete word. And it can't tell the difference between a complete word and a hyphenated prefix like "multi-".

All of those higher results use "multi-" in their summary or synopsis. MultiGet uses "multi-" only in its description.

So that's why more items appear above MultiGet for "multi" than for "mult": as soon as you type the "i", the packages that use "multi-anything" start being more important than those that use "multianything". And the packages that use "multi-anything" in their title or summary are treated as more important than those that have "multi" only as part of a longer word in their name *or* use "multi-" only in their description.
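
To make that concrete, here is a rough sketch of the kind of weighting described above. It is not USC's actual code: the field weights, the bonus values, and the MultiGet summary and description text are all made up for illustration. Only the Qtractor summary comes from the list above. It shows how a whole-word match in a heavily weighted field can overtake a prefix match in the name once the "i" is typed.

```python
import re

# Hypothetical weights, chosen only to illustrate the behaviour;
# they are NOT taken from Ubuntu Software Center.
FIELD_WEIGHT = {"name": 1.0, "summary": 1.0, "description": 0.3}
WHOLE_WORD_BONUS = 3.0  # the typed term matches a complete word
PREFIX_BONUS = 1.0      # the typed term is only the start of a word


def tokenize(text):
    """Lowercase and split on non-alphanumerics: "multi-track" -> ["multi", "track"]."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(term, package):
    """Sum a per-field score; a whole-word hit beats a prefix hit."""
    term = term.lower()
    total = 0.0
    for field, weight in FIELD_WEIGHT.items():
        words = tokenize(package.get(field, ""))
        if term in words:
            total += weight * WHOLE_WORD_BONUS
        elif any(word.startswith(term) for word in words):
            total += weight * PREFIX_BONUS
    return total


qtractor = {
    "name": "Qtractor",
    "summary": "a multi-track sequencer",
    "description": "",
}
multiget = {
    "name": "MultiGet",
    # Placeholder text (assumption): the real summary and description differ.
    "summary": "file downloader",
    "description": "supports multi-threaded downloads",
}

for term in ("mult", "multi"):
    ranked = sorted([qtractor, multiget], key=lambda p: score(term, p), reverse=True)
    print(term, [p["name"] for p in ranked])
```

Because the tokenizer splits on the hyphen, "multi-track" yields the complete word "multi", which is exactly why a hyphenated prefix is indistinguishable from a whole word here. The sketch only reproduces the change in relative order, not the exact positions reported above.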

If there is a bug here, it is that BasKet, ROXTerm, and Empathy use "multi-" words only in their package synopsis, which USC doesn't show at all -- so it's not obvious why they're being weighted as they are. I've reported that now. <http://launchpad.net/bugs/865294>

Thanks
--
mpt


