software-store-developers team mailing list archive

Re: top rated - data analysis

To: Aaron Peachey <alpeachey@xxxxxxxxx>
From: Michael Vogt <mvo@xxxxxxxxxx>
Date: Mon, 25 Jul 2011 10:07:45 +0200
Cc: software-store-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CAEXdADw9=imUB-nDK7MtuHU6fuwF+WL8+e1e_PoNg7vxMu50fw@mail.gmail.com>
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Jul 23, 2011 at 11:09:39PM +1000, Aaron Peachey wrote:
> Hi all,
Hi,

> You may have seen the top-rated has been partially implemented (thanks
> mvo) with a new carousel on the home screen. It may not quite be
> working properly yet and otherwise still needs some work, but in the
> mean time now that the server has histograms for all packages and this
> means we now have a dampened rating for them all too, we thought it a
> good time to take a look at the data to verify the usefulness of the
> algorithm and I have attached a spreadsheet file with the rankings as
> at today (Sat 23/7/11 Australian GMT+10).

Nice, thanks for providing us with this data!

> It has had an interesting impact, as you can see in the spreadsheet,
> by comparing each package's overall ranking against other packages
> based on the dampened rating to the same ranking if we were using
> average rating. I would definitely say that this algorithm seems to
> provide a fairer approach to determining the 'top rated' apps but it
> is interesting which apps fit into our top 12 (gparted at #1 is not
> exactly what i was expecting!).

When I looked at the top-rated ones, I was pretty surprised as
well. gparted? I like it too, but our #1 app? As Matthew Paul Thomas
already pointed out, this will probably rebalance once this feature
becomes more widely available.

As for the algorithm, I agree with you; I think the approach is indeed
much better than a simple average. One thing I'm noticing is that we
may not be giving enough weight to the total number of ratings,
e.g. cairo-dock-core with "just" 16 ratings or gelemental with just 11
rank pretty high, whereas "clementine", "thunderbird", "audacity" or
"gimp" all have many more people rating them and their ratings are
still pretty good. I played a bit with the algorithm and added a
utils/show_top_rated_for_various_powers.py to trunk. It seems a simple
tweak towards power=0.1 or 0.05 helps. The disadvantage of this is of
course that new apps have a harder time floating to the top.
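
To illustrate the effect of the power parameter, here is a rough
sketch. It is not the code from trunk: it assumes the dampened rating
is built from a Wilson score lower bound per star level, with "power"
acting as the significance level. The function names and the sample
histograms are made up for the example.

```python
import math
from statistics import NormalDist


def wilson_lower_bound(pos, n, power=0.2):
    """Lower bound of the Wilson score interval for pos hits out of n.

    Assumption: `power` is used as the significance level, so a smaller
    value gives a wider interval and therefore a lower bound.
    """
    if n == 0:
        return 0.0
    z = NormalDist().inv_cdf(1 - power / 2)
    phat = pos / n
    centre = phat + z * z / (2 * n)
    spread = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - spread) / (1 + z * z / n)


def dampened_rating(histogram, power=0.2):
    """Hypothetical dampened rating for a histogram of 1..5 star counts.

    Starts at the neutral value 3 and moves up or down by the damped
    share of each star level, so few votes stay closer to 3.
    """
    total = sum(histogram)
    score = 3.0
    for stars, count in enumerate(histogram, start=1):
        score += (stars - 3) * wilson_lower_bound(count, total, power)
    return score


# Made-up histograms: both apps have a plain average of exactly 5.0.
few_votes = [0, 0, 0, 0, 11]
many_votes = [0, 0, 0, 0, 100]

for power in (0.2, 0.1, 0.05):
    print(power,
          round(dampened_rating(few_votes, power), 3),
          round(dampened_rating(many_votes, power), 3))
```

Under these assumptions a lower power pulls the app with 11 votes down
more than the one with 100 votes, which matches the trade-off above:
well-rated apps with many votes rank higher, and new apps need more
votes to get to the top.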

> The other key is that the additional decimal precision of dampened
> rating allows us to stop the plethora of apps with only 5 star ratings
> from 'clogging' the top end without any differentiation between them
> (except for alpha sort by package name), which they currently do if we
> use average rating since all those with only votes of 5 immediately
> float to the top and are unable to be differentiated from one another.

Yeah, that is another nice benefit!

Cheers,
 Michael

References

top rated - data analysis
From: Aaron Peachey, 2011-07-23