
larry-discuss team mailing list archive

Re: groupmean reduce

 

On Mon, May 3, 2010 at 8:30 PM,  <josef.pktd@xxxxxxxxx> wrote:
> On Mon, May 3, 2010 at 8:01 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>> On Mon, May 3, 2010 at 1:13 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>> Here is a simple implementation of a reduce option in groupmean;
>>> essentially it is two functions in one.
>>>
>>> See https://blueprints.launchpad.net/larry/+spec/group-method-design
>>> As a standalone function it could also be plugged into other larry
>>> methods, e.g. larry.mean
>>>
>>> Only tested on the example in the file.
>>>
>>> Josef
>>
>> A reduce option would be very handy. And it's very handy to have your
>> implementation to get a feel for how it would work. Thank you.
>>
>> BTW, what do you think of a weight input to the group-like functions?
>> It could be used, for example, to calculate a weighted group mean.
>> The weight could be 1d or have the same number of dimensions as the
>> input array.
>
> Just to clarify: how would you interpret and use the weights?
> For example, if the weights are firm sizes, then you would want
> size-weighted averages for each sector.
>
> It would also need a weights option in nanmean.
>
> group_mean and nanmean would be useful with weights, but I don't know
> what a weighted group_ranking would mean. group_median: would it be
> the 50th percentile (in terms of weights or like a distribution)?

Good point.
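
For group_mean at least, the firm-size interpretation is easy to
sketch. The function below is only an illustration in plain NumPy:
weighted_group_mean is a hypothetical name, not something in la, and
giving NaN observations zero weight is an assumption about how the
nan-handling would work.

```python
import numpy as np

def weighted_group_mean(values, groups, weights):
    # Hypothetical helper for illustration; not part of larry's API.
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    labels, inverse = np.unique(np.asarray(groups), return_inverse=True)
    # Assumption: a NaN observation contributes nothing, so zero its weight.
    valid = ~np.isnan(values)
    w = np.where(valid, weights, 0.0)
    v = np.where(valid, values, 0.0)
    # Per-group sum of weights and sum of weight * value.
    wsum = np.bincount(inverse, weights=w, minlength=len(labels))
    wvsum = np.bincount(inverse, weights=w * v, minlength=len(labels))
    # A group with no valid observations yields NaN instead of raising.
    with np.errstate(invalid="ignore", divide="ignore"):
        return labels.tolist(), wvsum / wsum

# Made-up data: firm returns, firm sizes as weights, sectors as groups.
returns = [0.10, 0.20, 0.05, np.nan]
sizes = [1.0, 3.0, 2.0, 5.0]
sectors = ["tech", "tech", "energy", "energy"]
sector_labels, sector_means = weighted_group_mean(returns, sectors, sizes)
```

The same sums do not carry over to ranking or median, which is where
the interpretation question comes in.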

I'm not sure what to do with the group methods. At the moment
group_mean does not reduce, which I think would surprise most people.
So one way to go would be to break backward compatibility in la 0.3
and set reduce to True by default. That would work for reducing
functions like mean, sum, and max. But non-reducing functions like
zscore, ranking, and demean do not fit the pattern. So that puts me
back to setting reduce to False by default.

reduce=True and reduce=False are two very different ideas. reduce=True
returns a larry with group labels; reduce=False returns a larry with
whatever labels it originally had.
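
To make the difference concrete, here is a rough sketch on a 1d array
in plain NumPy. It is not la code: the function name, the signature,
and the choice to write the group mean into every position (including
positions that were NaN) are all assumptions for illustration.

```python
import numpy as np

def group_mean_sketch(values, groups, reduce=False):
    # Hypothetical helper for illustration; not part of larry's API.
    values = np.asarray(values, dtype=float)
    labels, inverse = np.unique(np.asarray(groups), return_inverse=True)
    # NaN-aware per-group sums and counts.
    valid = ~np.isnan(values)
    sums = np.bincount(inverse, weights=np.where(valid, values, 0.0),
                       minlength=len(labels))
    counts = np.bincount(inverse, weights=valid.astype(float),
                         minlength=len(labels))
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    if reduce:
        # One value per group; the result is labeled by group.
        return labels.tolist(), means
    # Same length as the input; the original element labels still apply.
    return means[inverse]

x = [1.0, 3.0, 5.0, np.nan, 7.0]
g = ["a", "a", "b", "b", "a"]
group_labels, reduced = group_mean_sketch(x, g, reduce=True)
broadcast = group_mean_sketch(x, g, reduce=False)
```

With reduce=True the output has one entry per group, so the axis labels
have to change; with reduce=False the output lines up with the input
element by element.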

> There is also an attachment to a scipy.stats trac ticket that does
> descriptive statistics with weights and nan-handling.
>
> Josef
>


