larry-discuss team mailing list archive

Re: New features: totuples, fromtuples

To: larry-discuss@xxxxxxxxxxxxxxxxxxx
From: josef.pktd@xxxxxxxxx
Date: Sun, 31 Jan 2010 15:56:15 -0500
In-reply-to: <f4f93d421001311244o7665dbc6w5b1b87ac3e389f0b@mail.gmail.com>

On Sun, Jan 31, 2010 at 3:44 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Sun, Jan 31, 2010 at 12:38 PM,  <josef.pktd@xxxxxxxxx> wrote:
>> On Sun, Jan 31, 2010 at 3:11 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Sun, Jan 31, 2010 at 12:05 PM,  <josef.pktd@xxxxxxxxx> wrote:
>>>> On Sun, Jan 31, 2010 at 2:57 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> Record as in numpy record array?
>>>>
>>>> Kind of, as in a row in a structured array, not really a record array,
>>>> which adds some candy that's not really worth the effort.
>>>> tabular is based on structured arrays (they moved away from record
>>>> arrays), and scikits timeseries torecords() produces structured arrays,
>>>> not record arrays.
>>>
>>> I don't know the format for a structured array. The first google hit
>>> (scipy docs) says: "Structured Arrays (aka Record Arrays)"
>>>
>>> What would this look like in structured array format?
>>>
>>>>> y = la.larry([[1.0, 2.0], [3.0, 4.0]], [['a', 'b'], ['c', 'd']])
>>>>> y
>>> label_0
>>>    a
>>>    b
>>> label_1
>>>    c
>>>    d
>>> x
>>> array([[ 1.,  2.],
>>>       [ 3.,  4.]])
>>>
>>> Oops, we've gone off list.
>>>
>> back on list
>>
>> I think record arrays have gone a bit out of fashion over the last two
>> years that I have been following the mailing lists. Most discussion there
>> is about structured arrays, which have the same dtype structure as record
>> arrays but without the dotted (attribute) access to columns.
>>
>> Here is an attempt, not really general:
>>
>> import numpy as np
>> import la
>>
>> y = la.larry([[1.0, 2.0], [3.0, 4.0]], [['a', 'b'], ['c', 'd']])
>> # one string field for the axis-0 labels, one float field per axis-1 label
>> ysr = np.empty(y.x.shape[0],
>>                dtype=[('index', 'S1')] + [(i, float) for i in y.label[1]])
>> ysr['index'] = y.label[0]
>> for i in ysr.dtype.names[1:]:
>>     ysr[i] = y[y.labelindex(i, axis=1)].x
>>
>>
>>>>> ysr
>> array([('a', 1.0, 3.0), ('b', 2.0, 4.0)],
>>      dtype=[('index', '|S1'), ('c', '<f8'), ('d', '<f8')])
>>>>> ysr.shape
>> (2,)
>>>>> ysr[0]
>> ('a', 1.0, 3.0)
>>>>> ysr[1]
>> ('b', 2.0, 4.0)
>>
>> Adding the labels as the first column makes it a bit more difficult;
>> otherwise it would just be a view on y.x with a structured dtype.
>>
>> What is the best way to access a larry column by label name?
>
> If you only want to pull one row then you can use:
>
>>> y.pull('a', 0)
>
> label_0
>    c
>    d
> x
> array([ 1.,  2.])
>
> I have experimental support (not in trunk) for indexing with label names. So
>
> y.index[['a']]
>
> or
>
> y.index[['a'],:]

Looks nice. Access by label/index name, as in pandas, can be very
convenient, at least for variable names. I'm not much used to
accessing time periods by date.
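
For reference, a minimal NumPy-only sketch of the structured array idea
discussed above. It does not use larry at all: `data`, `row_labels` and
`col_labels` are hypothetical stand-ins for y.x, y.label[0] and y.label[1],
and the 'U1' string field is an assumption for Python 3 (the quoted code
used 'S1'). Each float field here holds one column of the data. The last
two steps show the view without a label column and the dotted access that
a record array adds on top of the same dtype.

```python
import numpy as np

# Hypothetical stand-ins for a larry's data and labels (no `la` import needed).
data = np.array([[1.0, 2.0], [3.0, 4.0]])
row_labels = ['a', 'b']
col_labels = ['c', 'd']

# One string field for the row label, then one float field per column label.
dtype = [('index', 'U1')] + [(name, float) for name in col_labels]
ysr = np.empty(data.shape[0], dtype=dtype)
ysr['index'] = row_labels
for j, name in enumerate(col_labels):
    ysr[name] = data[:, j]  # field `name` holds column j of the data

print(ysr[0])    # the record for the row labelled 'a'
print(ysr['d'])  # column access by label name

# Without the label column, the same data is just a view with a structured
# dtype: no copy is made, and each row becomes one record.
view = data.view([(name, float) for name in col_labels]).reshape(-1)

# A record array has the same dtype but adds dotted access to the fields.
rec = ysr.view(np.recarray)
print(rec.c)
```

The view only works this directly because the data is C-contiguous and all
columns share one dtype; adding the label column forces the copy into a new
array, as in the loop above.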

Josef


