larry-discuss team mailing list archive
Message #00006
Re: New features: totuples, fromtuples
On Sun, Jan 31, 2010 at 3:44 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> On Sun, Jan 31, 2010 at 12:38 PM, <josef.pktd@xxxxxxxxx> wrote:
>> On Sun, Jan 31, 2010 at 3:11 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>> On Sun, Jan 31, 2010 at 12:05 PM, <josef.pktd@xxxxxxxxx> wrote:
>>>> On Sun, Jan 31, 2010 at 2:57 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
>>>>> Record as in numpy record array?
>>>>
>>>> Kind of: as in a row of a structured array, not really a record array,
>>>> which adds some candy that's not really worth the effort.
>>>> tabular is based on structured arrays (they moved away from record
>>>> arrays), and scikits timeseries torecords() produces structured arrays
>>>> not record arrays.
>>>
>>> I don't know the format for a structured array. The first google hit
>>> (scipy docs) says: "Structured Arrays (aka Record Arrays)"
>>>
>>> What would this look like in structured array format?
>>>
>>> >> y = la.larry([[1.0, 2.0], [3.0, 4.0]], [['a', 'b'], ['c', 'd']])
>>> >> y
>>> label_0
>>>     a
>>>     b
>>> label_1
>>>     c
>>>     d
>>> x
>>> array([[ 1.,  2.],
>>>        [ 3.,  4.]])
>>>
>>> Oops, we've gone off list.
>>>
>> back on list
>>
>> I think record arrays have gone a bit out of fashion over the last two
>> years, while I have been following the mailing lists. Most discussion
>> on the lists is about structured arrays, which have the same dtype
>> structure as record arrays but without the dotted access to columns.
>>
>> Here is an attempt, not really general:
>>
>> import numpy as np
>> import la
>>
>> y = la.larry([[1.0, 2.0], [3.0, 4.0]], [['a', 'b'], ['c', 'd']])
>> # one string field for the labels, then one float field per label in y.label[1]
>> ysr = np.empty(y.x.shape[0],
>>                dtype=[('index', 'S1')] + [(i, float) for i in y.label[1]])
>> ysr['index'] = y.label[0]
>> for i in ysr.dtype.names[1:]:
>>     ysr[i] = y[y.labelindex(i, axis=1)].x
>>
>>
>> >>> ysr
>> array([('a', 1.0, 3.0), ('b', 2.0, 4.0)],
>>       dtype=[('index', '|S1'), ('c', '<f8'), ('d', '<f8')])
>> >>> ysr.shape
>> (2,)
>> >>> ysr[0]
>> ('a', 1.0, 3.0)
>> >>> ysr[1]
>> ('b', 2.0, 4.0)
>>
>> Adding the labels as the first column makes it a bit more difficult;
>> otherwise it would just be a view on y.x with a structured dtype.
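The two cases described above (a plain view versus a copy that carries a label column) can be sketched with NumPy alone. This is an illustration under assumptions, not larry's implementation: `x`, `row_labels` and `col_labels` are stand-ins for `y.x`, `y.label[0]` and `y.label[1]`, and each record here holds one row of the array.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])   # stand-in for y.x
row_labels = ['a', 'b']                   # stand-in for y.label[0]
col_labels = ['c', 'd']                   # stand-in for y.label[1]

# Case 1: values only. Each row of the contiguous float array is
# reinterpreted as one record, so this is a view and copies nothing.
values_dtype = np.dtype([(name, x.dtype) for name in col_labels])
xv = np.ascontiguousarray(x).view(values_dtype).reshape(-1)

# Case 2: labels in the first field. The string field has a different
# type from the floats, so a new array has to be allocated and filled.
# 'U1' assumes single-character labels.
full_dtype = np.dtype([('index', 'U1')] + [(name, x.dtype) for name in col_labels])
rec = np.empty(x.shape[0], dtype=full_dtype)
rec['index'] = row_labels
for j, name in enumerate(col_labels):
    rec[name] = x[:, j]                   # column j of x fills field `name`

print(xv['c'], np.shares_memory(xv, x))
print(rec)
```

The view in case 1 only works because all columns share one dtype; adding the string column is what forces the copy in case 2.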
>>
>> What is the best way to access a larry column by label name?
>
> If you only want to pull one row then you can use:
>
> >> y.pull('a', 0)
>
> label_0
>     c
>     d
> x
> array([ 1.,  2.])
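For readers without larry installed, the idea of pulling by label can be sketched on a plain NumPy array with a list of labels per axis. `pull_by_label` is a hypothetical helper written for this illustration; it is not part of larry and may differ from how `pull` works internally.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])
labels = [['a', 'b'], ['c', 'd']]          # one label list per axis

def pull_by_label(x, labels, name, axis):
    """Hypothetical helper: take the slice labelled `name` along `axis`.

    Returns the selected values and the label lists of the remaining axes.
    """
    position = labels[axis].index(name)    # ValueError if the label is missing
    values = np.take(x, position, axis=axis)
    remaining = [lab for i, lab in enumerate(labels) if i != axis]
    return values, remaining

row_a, row_a_labels = pull_by_label(x, labels, 'a', 0)
col_d, col_d_labels = pull_by_label(x, labels, 'd', 1)
print(row_a, row_a_labels)
print(col_d, col_d_labels)
```

Pulling `'a'` along axis 0 leaves the axis-1 labels `c` and `d`, which matches the shape of the output quoted above.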
>
> I have experimental support (not in trunk) for indexing with label names. So
>
> y.index[['a']]
>
> or
>
> y.index[['a'],:]
Looks nice. Access by label/index name, as in pandas, can be very
convenient, at least for variable names. I'm not much used to
accessing time periods by date.
Josef