larry-discuss team mailing list archive
Message #00150
Re: Bootstrap and cross validation iterators
On Sat, May 22, 2010 at 8:21 PM, Keith Goodman <kwgoodman@xxxxxxxxx> wrote:
> I added a resample module to the la package (la.util.resample). It
> currently contains bootstrap and k-fold cross validation iterators.
> The index iterators are not specific to larrys; they return lists of
> indices. Just thought I'd mention it since they can be used most
> anywhere. Lots of unit tests are in la.util.tests.resample_test.py.
>
> You can optionally set the state of your random number generator
> outside of the index iterators and pass in shuffle to cv and randint
> to boot.
>
> K-fold cross validation indices for 5 elements and 2 folds:
>
> >>> from la.util.resample import cv
> >>> for train, test in cv(5,2):
> ...     print
> ...     print 'train: ', train
> ...     print 'test: ', test
> ...
>
> train: [4, 3, 1]
> test: [0, 2]
>
> train: [0, 2]
> test: [4, 3, 1]
>
> Three bootstrap samples taken with replacement from four elements:
>
> >>> from la.util.resample import boot
> >>> for train, test in boot(4, 3):
> ...     print
> ...     print 'train: ', train
> ...     print 'test: ', test
> ...
>
> train: [2 1 3 1]
> test: [0]
>
> train: [1 1 2 1]
> test: [0, 3]
>
> train: [1 3 0 0]
> test: [2]
>
> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head:/la/util/resample.py
> http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head:/la/util/tests/resample_test.py
Two design questions:
Why did you choose to use a fixed seed by default? (I'm not completely
sure how using RandomState directly works; I usually just use
random.seed.)
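For readers comparing the two approaches, here is a minimal sketch of how a private RandomState instance relates to the global generator seeded with np.random.seed. It uses only NumPy itself and does not call anything in la; the variable names are illustrative.

```python
import numpy as np

# Global generator: seeding it affects every later np.random.* call
# anywhere in the process.
np.random.seed(0)
global_draw = np.random.randint(0, 10, size=5)

# Private generator: carries its own state, separate from the global one.
rs = np.random.RandomState(0)
private_draw = rs.randint(0, 10, size=5)

# Same seed and same algorithm, so the two streams agree.
same_stream = np.array_equal(global_draw, private_draw)

# Drawing from the global generator does not advance a private one.
rs_a = np.random.RandomState(42)
rs_b = np.random.RandomState(42)
np.random.rand(1000)  # touches only the global state
independent = np.array_equal(rs_a.permutation(5), rs_b.permutation(5))

print(same_stream, independent)
```

Because rs.shuffle and rs.randint are ordinary bound methods, they can be handed to any function that expects a shuffle or randint callable, which appears to be the mechanism Keith describes for cv and boot.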
In some early leave-one-out loops, I also used integer indices to
select. The scikits.learn cross_val iterators use boolean index arrays.
Do you have any idea whether integer or boolean indices are faster?
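One way to answer the speed question empirically is a small timing sketch like the one below. The array size, the 20% hold-out fraction, and the repeat count are arbitrary assumptions, and the result will depend on the NumPy version and on how dense the selection is, so no timing is quoted here.

```python
import timeit
import numpy as np

n = 100000
x = np.arange(n, dtype=float)
rs = np.random.RandomState(0)

# Hold out a random 20% of the elements, expressed both ways.
int_idx = np.sort(rs.permutation(n)[: n // 5])
bool_idx = np.zeros(n, dtype=bool)
bool_idx[int_idx] = True

# Both forms select the same elements when the integer indices are sorted.
assert np.array_equal(x[int_idx], x[bool_idx])

# Time each form of fancy indexing over the same number of repeats.
t_int = timeit.timeit(lambda: x[int_idx], number=200)
t_bool = timeit.timeit(lambda: x[bool_idx], number=200)
print("integer: %.4f s  boolean: %.4f s" % (t_int, t_bool))
```

Apart from speed, integer indices can repeat an element, which a bootstrap sample drawn with replacement needs; a boolean mask can only mark each element as in or out.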
Does boot work if nboot=n (no test sample)?
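One reading of this question is what happens when a bootstrap draw happens to cover every element, leaving nothing out of the bag. The sketch below uses a hypothetical helper, boot_indices, written for illustration only; it is not the la.util.resample implementation, and its signature is an assumption.

```python
import numpy as np

def boot_indices(n, randint):
    """Hypothetical helper: one bootstrap draw plus its out-of-bag indices."""
    train = randint(0, n, n)  # n draws with replacement from 0..n-1
    drawn = set(train.tolist())
    test = [i for i in range(n) if i not in drawn]  # elements never drawn
    return train, test

# Ordinary case: a random draw; the test set holds whatever was missed.
rs = np.random.RandomState(0)
train, test = boot_indices(4, rs.randint)

# Edge case: a stand-in for randint that hits every element exactly once.
def every_element(low, high, size):
    return np.arange(low, high)[:size]

full_train, empty_test = boot_indices(4, every_element)
print(full_train, empty_test)
```

In the edge case the test list comes back empty, so any code that scores a model on the test indices would need to handle that, for example by skipping or redrawing the sample.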
I find the function names, especially cv (crossval_random_kfold?), a
bit too short and unspecific.
I think we will have more design questions once we start to use this
(or something similar) more systematically than in the few eclectic
bootstrap examples we have had until now.
Josef