larry-discuss team mailing list archive

Bootstrap and cross validation iterators

 

I added a resample module to the la package (la.util.resample). It
currently contains bootstrap and k-fold cross validation iterators.
The index iterators are not specific to larrys; they return lists of
indices, so they can be used almost anywhere. Lots of unit tests are
in la/util/tests/resample_test.py.

You can optionally set the state of your random number generator
outside of the index iterators and pass its shuffle function to cv
and its randint function to boot.

K-fold cross validation indices for 5 elements and 2 folds:

    >>> from la.util.resample import cv
    >>> for train, test in cv(5,2):
    ...     print
    ...     print 'train: ', train
    ...     print 'test:  ', test
    ...

    train:  [4, 3, 1]
    test:   [0, 2]

    train:  [0, 2]
    test:   [4, 3, 1]
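
For anyone curious how passing in a shuffle function can make the folds reproducible, here is a rough standalone sketch. It is not the la.util.resample.cv source: kfold_indices is a hypothetical stand-in, it uses the standard library random module rather than NumPy, and the way it deals indices into folds is my own assumption.

```python
import random


def kfold_indices(n, k, shuffle=None):
    """Yield (train, test) index lists for k-fold cross validation.

    Hypothetical stand-in for illustration; not the la implementation.
    """
    if shuffle is None:
        shuffle = random.shuffle  # assumption: fall back to the global RNG
    idx = list(range(n))
    shuffle(idx)  # shuffles the list in place
    for fold in range(k):
        test = idx[fold::k]  # every k-th shuffled index lands in this fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test


# Seed a private generator and hand its shuffle method to the iterator.
rng = random.Random(0)
for train, test in kfold_indices(5, 2, shuffle=rng.shuffle):
    print("train: %s" % train)
    print("test:  %s" % test)
```

Because the generator is seeded outside of the iterator, rerunning with a fresh random.Random(0) gives the same folds, and the global random state is left alone.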

Three bootstrap samples taken with replacement from four elements:

    >>> from la.util.resample import boot
    >>> for train, test in boot(4, 3):
    ...     print
    ...     print 'train: ', train
    ...     print 'test:  ', test
    ...

    train:  [2 1 3 1]
    test:   [0]

    train:  [1 1 2 1]
    test:   [0, 3]

    train:  [1 3 0 0]
    test:   [2]
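
In the output above, test holds the indices that were never drawn into train (the out-of-bag elements). A rough standalone sketch of that idea follows. Again, this is not the la.util.resample.boot source: boot_indices is a hypothetical name, and it assumes the standard library random.randint, whose upper bound is inclusive (NumPy's randint excludes the upper bound).

```python
import random


def boot_indices(n, nboot, randint=None):
    """Yield (train, test) index lists for nboot bootstrap samples.

    Hypothetical stand-in for illustration; not the la implementation.
    """
    if randint is None:
        randint = random.randint  # assumption: fall back to the global RNG
    everything = set(range(n))
    for _ in range(nboot):
        # Draw n indices with replacement; stdlib randint includes both ends.
        train = [randint(0, n - 1) for _ in range(n)]
        # The test set is whatever was never drawn.
        test = sorted(everything - set(train))
        yield train, test


# Seed a private generator and hand its randint method to the iterator.
rng = random.Random(0)
for train, test in boot_indices(4, 3, randint=rng.randint):
    print("train: %s" % train)
    print("test:  %s" % test)
```

Note that test can come back empty if a sample happens to draw every index at least once.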

http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head:/la/util/resample.py
http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head:/la/util/tests/resample_test.py


