launchpad-dev team mailing list archive

APIs and len() of collections

 

I know that there is some work underway at the moment to defer the
point where we call len() on collections. I'd like feedback on an even
more ambitious proposal:

 - not calling len() ever

I need input here - where do people use len(), why do they use len(),
what would the impact of nuking it be? We need this input to build
better interfaces - ones that scale and perform well.

Some inputs that led me to propose this goal:
 - len() is a precise interface

 - highly precise counting is extremely expensive.

 - the results of such counting are also stale almost immediately:
the APIs query in a separate transaction each time

 - it's not useful for users [200000 open bugs vs 200001 is a
near-valueless distinction]

I think that in an ideal world we'd just remove the facility in devel;
I'm sure there are places where something-like-it will be /needed/,
but I propose that we should make that be a rare exception, not the
common case.

A related issue is pagination in APIs, which really doesn't make
sense; I'll pull on that separately if possible, though.

So far, I've thought of two replacement interfaces:
 - estimate_size(collection) => {0..99, hundreds, thousands, millions...}
   This would be used for providing UI feedback on collections

 - closed_since|changed_since parameters on various searches, so that
the trend lines people currently build with len() can still be
generated: we can precisely identify recent work without precisely
counting the total unfiltered collection.
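
To make the two ideas above concrete, here is a rough sketch in Python.
It is illustrative only, not existing Launchpad code: the Bug tuple, the
date_last_updated field and the approx_count argument (a cheap estimate,
for example from database statistics) are all assumptions made for the
example.

```python
from collections import namedtuple
from datetime import datetime
from itertools import islice

EXACT_LIMIT = 100  # below this, report the exact number (the 0..99 bucket)


def estimate_size(collection, approx_count):
    """Return an exact count below 100, otherwise a coarse bucket name.

    approx_count is a hypothetical cheap estimate; it is only consulted
    once the collection is known to hold at least 100 items.
    """
    # Look at no more than 100 items, much like a LIMIT 100 query.
    seen = sum(1 for _ in islice(collection, EXACT_LIMIT))
    if seen < EXACT_LIMIT:
        return str(seen)
    if approx_count < 1000:
        return "hundreds"
    if approx_count < 1000000:
        return "thousands"
    return "millions"


# Hypothetical stand-in for a bug record.
Bug = namedtuple("Bug", "id date_last_updated")


def changed_since(bugs, since):
    """Return only the bugs changed at or after `since`."""
    return [bug for bug in bugs if bug.date_last_updated >= since]


bugs = [
    Bug(1, datetime(2010, 9, 1)),
    Bug(2, datetime(2010, 10, 5)),
    Bug(3, datetime(2010, 10, 6)),
]
recent = changed_since(bugs, datetime(2010, 10, 1))
print(estimate_size(bugs, approx_count=3))  # prints 3
print([bug.id for bug in recent])           # prints [2, 3]
```

The point of the sketch is that the work done stays bounded: the size
estimate never walks more than 100 items, and a trend line only needs
the (small) set of recently changed items rather than the whole
collection.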

What do you think?

-Rob


