
launchpad-dev team mailing list archive

Re: APIs and len() of collections

 

On 26 July 2010 19:59, Robert Collins <robert.collins@xxxxxxxxxxxxx> wrote:
> On Mon, Jul 26, 2010 at 11:53 AM, Jonathan Lange <jml@xxxxxxxxxxxxx> wrote:
>> On Sat, Jul 24, 2010 at 10:08 AM, Robert Collins
>> <robert.collins@xxxxxxxxxxxxx> wrote:
>> ...
>>> I need input here - where do people use len(), why do they use len(),
>>> what would the impact of nuking it be? We need this input to build
>>> better interfaces - ones that scale and perform well.
>>>
>>
>> As a webservice user, I use len() on collections to get an accurate
>> count of the number of things I care about, so I can plot them.
>>
>> In fact, just the other day someone asked me to make a public burndown
>> chart of the number of oops & timeout bugs in Launchpad. :P
>
> I think we can do something that will preserve your ability to do this
> broadly but not impose as heavily on us.
>
> A few initial thoughts:
>  - you're graphing absolute figures, you could get a delta and sum an
> arbitrary start point. E.g. if we said when you start the exercise
> 'hundreds of oops bugs' and then called you back with each bug that
> becomes an oops bug (or stops being one), you could track that
> hundreds down and requery the aggregate estimate whenever you like.
>  - or we could provide a 'delta aggregate' which would be pretty
> accurate (e.g. +4 today) even with the same precision limits the
> overall aggregates have
>  - we could issue an API key for users we're willing to serve exact figures to.
>  - we could provide a dedicated get-exact-figures interface, so that
> people really have to choose it - and further to that we can document
> our preferred interfaces.

If you accept my theory that API clients want three kinds of semantic
operation ("count them", "give me all", "give me a few"), then it seems
reasonable to have a specific, clear network operation that means
"count them."  It shouldn't be called accidentally as a side effect of,
e.g., getting or iterating over a list of objects, but it shouldn't
require you to jump through hoops either: it should run only when the
client specifically wants to know the number.  We can then look at how
many people use this interface, how much work it takes us to serve,
and how often it times out.  If the cost is too high, we can either
throttle it or provide a cheaper or approximate interface.
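
As a rough sketch of that split (in Python, with made-up names:
FakeServer, Collection, count() and first() are all hypothetical and
are not the real Launchpad web service or launchpadlib API), a
client-side collection could expose counting only as an explicit call,
so that iterating never triggers it:

```python
from typing import Iterator, List


class FakeServer:
    """Hypothetical stand-in for the web service.

    It tallies which operations clients request, so the cost of
    "count them" can be measured separately from fetching.
    """

    def __init__(self, items: List[str]) -> None:
        self._items = items
        self.calls = {"count": 0, "page": 0}

    def count(self) -> int:
        # The potentially expensive operation: only run when asked for.
        self.calls["count"] += 1
        return len(self._items)

    def page(self, start: int, size: int) -> List[str]:
        # Fetch one batch of objects; no total is computed here.
        self.calls["page"] += 1
        return self._items[start:start + size]


class Collection:
    """Client-side collection with three explicit operations.

    There is deliberately no __len__, so nothing can count implicitly.
    """

    def __init__(self, server: FakeServer, page_size: int = 2) -> None:
        self._server = server
        self._page_size = page_size

    def count(self) -> int:
        """'Count them': one explicit network operation."""
        return self._server.count()

    def first(self, n: int) -> List[str]:
        """'Give me a few': fetch only the first n objects."""
        return self._server.page(0, n)

    def __iter__(self) -> Iterator[str]:
        """'Give me all': page through until an empty batch comes back."""
        start = 0
        while True:
            batch = self._server.page(start, self._page_size)
            if not batch:
                return
            yield from batch
            start += len(batch)


server = FakeServer(["bug-1", "bug-2", "bug-3", "bug-4", "bug-5"])
bugs = Collection(server)
everything = list(bugs)        # pages through; never asks for a count
a_few = bugs.first(2)          # one small fetch
total = bugs.count()           # the only call that counts
```

Leaving out __len__ is deliberate here: in CPython, list(x) asks for
len(x) as a size hint when that method exists, which is just the kind
of accidental count described above.  The per-operation tally on the
server side is what would let us see how often "count them" is really
requested before deciding whether to throttle it.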

It seems to me that most of the counting at the moment is done for
clients that don't actually care about the count.  We can avoid that
first, before thinking about blocking clients that do want it or
giving them a more complicated API.

-- 
Martin


