launchpad-dev team mailing list archive

Re: brainstorm: cheaper API collection iteration

 

> My understanding, may be way off.
>
> Pages in collections are just links, so you can generate what you
> like. You can generate the bare link to the collection to get the first
> page, and then when that page is returned it would say
>
>    next_page_link:
>    "https://api.launchpad.net/devel/some_collection?start_key=endkey_of_this_collection"
>
> then when the next page is needed launchpadlib simply hits that URL.
>
> Therefore I think this would be fairly straightforward to change in the
> current webservice. It also looks as though just changing
> lazr.batchnavigator will get most of the way there, so it may be that
> fixing the web UI fixes the API too.

In general, this is right. lazr.restfulclient follows the URL it finds
in next_collection_link without caring what that URL is. launchpadlib
doesn't mess with the collection links at all, and lazr.restfulclient
never uses previous_collection_link. So if you come up with a better way
to paginate records, you can just change next_collection_link and the
client will adapt.
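
To make the "client will adapt" point concrete, here is a minimal sketch of a loop that treats the next link as opaque. It is not the lazr.restfulclient code: iter_entries, fetch_page and the api.example URLs are made up for illustration, and the page is assumed to be a decoded JSON dict with "entries" and "next_collection_link" keys.

```python
def iter_entries(first_page_url, fetch_page):
    """Yield every entry in a collection, one page at a time.

    fetch_page is a hypothetical callable mapping a URL to a decoded
    JSON page (a dict); it stands in for the real HTTP request.
    """
    url = first_page_url
    while url is not None:
        page = fetch_page(url)
        for entry in page.get("entries", []):
            yield entry
        # The link is never parsed. Whatever the server puts in it
        # (ws.start offsets, a start_key, anything else) is followed as-is.
        url = page.get("next_collection_link")


# Stand-in for the web service: two canned pages keyed by URL.
FAKE_PAGES = {
    "https://api.example/devel/bugs": {
        "entries": [1, 2],
        "next_collection_link": "https://api.example/devel/bugs?start_key=2",
    },
    "https://api.example/devel/bugs?start_key=2": {"entries": [3]},
}

all_entries = list(
    iter_entries("https://api.example/devel/bugs", FAKE_PAGES.__getitem__)
)
```

Because the loop only reads the link and never builds one, switching the server from offset-based to key-based links would not require any change here.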

I know of two exceptions, both optimizations in the code that handles
slices (_get_slice() in lazr.restfulclient).

1. If you ask for a slice like launchpad.bugs[:76], lazr.restfulclient
gets the first page (which happens to have 75 entries), and then this
code runs:

            if more_needed > 0 and more_needed < first_page_size:
                # An optimization: it's likely that we need less than
                # a full page of entries, because the number we need
                # is less than the size of the first page we got.
                # Instead of requesting a full-sized page, we'll
                # request only the number of entries we think we'll
                # need. If we're wrong, there's no problem; we'll just
                # keep looping.
                page_url = self._with_url_query_variable_set(
                    page_url, 'ws.size', more_needed)

So, we take next_collection_link, and we hack ws.size to only give us
(in this case) one additional entry. We don't need another 75 entries,
we only need one.

I think this will continue to work no matter what next_collection_link
looks like, so long as ws.size continues to work. Worst case, we can
simply remove the optimization.
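
As a rough illustration of why this keeps working, here is a sketch of a helper that overrides one query variable and leaves the rest of the link alone. It is a stand-in written with urllib.parse, not the actual _with_url_query_variable_set() implementation, and the start_key parameter in the example URL is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_variable_set(url, name, value):
    """Return url with the query variable `name` set to `value`.

    Every other query variable is preserved untouched, so the helper
    does not need to know how the server encodes the page position.
    """
    parts = urlsplit(url)
    params = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    params.append((name, str(value)))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical key-based next link; only ws.size is rewritten.
next_link = "https://api.launchpad.net/devel/bugs?start_key=abc&ws.size=75"
trimmed = with_query_variable_set(next_link, "ws.size", 1)
```

The only thing the client relies on here is that the server honours ws.size on whatever link it handed out.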

2. When you ask for a slice like launchpad.bugs[70:200], this code runs:

            # No part of this collection has been loaded yet, or the
            # slice starts beyond the part that has been loaded. We'll
            # use our secret knowledge of lazr.restful to set a value for
            # the ws.start variable. That way we start reading entries
            # from the first one we want.
            first_page_size = None
            entry_dicts = []
            page_url = self._with_url_query_variable_set(
                self._wadl_resource.url, 'ws.start', start)

This "secret knowledge of lazr.restful" would be invalidated by the
change. We could stop supporting this syntax in later versions of
launchpadlib, or we could just remove the optimization: make
lazr.restfulclient load pages from the beginning until it has 200
entries, and then perform the slice.
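
The "remove the optimization" option could look something like the sketch below. It is only an illustration under assumptions: slice_from_beginning and fake_pages are hypothetical names, and the iterable of pages stands in for following next_collection_link from the first page.

```python
from itertools import islice


def slice_from_beginning(pages, start, stop):
    """Return entries[start:stop], reading pages in order from the first.

    pages is any iterable of lists of entries. Pages are consumed lazily,
    and reading stops as soon as `stop` entries have been seen.
    """
    entries = (entry for page in pages for entry in page)
    return list(islice(entries, start, stop))


# Fake collection of 1000 entries served 75 at a time. `fetched` records
# the offset of each page that is actually requested.
fetched = []


def fake_pages(total, page_size):
    for first in range(0, total, page_size):
        fetched.append(first)
        yield list(range(first, min(first + page_size, total)))


result = slice_from_beginning(fake_pages(1000, 75), 70, 200)
```

In this fake run only the first three pages are requested, so the cost of dropping ws.start is the entries loaded before the slice begins, which grows with the start index.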

I did a quick search of Launchpad for references to ws.start and
ws.size. They show up a lot in tests, but that's about it. In particular,
they don't seem to be used in the JavaScript code (though client.js has
support for them). I did see some batch navigation code in picker.js. It
looks like Launchpad code, not web service code, but it may have to be
changed for the same reasons the web service needs to be changed.

Leonard


