launchpad-dev team mailing list archive
Message #06758
Re: brainstorm: cheaper API collection iteration
My understanding, may be way off.
Pages in collections are just links, so you can generate what you
like. You can generate the bare link to the collection to get the first
page, and then when that page is returned it would say
next_page_link:
"https://api.launchpad.net/devel/some_collection?start_key=endkey_of_this_collection"
then when the next page is needed launchpadlib simply hits that URL.
Therefore I think this would be fairly straightforward to change in the
current webservice. It also looks as though just changing
lazr.batchnavigator will get most of the way there, so it may be that
fixing the web UI fixes the API too.
In general, this is right. lazr.restfulclient follows the URL it finds
in next_collection_link without caring what that URL is. launchpadlib
doesn't mess with the collection links at all, and lazr.restfulclient
never uses previous_collection_link. So if you come up with a better way
to paginate records, you can just change next_collection_link and the
client will adapt.
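To make the "client will adapt" point concrete, here is a minimal sketch of a client loop that treats next_collection_link as an opaque URL. This is not the actual lazr.restfulclient code: iterate_collection, fake_fetch and FAKE_PAGES are hypothetical names, the start_key parameter is only the one proposed earlier in this thread, and I am assuming each page is a decoded JSON dict with an "entries" list.

```python
def iterate_collection(first_url, fetch):
    """Yield every entry, following next_collection_link blindly.

    `fetch` is a hypothetical callable that takes a URL and returns the
    decoded JSON page as a dict (an assumption for this sketch).
    """
    url = first_url
    while url is not None:
        page = fetch(url)
        for entry in page.get("entries", []):
            yield entry
        # The link is never parsed or rebuilt, so the server is free to
        # switch from offsets to keys without the client noticing.
        url = page.get("next_collection_link")


# Hypothetical canned responses standing in for the web service.
BASE = "https://api.launchpad.net/devel/some_collection"
FAKE_PAGES = {
    BASE: {
        "entries": ["a", "b"],
        "next_collection_link": BASE + "?start_key=b",
    },
    BASE + "?start_key=b": {"entries": ["c"]},
}


def fake_fetch(url):
    return FAKE_PAGES[url]


all_entries = list(iterate_collection(BASE, fake_fetch))
```

Because the loop only ever reads the link back from the previous page, nothing in it depends on whether the query string carries ws.start or a key.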
I know of two exceptions, both optimizations in the code that handles
slices (_get_slice() in lazr.restfulclient).
1. If you ask for a slice like launchpad.bugs[:76], lazr.restfulclient gets
the first page (which happens to have 75 entries), and then this code runs:
    if more_needed > 0 and more_needed < first_page_size:
        # An optimization: it's likely that we need less than
        # a full page of entries, because the number we need
        # is less than the size of the first page we got.
        # Instead of requesting a full-sized page, we'll
        # request only the number of entries we think we'll
        # need. If we're wrong, there's no problem; we'll just
        # keep looping.
        page_url = self._with_url_query_variable_set(
            page_url, 'ws.size', more_needed)
So, we take next_collection_link, and we hack ws.size to only give us
(in this case) one additional entry. We don't need another 75 entries,
we only need one.
I think this will continue to work no matter what next_collection_link
looks like, so long as ws.size continues to work. Worst case, we can
simply remove the optimization.
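As a rough illustration of why this keeps working, here is a sketch of the ws.size rewrite applied to a key-based link. The helper below is my own stand-in for _with_url_query_variable_set, not the real implementation, and next_page_url and the start_key parameter are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_variable_set(url, name, value):
    """Return `url` with query variable `name` set to `value`.

    Any existing value for `name` is dropped; other variables are kept
    in their original order.
    """
    parts = urlsplit(url)
    query = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    query.append((name, str(value)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def next_page_url(next_link, more_needed, first_page_size):
    """Shrink the next request when less than a full page is needed."""
    if 0 < more_needed < first_page_size:
        return with_query_variable_set(next_link, "ws.size", more_needed)
    return next_link


# Hypothetical key-based link, as proposed earlier in the thread.
link = "https://api.launchpad.net/devel/bugs?start_key=abc&ws.size=75"
small = next_page_url(link, more_needed=1, first_page_size=75)
```

The rewrite only touches ws.size and leaves whatever else is in the link alone, which is why the shape of next_collection_link should not matter here.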
2. When you ask for a slice like 'launchpad.bugs[70:200]', this code runs:
    # No part of this collection has been loaded yet, or the
    # slice starts beyond the part that has been loaded. We'll
    # use our secret knowledge of lazr.restful to set a value for
    # the ws.start variable. That way we start reading entries
    # from the first one we want.
    first_page_size = None
    entry_dicts = []
    page_url = self._with_url_query_variable_set(
        self._wadl_resource.url, 'ws.start', start)
This "secret knowledge of lazr.restful" would be invalidated by the
change. We could stop supporting this syntax in later versions of
launchpadlib, or we could just remove the optimization: make
lazr.restfulclient load subsequent pages from the beginning until it has
200, and then perform the slice.
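The second option (drop the optimization, keep the syntax) could look something like the sketch below. Again this is an assumption-laden illustration rather than the real _get_slice(): slice_from_beginning, make_fake_fetch and the page layout are hypothetical, and `fetch` stands for anything that returns a decoded page dict.

```python
def slice_from_beginning(first_url, fetch, start, stop):
    """Emulate collection[start:stop] without using ws.start.

    Pages are loaded from the first one, following next_collection_link,
    until `stop` entries are in hand; the slice is then taken locally.
    """
    entries = []
    url = first_url
    while url is not None and len(entries) < stop:
        page = fetch(url)
        entries.extend(page.get("entries", []))
        url = page.get("next_collection_link")
    return entries[start:stop]


def make_fake_fetch(total, page_size, calls):
    """Build a hypothetical fetch() serving integers in fixed-size pages."""
    def fetch(url):
        calls.append(url)
        first = int(url.rsplit("=", 1)[1]) if "=" in url else 0
        page = {"entries": list(range(first, min(first + page_size, total)))}
        if first + page_size < total:
            page["next_collection_link"] = "fake:bugs?key=%d" % (first + page_size)
        return page
    return fetch


calls = []
fetch = make_fake_fetch(total=300, page_size=75, calls=calls)
result = slice_from_beginning("fake:bugs", fetch, 70, 200)
```

The cost is visible in `calls`: three pages are requested to serve [70:200], and the first 70 entries are fetched only to be thrown away. That is the price of losing ws.start.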
I did a quick search of Launchpad for references to ws.start and
ws.size. It shows up a lot in tests, but that's about it. In particular,
it doesn't seem to be used in the JavaScript code (though client.js has
support for it). I did see some batch navigation code in picker.js. It
looks like Launchpad code, not web service code, but it may have to be
changed for the same reasons as the web service needs to be changed.
Leonard