openstack team mailing list archive

Re: [keystone] v3 API draft (update and questions to the community)

To: openstack@xxxxxxxxxxxxxxxxxxx
From: Adam Young <ayoung@xxxxxxxxxx>
Date: Tue, 12 Jun 2012 12:21:58 -0400
In-reply-to: <BD35DBE010D5B64589500D43F775F236158933DC@BY2PRD0510MB376.namprd05.prod.outlook.com>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

On 06/12/2012 04:24 AM, Gabriel Hurley wrote:

Mark,

Apparently you must have missed my lightning talk at the Essex summit... ;-) (http://gabrielhurley.github.com/slides/openstack/apis_like_orms/index.html)

Filtering, pagination, and many other API features are *critical* for a rich dashboard experience. If you want to talk specifics, the entire Horizon team would be happy to have a long chat with you.

Yes and no. The reality is that it is a trade-off. Server side, you pay by doing a network round trip and by having to determine ahead of time the mechanisms for sorting, caching, paging, etc.

The real problem is that the parsing of the results can blow out the stack space in the browser.


That said, we have also considered the case you propose where you effectively "request everything and handle it on the client-side"... however, I see that as a tremendously lazy solution. On the service-provider end you have access to powerful database methods that can do these operations in fractions of the time the client-side can (especially with good indexes, etc.). And if you've ever worked in mobile applications you'll know that minimizing data across the wire is crucial. The only argument I've heard in favor of that is basically "it's easier for us not to add API features".

At the expense of loading your database. Server-side paging and filtering both require one of two things: caching or additional database queries, and both increase your server footprint. For small datasets, or for limited queries, this is not a problem, but for scalability you want to limit the work you do on the server.

For Keystone using the LDAP backend, caching and pagination are extremely expensive, and not something I would like to support. An LDAP query is not guaranteed to come back in any particular order, so you can't just do the SQL trick of executing the query at offset + window size. You have to do the equivalent of a cursor, and this places serious load on the LDAP server and the Keystone server, and possibly impacts other apps dependent on LDAP.
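
A rough sketch of the difference, in Python. The table, the names in it, and the fake_directory_search helper are all made up for illustration; the second function only stands in for an unordered directory search and is not how Keystone or any LDAP library actually pages. The point is just that an ordered SQL query can skip straight to the window, while an unordered source has to be pulled and sorted in full for every page unless something holds cursor state.

```python
import sqlite3


def sql_page(conn, offset, window):
    """Fetch one page; the database does the ordering and the skipping."""
    rows = conn.execute(
        "SELECT name FROM users ORDER BY name LIMIT ? OFFSET ?",
        (window, offset),
    )
    return [name for (name,) in rows]


def unordered_page(fetch_all, offset, window):
    """Page over a source with no guaranteed order.

    Every page request pulls the full result set and sorts it before
    slicing, which is the cost being described above.
    """
    entries = sorted(fetch_all())
    return entries[offset:offset + window]


# Hypothetical data set, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [(n,) for n in ["dave", "alice", "carol", "bob", "erin"]],
)


def fake_directory_search():
    # Hypothetical stand-in for a directory search: arbitrary order.
    return ["erin", "bob", "alice", "dave", "carol"]


sql_result = sql_page(conn, offset=2, window=2)
directory_result = unordered_page(fake_directory_search, offset=2, window=2)
```

Both calls return the same page here, but the second one touched every entry to get it.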


To speak on the specific feature of pagination, the problem of 'corruption' by simultaneous writers is no excuse for not implementing it. You think Google, Facebook, Flickr, etc. etc. etc. don't have this problem? If you consume their feeds you'll notice you can fetch offset-based pagination with ease. You'd never expect to see a navigation element at the bottom of Google search results that said "take me to results starting with the letter m".

There is a major difference. We are working with data that has to be ACID. Google, Facebook and Flickr do not. Before you migrate a VM, you need to know if the host meets the criteria for the VM. If it changes between when you check and when you reserve the space for the VM, you have just overcommitted. "Get it right eventually" does not work for management apps.
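
To make the check-then-reserve problem concrete, here is a minimal Python sketch. The Host class, its method names, and the memory figures are invented for illustration and do not correspond to any OpenStack code. The first half replays two requests that both act on a stale check; the second half does the check and the reservation as one step under a lock.

```python
import threading


class Host:
    """Hypothetical host with a fixed amount of free memory, in MB."""

    def __init__(self, free_mb):
        self.free_mb = free_mb
        self._lock = threading.Lock()

    def fits(self, vm_mb):
        # A plain read: the answer can be stale by the time it is used.
        return self.free_mb >= vm_mb

    def reserve_unchecked(self, vm_mb):
        self.free_mb -= vm_mb

    def check_and_reserve(self, vm_mb):
        # Check and reservation happen together, so no other caller
        # can change free_mb in between.
        with self._lock:
            if self.free_mb < vm_mb:
                return False
            self.free_mb -= vm_mb
            return True


# Two requests check first and reserve later: both checks pass.
stale_host = Host(free_mb=4096)
a_ok = stale_host.fits(3072)
b_ok = stale_host.fits(3072)
if a_ok:
    stale_host.reserve_unchecked(3072)
if b_ok:
    stale_host.reserve_unchecked(3072)
overcommitted_mb = -stale_host.free_mb

# The same two requests with an atomic check-and-reserve.
safe_host = Host(free_mb=4096)
results = [safe_host.check_and_reserve(3072), safe_host.check_and_reserve(3072)]
```

In the first case the host ends up 2048 MB overcommitted; in the second, the later request is refused.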


None of this is a case of "someone might use it". The Horizon team has been loudly asking for these features for 8+ months now. And not just from Keystone but from all the projects. I have a list a mile long of API features we need to really deliver a compelling experience. I was just adding some items to it today, in fact.

The rest of your points I have no strong feelings on and generally agree, but when it comes to API features... I feel *very* strongly.

Note that I am not saying "don't do pagination"; I agree that it is essential for a good user experience. What I am stating is that we need to be smart about the techniques and technologies we choose, as there is always an upside and a downside.


All the best,

     - Gabriel

-----Original Message-----
From: openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx
[mailto:openstack-
bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of
Mark Nottingham
Sent: Monday, June 11, 2012 10:27 PM
To: Joseph Heck
Cc: openstack@xxxxxxxxxxxxxxxxxxx (openstack@xxxxxxxxxxxxxxxxxxx)
Subject: Re: [Openstack] [keystone] v3 API draft (update and questions to
the community)

On 11/06/2012, at 6:58 AM, Joseph Heck wrote:

First - what's the current thought of support for PATCH vs PUT in updating
REST resources? Are there any issues with clients being able to use a PATCH
verb? It's not something I'm super familiar with, so I'm looking for feedback
from the community here. Ideally, I'd like to support the semantics of the
PATCH HTTP verb, and possibly just assert no support for the PUT verb to be
clear about intended functionality. Is that going to throw anyone for a loop?

I answered a question about PATCH before; don't want to repeat myself, but
it should be workable. Happy to chat more about it if you have specific
questions.

Second - filtering/searching for resources. The draft includes a section
labelled "Query By Name", which is probably mis-labelled, as it's intended to
cover the general idea of passing in query parameters to general listing
resource endpoints to filter the result set. The API endpoints across all the
resources are defined as plurals, with the idea that specificity comes later in
the URI (for referencing a single resource), or that we could add on these
query parameters to restrict/filter by resource type.

I'm in the middle of doing some log analysis and other research about how
the APIs are used at Rackspace. It's too early to share results (although I do
intend to, in some form, because the idea is to inform future API design), but
one of the things that's very noticeable is how (extremely!) little pagination
and filtering seem to be used in anger.

In fact, if you take a look at the libraries, you'll find that they often don't use
or even support filtering or pagination; e.g., libcloud doesn't, AFAICT.

So, it's worth having a think about what the use cases actually are; both
filtering and pagination are usually ways to save one or more of:
   a) client-side work
   b) server-side work
   c) bandwidth / latency

One interesting exercise would be to estimate the largest number of users
(or whatever else you'd be listing) that a reasonable deployment would put
in a single response, triple it, do a dummy serialisation in JSON, and then gzip
it, so that you can estimate the size, see how long it takes to parse on the
client, etc.
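
The exercise can be sketched in a few lines of Python. The record fields (id, name, email, enabled) and the function name are assumptions for illustration, not the actual Keystone user schema, and timing a parse in Python only gives a rough feel; it says nothing precise about a browser or a mobile client.

```python
import gzip
import json
import time


def estimate_listing_cost(expected_users, factor=3):
    """Build a dummy user listing, serialise it, gzip it, and time a parse."""
    count = expected_users * factor
    users = [
        {
            "id": "%032x" % i,
            "name": "user-%d" % i,
            "email": "user-%d@example.com" % i,
            "enabled": True,
        }
        for i in range(count)
    ]
    body = json.dumps({"users": users}).encode("utf-8")
    compressed = gzip.compress(body)

    # Time what a client would do: decompress, then parse the JSON.
    start = time.perf_counter()
    parsed = json.loads(gzip.decompress(compressed).decode("utf-8"))
    elapsed = time.perf_counter() - start

    return {
        "count": len(parsed["users"]),
        "raw_bytes": len(body),
        "gzip_bytes": len(compressed),
        "parse_seconds": elapsed,
    }


estimate = estimate_listing_cost(expected_users=1000)
```

Calling it with the largest listing you expect from a reasonable deployment gives the raw size, the size on the wire, and a parse time to compare against the cost of building pagination.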

From what I've seen (in OpenStack as well as in other APIs that have
nothing to do with Cloud), API designers tend to overestimate the utility of
pagination and especially filtering ("somebody might use it"), but users just
ignore them, doing all of the work on the client side, except in extreme
circumstances (e.g., VERY large responses / very high latency).

Unless you have strong use cases for them, I'd be inclined to drop them; they
increase implementation, QA, and documentation complexity, as well as
making the API harder to understand. YMMV, of course :)

The other issue with pagination is that a relative paged approach (like you're
taking) means that readers' views of the complete set of items can be
corrupted by simultaneous writers. While in some instances this is just an
annoying UI bug (missing or duplicated entries on different pages, lower
cache hit rates), in some circumstances it can be more serious (clients not
understanding the true state of the system, and making bad decisions as a
result).
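
A small Python illustration of that failure mode, using an in-memory list as a stand-in for the server-side collection (the names and the get_page helper are hypothetical). A reader fetches page one, a writer deletes an entry, and the reader's next offset now skips an item it never saw.

```python
def get_page(items, offset, limit):
    """Relative (offset-based) paging over the current, sorted state."""
    return sorted(items)[offset:offset + limit]


items = ["alice", "bob", "carol", "dave", "erin", "frank"]

# Reader fetches the first page.
page1 = get_page(items, offset=0, limit=3)

# A concurrent writer deletes an entry the reader has already seen.
items.remove("alice")

# Reader fetches the second page at the offset it computed earlier.
page2 = get_page(items, offset=3, limit=3)

seen = page1 + page2
missed = sorted(set(items) - set(seen))
```

Here "dave" shifted from position 3 to position 2 after the delete, so it falls between the two pages and the reader never sees it. An insert ahead of the offset causes the opposite problem, a duplicated entry.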

Again, a source of bugs and complexity (we came up with one approach to
this with archived feeds in RFC5005, but it's pretty heavyweight, especially
for use cases like this).

Hope this helps,

P.S. the X-Subject-Token stuff is breaking HTTP; you need to either put the
token (or a facsimile for it) in the URL, or put Vary: Subject-Token in EVERY
response those resources generate. The former is preferred; this is over TLS,
right? Sorry I didn't see that earlier.

P.P.S If it's not too late, drop the X- from that header!
<http://tools.ietf.org/html/draft-ietf-appsawg-xdash-05>


--
Mark Nottingham   http://www.mnot.net/




_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp

Follow ups

Re: [keystone] v3 API draft (update and questions to the community)
From: Jay Pipes, 2012-06-12

References

[keystone] v3 API draft (update and questions to the community)
From: Joseph Heck, 2012-06-10
Re: [keystone] v3 API draft (update and questions to the community)
From: Mark Nottingham, 2012-06-12
Re: [keystone] v3 API draft (update and questions to the community)
From: Gabriel Hurley, 2012-06-12