launchpad-dev team mailing list archive

Re: micro services: HTTP authentication in the datacentre and default protocol.

 

On 3 June 2011 13:24, Robert Collins <robertc@xxxxxxxxxxxxxxxxx> wrote:
> Bah. Keyboard fail - mailus interruptus.
>
> On Fri, Jun 3, 2011 at 2:33 PM, Robert Collins
> <robertc@xxxxxxxxxxxxxxxxx> wrote:
>
> Up to this bit was fine.
>
>> That means we're now selecting amongst HTTP based protocols for a
>> default protocol to talk to microservices.
>
> Now starting over...
>
> I have some thoughts here (specific to the default protocol/stack):
>
> Must:
> * handle binary data (librarian, gpg signing requests)
> * support multiple client languages (e.g. cannot be bound even over
> the network to just zope clients, or even just python)
> * Support graceful resets (where we have a status url that controls
> whether HAProxy feeds traffic to the service)
>
> Should:
> * Allow easy backwards compatible evolution of apis. (Nb: I say should
> here because one can always publish a new API to do migrations. But
> there may be nicer ways).
> * Permit OOPS integration
>
> Must Not:
> * Require client side caching
> * Require client side run-time compilation (but a static thing during
> deploy is ok).
>
> Beyond these, I think the performance of the stack is important to us.
> While we're not shipping massive volumes of data around, the faster
> things are, all else considered, the better.
>
> Now, there's some debate around REST vs XMLRPC vs Google protobufs as
> options - but these things aren't all on the same spectrum.
>
> Specifically, REST is compatible with XML - so we actually have
> several dimensions to consider:
>  * data serialisation (xml/json/...)
>  * object access style (REST aka url-to-object/RPC aka
> namespace-defines-functions)
>  * parameter encoding for queries (url parameters/formdata-in-a-POST)
>  * mapping to HTTP verbs (GET/POST/PUT/PATCH)
>
> On some of these I have some opinions :) I seek plenty of discussion
> though! Also note that whatever we choose as a default may not affect
> what we do for things talking to the core service surrounding our big
> postgresql server - that has enough stacks already running on it; I
> worry about the idea of adding in *yet another* protocol mapping
> there.
>
> So generally speaking, I think json or bson is probably preferable to
> xml; I'm inclined towards RPC url mapping because working
> object-at-a-time is terribly terribly slow and we have /so many/
> multiple-object-access cases that doing url mapping to talk about
> things seems like a pessimism (but perhaps lazr.restful is just too
> fresh in my mind).
> For parameter encoding, it's really nice being able to bench test
> readonly calls by hand with crafted URLs. This doesn't help with
> mutating calls, or at least, not so much. And I think using GET for
> readonly requests ties nicely into ad hoc experimentation; for mutation
> I think that for most of our use cases POST - that is, making a method
> call, maps most cleanly into our domain.

(The name 'rest' almost needs to be blacklisted, because it is so
vague in general and so badly associated with lazr.restful in
Launchpad.)

I think sitting an RPC model on top of HTTP constrains you in ways
that are not useful: the http layer no longer distinguishes readonly
or idempotent operations; you may not so easily be able to poke things
by hand; it encourages clients to be written in terms of blocking
one-at-a-time calls; you don't get to have an opinion about what URL
scheme is useful.  XMLRPC even more so, because you no longer have the
choice to say a particular operation would be most efficiently handled
by sending a binary.

Mutating operations can be poked at with curl or wget if you use
standard form encoding and plain HTTP authentication (basic or
digest).
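
As a rough sketch of that point: a mutating call that uses standard form
encoding and HTTP basic authentication needs nothing beyond the Python
standard library. The URL, field names, and credentials below are made up
for illustration, and the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_mutating_request(url, fields, user, password):
    """Build (but do not send) a form-encoded POST with HTTP basic auth."""
    # Standard application/x-www-form-urlencoded body, as curl -d would send.
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Basic auth is just base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": "Basic " + token,
        },
    )


# Hypothetical service URL, fields, and credentials, for illustration only.
req = build_mutating_request(
    "http://bugs.internal.example/bugs/1/subscribe",
    {"person": "mbp", "level": "details"},
    user="svc",
    password="secret",
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

The same call by hand would be roughly
`curl -u svc:secret -d person=mbp -d level=details <url>`, which is what
makes this style easy to poke at.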

I think one 'should' is that the network performance model should be
obvious from looking at the code: lazr.restful tried to abstract this
in a bad way and made it obscure.  In other words it should be clear
when a round trip happens.
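
One way to picture a transparent performance model, purely as a sketch:
every client method that touches the network is named as a fetch, and what
comes back is plain data, so reading a field can never trigger a hidden
request. BugClient, fetch_bugs, and fake_transport are hypothetical names,
and the transport is a stand-in so the example runs offline.

```python
class BugClient:
    """Hypothetical client where each network call is explicit."""

    def __init__(self, transport):
        # transport: any callable taking a URL path and returning a list
        # of dicts. Each call to it represents exactly one round trip.
        self._transport = transport
        self.round_trips = 0

    def fetch_bugs(self, bug_ids):
        """One round trip, however many bugs are requested."""
        self.round_trips += 1
        path = "/bugs/" + ";".join(str(i) for i in bug_ids)
        return self._transport(path)


def fake_transport(path):
    # Stand-in for a real HTTP request; parses the ids back out of the path.
    ids = path.rsplit("/", 1)[1].split(";")
    return [{"id": int(i), "title": f"bug {i}"} for i in ids]


client = BugClient(fake_transport)
bugs = client.fetch_bugs([1, 2, 3])
# Plain dicts: attribute access here costs no further network traffic.
titles = [bug["title"] for bug in bugs]
print(client.round_trips, titles)
```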

Perhaps you should add a 'should' that nothing about the model should
require or encourage object-at-a-time protocols.

I don't think there is much of a correlation, certainly not a
necessary connection, between RPC URL mapping and object-at-a-time
protocols.  Yes, if the only interface exposed is a GET /bugs/%d you
will have this problem, but equally so if you expose only a
GetBug(bug_id) RPC.  The key point is to first think about what
performance is needed for whatever case, and then secondly to have a
transparent performance model.  GET /bugs/1;2;3;4 (or for long lists,
the same packed into a POST) is as good as GetBugs([1, 2, 3, 4]).
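
A minimal sketch of the batched URL form, assuming a hypothetical /bugs/
endpoint and an arbitrary URL length cut-off: short id lists go into the
path, long ones are packed into a form-encoded POST body. The "ids" field
name and the 2000-character limit are assumptions, not anything Launchpad
defines.

```python
import urllib.parse

# Assumed cut-off; real limits depend on the servers and proxies involved.
MAX_URL_LENGTH = 2000


def batch_bug_request(base_url, bug_ids, max_url_length=MAX_URL_LENGTH):
    """Describe one HTTP request fetching all of bug_ids in one round trip.

    Returns a (method, url, body) tuple. Short lists become
    GET /bugs/1;2;3;4; long lists are packed into a POST body instead.
    """
    ids = ";".join(str(int(i)) for i in bug_ids)
    url = f"{base_url}/bugs/{ids}"
    if len(url) <= max_url_length:
        return ("GET", url, None)
    body = urllib.parse.urlencode({"ids": ids}).encode("ascii")
    return ("POST", f"{base_url}/bugs", body)


def parse_bug_ids(path):
    """Server side: turn '/bugs/1;2;3;4' into [1, 2, 3, 4]."""
    prefix = "/bugs/"
    if not path.startswith(prefix):
        raise ValueError(f"not a bug path: {path!r}")
    return [int(part) for part in path[len(prefix):].split(";") if part]


print(batch_bug_request("http://bugs.internal.example", [1, 2, 3, 4]))
print(parse_bug_ids("/bugs/1;2;3;4"))
```

Either way the client pays for one round trip, which is the same cost as a
GetBugs([1, 2, 3, 4]) RPC.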

Martin

