
openstack team mailing list archive

Re: Architecture for Shared Components

 

On Aug 2, 2010, at 7:30 AM, Michael Gundlach wrote:

Hi Jorge,

On Sat, Jul 31, 2010 at 1:22 PM, Jorge Williams <jorge.williams@xxxxxxxxxxxxx> wrote:
Guys,

I like this idea a lot.  I hadn't thought about the concept of using a language binding to communicate with upstream proxies, but it makes sense.

Just to clarify: I am not advocating a downstream (in the request chain) server calling back to an upstream server.  Instead, if both servers need to do something re: authentication, they would both call "sideways" to an authentication service sitting out-of-request-chain that can answer their query and hang up.


Hmm...  Let me see if I understand what you're saying.  Correct me if I'm wrong here.  You're still advocating a proxy approach where an HTTP request is sent from one proxy to another... (Pardon my text drawings)


                            [  SSL Term  ]
                                  |
                                  v
                            [   Cache    ]
                                  |
                                  v
                            [    Auth    ]
                                  |
                                  v
                            [ Rate Limit ]
                                  |
                                  v
                            [API Endpoint]
                                  |
                                  v
                                 ...

...but you are proposing that individual proxies can make sideways calls to make additional service requests...



                            [  SSL Term  ]
                                  |
                                  v
                            [   Cache    ]
                                  |
                                  v
                            [    Auth    ]--->[ IDM SERVICE ]
                                  |
                                  v
                            [ Rate Limit ]
                                  |
                                  v
                            [API Endpoint]
                                  |
                                  v
                                 ...


If so, that's exactly along the lines of what I was thinking.
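That sideways-call shape can be sketched as WSGI-style middleware -- a hypothetical sketch only, where the X-Auth-Token header and the injected IDM validator are my own assumptions, not part of any proposal in this thread:

```python
# Hypothetical sketch of the "sideways call" pattern as WSGI-style middleware.
# The X-Auth-Token header name and the injected validator are assumptions.

class AuthMiddleware:
    """Auth stage of the proxy chain: asks the IDM service a question
    (via validate_token), hangs up, then forwards down the chain."""

    def __init__(self, app, validate_token):
        self.app = app                        # next proxy/server downstream
        self.validate_token = validate_token  # sideways call to IDM

    def __call__(self, environ, start_response):
        token = environ.get("HTTP_X_AUTH_TOKEN")
        if token is None or not self.validate_token(token):
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"Authentication required"]
        # IDM has answered and hung up; continue down the request chain.
        return self.app(environ, start_response)
```

The Rate Limit stage could be written the same way, each stage wrapping the next, which is what keeps the whiteboard drawing a straight stack with arrows out the side.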


(It sounded like you maybe were advocating server A proxying to server B proxying to server C; then C calls A to ask a question, hangs up after getting the answer, then proxies to server D.  That makes A have to behave as both a proxy and a regular server, probably makes the scaling analysis a little trickier, and if nothing else makes the diagrams harder to draw :) )


Yes, that's what I was saying.   First, let me note that a proxy always acts as both a server and a client.

In fact, RFC 2616 defines a proxy specifically in those terms:

"An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients."

So I think it's perfectly reasonable for a proxy to handle requests from downstream servers.  Most caching proxies work this way: knowledge of what needs to be purged usually resides on the backend.

                            [  SSL Term  ]
                                  |
                                  v
              +------------>[   Cache    ]
              |                   |
              |                   v
              |             [    Auth    ]--->[ IDM SERVICE ]
              |                   |
              |                   v
              |             [ Rate Limit ]
           (Purge)                |
              |                   v
              |             [API Endpoint]
              |                   |
              |                   v
              |                  ...
              |                   |
              |                   |
              |                   |
              |                   |
              +-------------------+


For example, say a user issues a command to delete a server.  We'll need to purge every representation (XML, JSON, XML GZip, JSON GZip) of that server from the front-end cache.  I suppose we could detect the delete operation at the caching stage, but that would mean a very customized cache, and I'd like to reuse that code for different APIs.  What's more, certain events may trigger cache purges from the backend directly, say a server transitioning from "RESIZE" to "ACTIVE".  I really don't see how we can avoid this sort of communication from downstream entirely.
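A minimal sketch of what that backend-driven purge might look like.  The representation list and the injected transport callable are illustrative assumptions; a real cache might key variants on Accept/Accept-Encoding headers instead:

```python
# Hypothetical sketch: on server delete, the backend issues one PURGE per
# cached representation.  The variant list below is an assumption taken
# from the four representations mentioned in this thread.

REPRESENTATIONS = [
    {"Accept": "application/xml"},
    {"Accept": "application/json"},
    {"Accept": "application/xml", "Accept-Encoding": "gzip"},
    {"Accept": "application/json", "Accept-Encoding": "gzip"},
]

def purge_server(send, cache_host, server_id):
    """Issue a PURGE for every representation via the injected `send`
    callable, so the cache itself stays generic and reusable."""
    path = "/servers/%s" % server_id
    for headers in REPRESENTATIONS:
        send("PURGE", cache_host, path, headers)
```

The point of injecting `send` is that the same purge logic works whether the transport is a raw HTTP client or a language binding over the cache's REST endpoint.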


FWIW, this is Google's architecture in all the services that I was familiar with when I worked there -- a stack of servers in the request chain, each handling the minimum amount of work and calling out sideways to other services as needed.  If two servers needed to ask about authentication, they could call out to Gaia, the authentication service, get an answer, hang up on Gaia, and continue servicing the request.  What's great about this is that it becomes such a normal way to solve the problem that it becomes a design pattern: you can "smell" problems with the architecture if the owner can't draw it on a whiteboard as a stack of servers, from Netscaler down to final request server, with some arrows out sideways to caches, Gaia, GFS, etc.

Cool.


Being able to purge something from an HTTP cache by simply making a "purge" call in whatever language I'm using to write my API is a win.  That said, I'm not envisioning a lot of communication going upstream in this manner.  An authentication proxy service, for example, may need to communicate with an IDM system, but should require no input from the API service itself.  In fact, I would try to discourage such communication just to avoid chatter.

I want to push back a little on that point -- I don't think we should optimize for low network chatter as much as for simplicity of design.

I agree with that. I was talking in terms of upstream communication, like the purge example above.  I would avoid this where possible.  For example, nothing in the backend should have to touch the Rate Limiting proxy. This not only increases chatter but also complicates the design.

The example that started this thread was authentication having to happen at two points in the request chain.  If we tried to eliminate the deeper of the two requests to the auth server in order to reduce network chatter, the trade-off is having to bubble state up to the shallower server, making that server more complicated and making it harder to separate what each server in the chain does.  If we find that we're saturating the network with calls to a particular service, only then do I think we should start looking at alternatives like changing the request flow.

In cases where this can't be avoided, I would require that the proxy services expose a REST endpoint, so we can take advantage of it even if a binding isn't available.


I would definitely advocate for using REST for *all* our service communication unless there were a really strong case for doing otherwise.  It makes the system more consistent, makes it easy to interface with if you need to, makes it easy to write a unified logging module, lets us easily tcpdump the internal traffic if we ever need to, etc.  Putting a language binding on top of that is just added convenience.
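A binding over such a REST endpoint could be as thin as this -- a hypothetical sketch, where the PURGE method, path shape, and status codes are my assumptions rather than anything specified in this thread:

```python
# Hypothetical sketch of a thin language binding over an assumed REST
# purge endpoint.  Anyone without the binding can make the same HTTP
# call directly, which is the whole point of REST underneath.

class CacheClient:
    """Convenience binding: one method per REST operation, nothing more."""

    def __init__(self, request):
        # Injectable transport: (method, path) -> HTTP status code.
        # In production this would wrap an HTTP client pointed at the cache.
        self.request = request

    def purge(self, resource_path):
        status = self.request("PURGE", resource_path)
        return status in (200, 204)
```

Because the binding adds no semantics of its own, tcpdump, logging, and ad-hoc curl calls all see exactly what the binding does.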


Agreed.


jOrGe W.
