Re: Architecture for Shared Components

Hi Jorge,

I think we may not be on the same page here. I can't speak for what
Michael meant, but this is what I had in mind:

http://oddments.org/tmp/os_module_arch.png

This removes the intermediate proxies and instead relies on
scaling out the API endpoints and the services they use. Different
services or different parts of the same service could consume the
same API/service. See my original email for my reasoning, but the
main ones are to keep the APIs consistent across different parts of
a service and to reduce the number of hops a request must go through.

The Auth and Cache APIs could be backed by any service; in this
case, LDAP and memcached. The service provider could be entirely in
process (hard coded test module) or a fully distributed network service
(memcached). This modularity makes it easy to swap implementations
in and out, or to adapt the application to existing architectures.
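
To make the modularity concrete, here is a rough Python sketch of
what the cache API could look like (the class names and the exact
get/set interface are made up, just to illustrate the idea):

    import memcache  # python-memcached client

    class DictCache(object):
        """In-process provider, e.g. for a hard coded test module."""
        def __init__(self):
            self._data = {}
        def get(self, key):
            return self._data.get(key)
        def set(self, key, value):
            self._data[key] = value

    class MemcachedCache(object):
        """Fully distributed provider backed by memcached."""
        def __init__(self, servers=['127.0.0.1:11211']):
            self._mc = memcache.Client(servers)
        def get(self, key):
            return self._mc.get(key)
        def set(self, key, value):
            self._mc.set(key, value)

    # Callers only ever see the get/set API, so swapping providers
    # (or adapting to an existing architecture) is a one-line change.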

If we find we do need an extra proxy layer because caching or SSL
termination becomes too expensive within the endpoints, we can
easily write a new service layer utilizing the same APIs to provide
this. It would be nice to keep these proxies optional though, as some
installations will not require them.

-Eric

On Mon, Aug 02, 2010 at 03:31:46PM -0500, Jorge Williams wrote:
>    On Aug 2, 2010, at 7:30 AM, Michael Gundlach wrote:
> 
>      Hi Jorge,
>      On Sat, Jul 31, 2010 at 1:22 PM, Jorge Williams
>      <jorge.williams@xxxxxxxxxxxxx> wrote:
> 
>        Guys,
>        I like this idea a lot.  I hadn't thought about the concept of using a
>        language binding to communicate with upstream proxies, but it makes
>        sense.
> 
>      Just to clarify: I am not advocating a downstream (in the request chain)
>      server calling back to an upstream server.  Instead, if both servers
>      need to do something re: authentication, they would both call "sideways"
>      to an authentication service sitting out-of-request-chain that can
>      answer their query and hang up.
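> 
>      In code, a handler's auth check might look something like this
>      (rough Python sketch; the service host and URL path are invented):
> 
>          import httplib
> 
>          def is_authorized(token):
>              # Call "sideways" to the auth service, get an answer, hang up.
>              conn = httplib.HTTPConnection('auth.internal:8080')
>              conn.request('GET', '/v1/validate/%s' % token)
>              status = conn.getresponse().status
>              conn.close()
>              return status == 200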
>       
> 
>    Hmm..  Let me see if I understand what you're saying.  Correct me if I'm
>    wrong here.  You're still advocating a proxy approach where an HTTP
>    request is sent from one proxy to another... (Pardon my text drawings)
>                                [  SSL Term  ]
>                                      |
>                                      v
>                                [   Cache    ]
>                                      |
>                                      v
>                                [    Auth    ]
>                                      |
>                                      v
>                                [ Rate Limit ]
>                                      |
>                                      v
>                                [API Endpoint]
>                                      |
>                                      v
>                                     ...
>    ...but you are proposing that individual proxies can make sideways calls
>    to make additional service requests...
>                                [  SSL Term  ]     
>                                      |            
>                                      v            
>                                [   Cache    ]     
>                                      |            
>                                      v            
>                                [    Auth    ]--->[ IDM SERVICE ]
>                                      |
>                                      v
>                                [ Rate Limit ]
>                                      |
>                                      v
>                                [API Endpoint]
>                                      |
>                                      v
>                                     ...
>     
>    If so, that's exactly along the lines of what I was thinking.  
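> 
>    To make the Auth box concrete, here is a rough WSGI-style sketch of
>    a proxy that calls sideways to IDM and then forwards downstream
>    (host names and paths are invented):
> 
>        import httplib
> 
>        def auth_proxy(environ, start_response):
>            token = environ.get('HTTP_X_AUTH_TOKEN', '')
>            # Sideways call to the IDM service.
>            idm = httplib.HTTPConnection('idm.internal:8081')
>            idm.request('GET', '/tokens/%s' % token)
>            if idm.getresponse().status != 200:
>                start_response('401 Unauthorized', [('Content-Length', '0')])
>                return ['']
>            # Token checks out; proxy the request to the next hop.
>            # (A real proxy would also forward headers and body.)
>            down = httplib.HTTPConnection('ratelimit.internal:8082')
>            down.request(environ['REQUEST_METHOD'], environ['PATH_INFO'])
>            resp = down.getresponse()
>            body = resp.read()
>            start_response('%d %s' % (resp.status, resp.reason),
>                           resp.getheaders())
>            return [body]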
> 
>      (It sounded like you maybe were advocating server A proxying to server B
>      proxying to server C, then C calls A to ask a question, hangs up after
>      getting the answer, then proxies to server D.  That makes A have to
>      behave as both a proxy and a regular server, probably makes the scaling
>      analysis a little trickier, and if nothing else makes the diagrams
>      harder to draw :) )
> 
>    Yes, that's what I was saying.   First, let me note that a proxy always
>    acts as both a server and a client.
>    In fact, RFC2616 defines a proxy specifically in those terms:
>    "An intermediary program which acts as both a server and a client for the
>    purpose of making requests on behalf of other clients."
>    So I think it's perfectly reasonable for a proxy to handle requests from
>    downstream servers.  Most caching proxies work this way.  Knowledge of
>    what needs to be purged usually resides on the backend.
>                                [  SSL Term  ]     
>                                      |            
>                                      v            
>                  +------------>[   Cache    ]
>                  |                   |
>                  |                   v
>                  |             [    Auth    ]--->[ IDM SERVICE ]
>                  |                   |
>                  |                   v
>                  |             [ Rate Limit ]
>               (Purge)                |
>                  |                   v
>                  |             [API Endpoint]
>                  |                   |
>                  |                   v
>                  |                  ...
>                  |                   |
>                  |                   |
>                  |                   |
>                  |                   |
>                  +-------------------+
>    For example, say a user issues a command to delete a server.  We'll need
>    to purge every representation (XML, JSON, XML GZip, JSON GZip) of that
>    server from the front-end cache.  I suppose we could detect the delete
>    operation at the caching stage, but that means having a very customized
>    cache, and I'd like to reuse that code for different APIs.  What's more,
>    certain events may trigger cache purges from the backend directly, say a
>    server transitioning state from "RESIZE" to "ACTIVE".  I really don't see
>    how we can avoid these downstream communications entirely.
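> 
>    As a rough sketch of what that purge binding could look like in
>    Python (PURGE is the Squid-style convention; the cache host and
>    URL layout are invented):
> 
>        import httplib
> 
>        VARIANTS = [('application/xml', None),
>                    ('application/xml', 'gzip'),
>                    ('application/json', None),
>                    ('application/json', 'gzip')]
> 
>        def purge_server(server_id):
>            # Evict every cached representation of this server.
>            for content_type, encoding in VARIANTS:
>                headers = {'Accept': content_type}
>                if encoding:
>                    headers['Accept-Encoding'] = encoding
>                conn = httplib.HTTPConnection('cache.internal:8079')
>                conn.request('PURGE', '/servers/%s' % server_id,
>                             None, headers)
>                conn.getresponse().read()
>                conn.close()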
> 
>      FWIW, this is Google's architecture in all the services that I was
>      familiar with when I worked there -- a stack of servers in the request
>      chain, each handling the minimum amount of work, and calling out
>      sideways to other services as needed.  If two servers needed to ask
>      about authentication, they could call out to Gaia the authentication
>      service, get an answer, hang up on Gaia, and continue servicing the
>      request.  What's great about this is that it becomes so normal as a way
>      to solve the problem that it becomes a design pattern: you can "smell"
>      problems with the architecture if the owner can't draw it on a
>      whiteboard as a stack of servers, from netscaler down to final request
>      server, with some arrows out sideways to caches, Gaia, GFS, etc.
> 
>    Cool.
> 
>         Being able to purge something from an HTTP cache by simply making a
>        "purge" call in whatever language I'm using to write my API is a win.
>         That said, I'm not envisioning a lot of communication going upstream
>        in this manner.  An authentication proxy service, for example, may
>        need to communicate with an IDM system, but should require no input
>        from the API service itself.  In fact, I would try to discourage such
>        communication just to avoid chatter.
> 
>      I want to push back a little on that point -- I don't think we
>      should optimize for low network chatter as much as for simplicity of
>      design.  
> 
>    I agree with that. I was talking in terms of upstream communication, like
>    the purge example above.  I would avoid this where possible.  For example,
>    nothing in the backend should have to touch the Rate Limiting proxy. This
>    not only increases chatter but also complicates the design.
> 
>      The example that started this thread was authentication having to happen
>      at two points in the request chain.  If we tried to eliminate the deeper
>      of the two requests to the auth server in order to reduce network
>      chatter, the trade-off is having to bubble state up to the shallower
>      server, making that server more complicated and making it harder to
>      separate what each server in the chain does.  If we find that we're
>      saturating the network with calls to a particular service, only then do
>      I think we should start looking at alternatives like changing the
>      request flow.
> 
>       
> 
>        In cases where this can't be avoided, I would require the proxy
>        services to expose a REST endpoint so we can take advantage of it
>        even if a binding isn't available.
> 
>      I would definitely advocate for using REST for *all* our service
>      communication unless there were a really strong case for doing otherwise
>      in an individual case.  Makes the system more consistent, makes it easy
>      to interface with it if you need to, makes it easy to write a unified
>      logging module, lets us easily tcpdump the internal traffic if we ever
>      needed to, etc.  Putting a language binding on top of that is just added
>      convenience.
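> 
>      For example, the binding can stay a thin convenience wrapper over
>      the REST call (rough sketch, invented names):
> 
>          import httplib
> 
>          class AuthClient(object):
>              def __init__(self, host='auth.internal:8080'):
>                  self.host = host
>              def validate(self, token):
>                  # Plain REST underneath; the class just adds convenience,
>                  # and tcpdump still sees an ordinary HTTP request.
>                  conn = httplib.HTTPConnection(self.host)
>                  conn.request('GET', '/v1/validate/%s' % token)
>                  status = conn.getresponse().status
>                  conn.close()
>                  return status == 200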
> 
>    Agreed.
>    jOrGe W.


