launchpad-dev team mailing list archive

Re: micro services: HTTP authentication in the datacentre and default protocol.

 

Hi,

On Tue, Jun 7, 2011 at 12:35 AM, Robert Collins
<robert.collins@xxxxxxxxxxxxx> wrote:
> On Tue, Jun 7, 2011 at 9:48 AM, Jamu Kakar <jkakar@xxxxxxxx> wrote:
> So, for clarity, the reasons to *consider* skipping rabbit are:
>  * it's a PITA to bring up reliably in a test environment.
>  * something else functionally equivalent but simpler comes along
>  * something with more facilities and same complexity comes along
>
> Those seem like pretty darn good reasons to consider skipping *any*
> commodity item.

The first reason is definitely awkward.  The other two are a bit harder
to plan around because you don't know when they'll happen.  But anyway,
I get what you're saying here.

>> That said, I don't have enough experience with RabbitMQ to say,
>> "you're thinking about this wrong", or conversely, "yes, this is a
>> serious issue".  I am slightly concerned that overengineering could
>> lead to a suboptimal solution.  The impression I get from the outside,
>> and maybe I'm totally off the mark, is that Launchpad often chooses a
>> hard path to Do Things Right(tm), and then the end result is that
>> everything is hard.
>>
>> I also wonder how many of these services need to talk to each other.
>> Maybe you could run many RabbitMQ instances and use them for
>> particular tasks?  For example, a bug-focused queue for bug-related
>> operations, a code hosting-focused queue for code-related operations,
>> etc.  If one of them falls over you end up with degraded service, as
>> opposed to losing everything.  I don't know how viable that is, since
>> I don't really understand what the topology of micro-services will
>> look like.
>
>> Also, is there something that will solve the HA issues you've brought
>> up in the pipeline for RabbitMQ?  Maybe it's something worth
>> contributing to and/or living without for some time while support for
>> these issues gets baked in?
>>
>> How do other people use RabbitMQ and sleep at night?
>
> Those are good questions to ask. On the HTTP vs Rabbit space, I think
> the decoupling between service point and implementation is a useful
> thing to have, but if you look at the list of things we need in place
> to consider a microservice maintainable -
> https://dev.launchpad.net/ArchitectureGuide/ServicesRequirements -
> most of those are not impacted by changing the protocol from HTTP+foo
> to amqp.
>
> Launchpad has a history of awkward implementation decisions - yes,
> that's true. However I think many of them are due to the complexity of
> analyzing scaling and performance (consider - predict which bottleneck
> we will hit next in codehosting: CPU? memory? network bandwidth to the
> main host? disk space? fs locks? concurrent IO rate to disks?...) and
> then go back 6 years and predict which design will handle all the
> bottlenecks gracefully.
>
> It would be easy to throw stones, but we get 20-20 vision in
> hindsight. I think that the folk (which includes me for some decisions
> - waaaay back :)) did their best to analyse things at the time.
> However I think they over-analysed: many problems our past selves
> designed for did not occur, and many problems they did not design for
> have occurred.

Stones aren't really productive, and my goal in asking those questions
wasn't to blame in any way.  We've all made good and bad decisions, and
we'll do both again in the future; it's part of the fun of what we
do. :)  Anyway, I just wanted to bring up the thought that maybe things
will be fine if we reduce the number of concerns a given solution must
deal with.
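
As a rough illustration of the per-area queue idea quoted above (a
bug-focused broker, a code hosting-focused broker), here is a minimal
sketch in plain Python.  Everything in it is hypothetical: Broker is an
in-memory stand-in for one RabbitMQ instance, and make_brokers and
publish are names invented for the example, not Launchpad code.  The
only point is that a failure in one area degrades that area alone.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """In-memory stand-in for one RabbitMQ instance (hypothetical)."""
    name: str
    up: bool = True  # flip to False to simulate the instance falling over
    messages: list = field(default_factory=list)

    def publish(self, body: str) -> None:
        if not self.up:
            raise ConnectionError(f"broker {self.name} is down")
        self.messages.append(body)


def make_brokers() -> dict:
    """Assumed topology: one independent broker per functional area."""
    return {"bugs": Broker("bugs"), "code": Broker("code")}


def publish(brokers: dict, area: str, body: str) -> bool:
    """Publish to the broker for `area`.

    Returns False when only that area's broker is unavailable, so the
    caller can degrade that feature and leave the others running.
    """
    try:
        brokers[area].publish(body)
        return True
    except ConnectionError:
        return False


if __name__ == "__main__":
    brokers = make_brokers()
    brokers["code"].up = False  # simulate the code hosting broker failing
    print("bugs delivered:", publish(brokers, "bugs", "bug 1 changed"))
    print("code delivered:", publish(brokers, "code", "branch pushed"))
```

Whether this split is viable depends on how much the services need to
talk across areas, which is the open question in the quoted text.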

> So, I want us to simultaneously:
>  - be able to diagnose problems /fast/
>  - be able to recover from operational issues rapidly
>  - look after our users' data
>  - be able to modify the design rapidly to deal with the things we
> have not designed for.
>  - have the lowest implementation cost to meet these four things
>
> To that end, saying 'let's start with rabbit without using its
> persistence features':
>  - lets us leverage the ops team's familiarity with rabbit for
> diagnosis, logging, capacity planning
>  - and their experience with it for recovering after it breaks
>  - avoids concerns about data integrity or storage
>  - can be modified easily to permit persistence (add HA) or to move to
> a less cumbersome implementation
>  - looks pretty cheap to do (we have cookie-cutter deployment
> knowledge for http stacks).

This is all sensible.  Small steps are good.
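
To make "rabbit without using its persistence features" concrete, here
is a small sketch of the two knobs involved.  It assumes AMQP 0-9-1
semantics, where a queue can be declared durable or not and each
message carries a delivery-mode of 1 (transient) or 2 (persistent).
The queue_settings helper is hypothetical and only builds the settings;
it does not talk to a broker.

```python
# AMQP 0-9-1 delivery-mode values carried in a message's properties.
TRANSIENT = 1   # message may be lost if the broker restarts
PERSISTENT = 2  # broker is asked to write the message to disk


def queue_settings(persist: bool = False) -> dict:
    """Return declare/publish settings for a queue (hypothetical helper).

    With persist=False the queue is non-durable and messages are
    transient, which matches the "start without persistence" step.
    """
    return {
        "durable": persist,
        "delivery_mode": PERSISTENT if persist else TRANSIENT,
    }


if __name__ == "__main__":
    print(queue_settings())              # starting point: no persistence
    print(queue_settings(persist=True))  # later step, if durability is needed
```

Under these assumptions, moving to persistence later is a settings
change and not a redesign, which fits the "can be modified easily"
point in the quoted list.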

Thanks,
J.

