openstack team mailing list archive

Re: Queue Service, next steps

To: Raphael Cohn <raphael.cohn@xxxxxxxxxxx>
From: Eric Day <eday@xxxxxxxxxxxx>
Date: Sun, 27 Feb 2011 11:46:22 -0800
Cc: openstack@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTinB8b2440RGuEyTpgjnb_RUZjFTBVp1fEdswr5g@mail.gmail.com>
User-agent: Mutt/1.5.20 (2009-06-14)

Hi Raphael,

On Sun, Feb 27, 2011 at 11:18:35AM +0000, Raphael Cohn wrote:
>    OpenStack's QueueService seems very interesting. As we have an existing
>    message queue implementation, we'd be happy to help you guys out. We're
>    about making messaging cloud-scale, so that everyone benefits.

Thank you! We're certainly looking to include as many community members
as we can to ensure this is a successful project. Your expertise and
participation would be very much appreciated!

>    However, it worries us that you're planning to implement a REST API for
>    messaging. Message queuing is fundamentally asynchronous; this is one of
>    the reasons StormMQ got started, as we found that approaches that use it
>    (eg SQS) suffer from some major weaknesses:-
>    - They're too slow;
>    - They can't handle sustained volumes
>    - Higher-level needs, eg fanout, selective pub-sub and transactions, are
>    an awkward, if not impossible, fit

I certainly agree that HTTP is not an ideal protocol for
high-performance messaging. Some features may be awkward in HTTP, but
almost anything is possible. As you'll note on the queue service
specification page, a pluggable protocol is one of the main
requirements. The REST API comes first because it is the easiest
protocol for most folks to understand and get involved with, but it
is by no means the primary or a first-class protocol. For example,
I mention other binary protocols to look at implementing for higher
performance once we get the REST API off the ground.

HTTP though, if done correctly (pipelining, binary content-types,
...), can provide decent throughput that is sufficient for a wide
range of applications. It will always be restricted by the plain-text
request/header envelope, and this is where binary protocols will excel.
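
To put a rough number on the envelope cost, here is a minimal Python
sketch that builds a bare-bones request head and compares its size to
the payload. The path, host name, and header set are assumptions made
up for illustration; they are not taken from the queue service
specification, and a real request would usually carry more headers
(authentication, for instance).

```python
def http_envelope(path, host, payload):
    """Build a minimal plain-text HTTP/1.1 request head for a binary payload."""
    return (
        f"PUT {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("ascii")


def overhead_ratio(path, host, payload):
    """Fraction of the bytes on the wire spent on the text envelope."""
    head = http_envelope(path, host, payload)
    return len(head) / (len(head) + len(payload))


# Hypothetical path and host, chosen only for this example.
small = overhead_ratio("/account/queue", "queue.example.com", bytes(16))
large = overhead_ratio("/account/queue", "queue.example.com", bytes(1 << 20))
```

For a 16-byte message the envelope dominates the request, while for a
1 MiB message it is negligible, which is why the overhead matters
mostly for small, high-rate messages.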

Also, not all users and use cases of the queue service will need
to prioritize high throughput. The overhead of HTTP protocol
parsing may be insignificant for some, and instead the accessibility
of the service via HTTP in their environment (web apps, browser,
etc.) may be much more important than high throughput. Accessibility,
especially now in a very RESTy web/cloud world, is very important.

>    There are a horde of technical reasons why HTTP, superb as it is for
>    request-response architectures, makes a poor backbone for messaging (some
>    of the team behind StormMQ implemented one of the first banking-scale REST
>    architectures).
> 
>    For example, implementations that need to send or consume lots of data,
>    and are only interested in a subset whose filter criteria changes over
>    time. Syslogging, for example. Imagine a dynamic cloud, where servers come
>    and go - and centralised logging systems and alerts need no configuration,
>    because they use queuing. Under load (eg hack attempts on your server
>    firewalls generate 1000s of log messages) it mustn't fail, just go a bit
>    slower. StormMQ use AMQP internally for our own log management for that
>    reason.

Understood, and much of this can be accomplished with horizontally
scaling architectures. As I touched on before and mentioned on
the wiki, HTTP is only one interface in. The internal communication
protocol for scaling out zones and clusters will not be HTTP long term,
and instead a much more efficient, async, and binary protocol. My
current thought is to use Google protocol buffers or Avro for this,
but this is up in the air (something we won't get to for at least a
couple months). Since we're using Erlang, we may even use the native
Erlang message passing if we're on a trusted network.

>    First up, AMQP isn't actually very complex at the level of an application
>    developer. Indeed, with a good library (like ours) it's trivially easy.

Agreed, there are some great AMQP libraries out there that make it
seamless, but there are also some that do not. This wasn't my concern
with the complexity comment though.

>    The apparent complexity comes because of unfamiliarity, both with concepts
>    and with use; no different to HTTP when it first came in (and we saw a
>    plethora of weird ways of using it and misunderstood criteria for headers,
>    etc). AMQP's highly suited to high-latency, unreliable links. That's why
>    Smith Electric vehicles use it to connect all their delivery trucks using
>    dodgy 3G links - and still gather 10,000s of items of data a second. The
>    AMQP protocol, particularly 1.0, makes it extremely clear how and when to
>    recover from failure. Indeed, AMQP's approach is failure happens - so deal
>    with it. HTTP on the other hand, has no such level of transactionality.

For the complexity concern, my main point is that in order to use
a queue, you need a channel, exchange, queue, and a binding between
an exchange/queue. This can be made fairly trivial by libraries you
mentioned, but there are a lot of objects and relationships to keep
in sync in a distributed system. The OpenStack queue service takes a
fundamentally different approach and requires no queue setup before
you can put a message into it. Queues (and accounts) are transient:
when a message is inserted into a queue, or when a consumer is
waiting on a queue, it comes into existence. When the queue is empty,
it disappears. This allows you to easily create temporary queues
without worry of race conditions between producers and consumers.
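
The transient-queue behaviour described above can be sketched in a few
lines of Python. This is an in-memory illustration only, not the
service's implementation: the class and method names are hypothetical,
and it models only creation on insert and removal when empty, not
queues brought into existence by a waiting consumer.

```python
from collections import deque


class TransientQueues:
    """In-memory sketch: a queue exists only while it holds messages."""

    def __init__(self):
        self._queues = {}  # (account, queue name) -> deque of messages

    def put(self, account, queue, message):
        # No setup step: the first insert brings the queue into existence.
        self._queues.setdefault((account, queue), deque()).append(message)

    def get(self, account, queue):
        # Return the oldest message, or None if the queue does not exist.
        pending = self._queues.get((account, queue))
        if pending is None:
            return None
        message = pending.popleft()
        if not pending:
            # The queue disappears as soon as it is empty.
            del self._queues[(account, queue)]
        return message

    def exists(self, account, queue):
        return (account, queue) in self._queues


queues = TransientQueues()
queues.put("acct", "jobs", b"first")   # queue "jobs" now exists
message = queues.get("acct", "jobs")   # queue is empty again, so it is gone
```

Because there is no separate create or delete call, a producer and a
consumer never have to agree on who sets the queue up first.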

As for my comment on AMQP's suitability for high-latency or
unreliable links, it is primarily directed towards the 7-way handshake
for consumers, and 4-way handshake for producers (both on top of one
RTT for the TCP handshake). Once these connections are established the
protocol is very efficient, but this doesn't help with unreliable links
or environments where persistent connections are hard or impossible
to maintain. AMQP will certainly work in these environments, but it
seems it is much more suited for reliable links where the handshake
isn't required as often.

With the proposed OpenStack queue service REST API, there will only
be one RTT for both producers and consumers (on top of one RTT for
TCP). A producer will make a PUT request with a 201 Created response. A
consumer is a GET or POST with response body. All authentication,
queue destination, and other metadata will be included in the request,
rather than building up a stateful channel through the handshake.
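
A self-contained Python sketch of that request pattern follows: a
producer PUT answered with 201 Created, and a consumer GET answered
with the message in the response body, each on its own short-lived
connection. The URL layout (/account/queue), the 404 for an empty
queue, and the in-memory store are assumptions for illustration, not
the proposed API, and authentication is omitted.

```python
import http.client
import threading
from collections import deque
from http.server import BaseHTTPRequestHandler, HTTPServer

QUEUES = {}  # request path -> deque of message bodies (hypothetical store)
LOCK = threading.Lock()


class QueueHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # Producer: one request carries the destination (path) and the message.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        with LOCK:
            QUEUES.setdefault(self.path, deque()).append(body)
        self.send_response(201)  # 201 Created
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Consumer: one request returns the oldest message in the response body.
        with LOCK:
            pending = QUEUES.get(self.path)
            body = pending.popleft() if pending else None
            if pending is not None and not pending:
                del QUEUES[self.path]  # empty queues are dropped
        if body is None:
            self.send_response(404)  # assumed status for "no such queue"
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def request(port, method, path, body=None):
    # A fresh connection per call: no state survives between requests.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, body=body)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), QueueHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        put_status, _ = request(server.server_port, "PUT", "/account/queue", b"hello")
        get_status, message = request(server.server_port, "GET", "/account/queue")
        return put_status, get_status, message
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() starts the server on a free local port, sends one PUT
and one GET, and returns the two status codes and the message body.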

Cloud, and especially mobile, use cases bring much higher latency than
is typically seen in clustered environments. Short-lived connections
are always possible depending on the developer or environment (not
just due to unreliable links; for example, connection caching may be
difficult or impossible). This is why an emphasis was put on stateless
communication with minimal round trips.

>    Second up, more importantly, StormMQ do not provide a REST API as an
>    alternative to AMQP. It's to provide features that are nothing to do with
>    message queuing - dynamically slicing up your cloud, for instance, or
>    managing environments to allow exact reproducibility or checking in to
>    source control your config. We'd be interested in providing a REST API if
>    there's the demand. AMQP does support multi-tenancy - we do it.

We plan to address these issues with this project, especially
multi-tenancy and multi-zone interaction. We need this to not only
handle the simple use cases, but to also run a public cloud service.

>    To assist, pragmatically, we'd like to donate as open source our upcoming
>    C and Java clients for AMQP 1.0, and help sponsor Python, Perl, PHP and
>    Ruby ones off the C code, so that there is as wide as possible opportunity
>    for people to use messaging.

Thanks! Before being able to fully leverage these, we'll also need an
AMQP binding, to which, to be honest, I've given very little thought.
Once we have a solid queue "kernel" this will be easier, but I'm
certainly keeping AMQP semantics in mind. We are also using RabbitMQ
for the Nova project via the carrot Python module. It might be
interesting to see how your clients compare and if they may benefit
that project.

>    I'd strongly encourage you to get involved in the AMQP working group so if
>    there's needs that are not met by AMQP, they can be addressed. The working
>    group is really keen to encourage an open, widely adopted standard for
>    AMQP; they'd like it to be the HTTP of messaging. Many of the features I
>    see proposed for OpenStack are features in AMQP - and AMQP has spent a lot
>    of time working out the kinks in edge cases and making sure they'd work
>    with the legacy - JMS, TIBCO and the like.

I'll certainly consider it, but I'd first like to get a functional
service up and running to see how these ideas (distributed hashing,
stateless, transient queues) hold up and then we can see what features,
if any, would make sense as an AMQP proposal.

Thanks again for your input! I'm looking forward to further discussion
and StormMQ's participation.

-Eric

Follow ups

Re: Queue Service, next steps
From: Raphael Cohn, 2011-02-28
References

Re: Queue Service, next steps
From: Raphael Cohn, 2011-02-27