openstack team mailing list archive

Re: Push vs Polling (from Versioning Thread)

 

+1 Dragon


________________________________________
From: openstack-bounces+sandy.walsh=rackspace.com@xxxxxxxxxxxxxxxxxxx [openstack-bounces+sandy.walsh=rackspace.com@xxxxxxxxxxxxxxxxxxx] on behalf of Monsyne Dragon [mdragon@xxxxxxxxxxxxx]
Sent: Thursday, October 27, 2011 4:14 PM
To: George Reese
Cc: <openstack@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Openstack] Push vs Polling (from Versioning Thread)

On Oct 27, 2011, at 11:38 AM, George Reese wrote:

> Sent from my iPhone
>
> On Oct 27, 2011, at 11:26, Bryan Taylor <btaylor@xxxxxxxxxxxxx> wrote:
>
>> On 10/27/2011 10:36 AM, George Reese wrote:
>>
>>> #3 Push scales a hell of a lot better than having tools polling a cloud
>>> constantly. It doesn't matter whether it is polling the API, polling a
>>> feed, or polling a message queue. Polling is one of the most unscalable
>>> things you can do in any distributed systems scenario. Calling it a feed
>>> doesn't magically solve the problem. Actually, it's quite hard on its
>>> own in an IaaS scenario and has scaling issues independent of the
>>> polling issue.
>>
>> I disagree. The web was designed specifically to solve the distributed scaling problem and it's based on HTTP polling. It scales pretty well. The argument that polling doesn't scale inevitably neglects using caching properly.
>>
>
> The web was not designed to deal with a bunch of clients needing to
> know about infrastructure changes the instant they happen.

True. This whole issue is the reason Nova's existing notification system is designed as a push system. Currently it's used to push error notifications and usage info, but there is no reason it could not eventually also provide notifications to end users. After watching demos of large cloud users polling APIs to see when their instances were ready (and often spinning up new ones when that didn't happen fast enough), I kept that use case in mind when coding the notification system.


> And API data should not be cached. The Rackspace API used to do that,
> and it created a mess.
>
>> Push doesn't scale because it requires the server to know about every client and track conversational state with them.
>
> No, it doesn't. You push changes as they occur to a message queue. A
> separate system tracks subscribers and sends them out. There is no
> conversational state if done right.

Indeed. This is how the notification system works (or can work) right now. If you turn notifications on, Nova pushes them to a RabbitMQ queue. A separate app, namely a hub using the standard PubSubHubbub protocol (plus Yagi, an external RabbitMQ queue -> feed generator app we wrote), manages the subscriptions and pings the subscribers.
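
A minimal sketch of that split, assuming an in-memory queue in place of RabbitMQ and a hypothetical Dispatcher class in place of the hub. None of these names come from Nova or Yagi; the topic string is made up. The point is only that the producer never sees the subscriber list:

```python
import queue
from collections import defaultdict


class Dispatcher:
    """Hypothetical stand-in for the hub: it owns subscriptions, the producer does not."""

    def __init__(self, events):
        self.events = events                  # queue the producer pushes to
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def drain(self):
        """Deliver every queued event to the subscribers of its topic."""
        delivered = 0
        while True:
            try:
                topic, payload = self.events.get_nowait()
            except queue.Empty:
                return delivered
            for callback in self.subscribers.get(topic, []):
                callback(payload)
                delivered += 1


def publish(events, topic, payload):
    """Producer side: push and forget, with no knowledge of who is listening."""
    events.put((topic, payload))


events = queue.Queue()
hub = Dispatcher(events)
received = []
hub.subscribe("instance.ready", received.append)          # illustrative topic name
publish(events, "instance.ready", {"instance_id": "abc"})
publish(events, "instance.deleted", {"instance_id": "xyz"})  # nobody subscribed
count = hub.drain()
```

The event with no subscribers is simply dropped by the dispatcher; the producer's code path is identical either way.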

>
>> If you need reliability, this requires persisting that conversational state. In order to allow this to happen you have to have some kind of registration protocol for clients. If some fraction of those clients are flaky, the conversational state tracking will kill you because each client consumes resources and so flaky clients = resource leak.

Existing PubSubHubbub (PSH) hubs are designed to handle large numbers of potentially unreliable clients.
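
One policy that keeps flaky subscribers from becoming a resource leak is to count consecutive delivery failures and drop a subscription once it passes a limit. The sketch below is an assumption for illustration only; it is not taken from any particular hub, and Subscription, fan_out and max_failures are hypothetical names:

```python
class Subscription:
    """Hypothetical subscriber record with a consecutive-failure counter."""

    def __init__(self, deliver, max_failures=3):
        self.deliver = deliver            # callable that sends one payload
        self.max_failures = max_failures  # assumed eviction threshold
        self.failures = 0


def fan_out(subscriptions, payload):
    """Deliver payload to each subscriber; return those still considered healthy."""
    survivors = []
    for sub in subscriptions:
        try:
            sub.deliver(payload)
            sub.failures = 0              # any success resets the counter
        except Exception:
            sub.failures += 1
        if sub.failures < sub.max_failures:
            survivors.append(sub)
    return survivors


got = []


def always_fails(payload):
    raise ConnectionError("subscriber unreachable")


subs = [Subscription(got.append), Subscription(always_fails)]
for n in range(3):
    subs = fan_out(subs, {"seq": n})
```

After three rounds only the healthy subscriber remains, so the state kept per flaky client is bounded.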

>>
>> Push wins when you need very low latency delivery, high message throughput to individual consumers, or server side guarantees of delivery to individual consumers, but not for scaling to a large number of clients in a climate of an elastic infrastructure.
>>
>>> Push notifications are the only mechanism for solving the scaling issue.
>>> You push any changes to a message queue. Agents pick up the changes and
>>> send them on to subscriber endpoints. Not that hard.
>>
>> Not that hard with a few fairly reliable clients. Very hard with a web scale set of unreliable clients while I simultaneously need to scale the back end.
>

Actually, we are already implementing this at 200+ node scale in Nova, since this is how we are handling the collection of usage data for billing. At the moment it seems to be working reasonably well. We are not using the PSH hubs right now, since we are pushing to a few internal consumers via AtomPub and don't need the complex subscription management, but from our point of view, pinging a hub is no different from what we already do, and the hub handles the subscription details.
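
For reference, a PubSubHubbub publish ping is a form-encoded POST to the hub carrying hub.mode=publish and hub.url set to the updated topic (feed) URL. The sketch below only builds that request and does not send it. The hub and feed URLs are placeholders, and the function names are hypothetical; this is not Yagi's code:

```python
import urllib.request
from urllib.parse import urlencode


def build_publish_ping(topic_url):
    """Form-encoded body telling a hub that topic_url has new content."""
    return urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")


def make_ping_request(hub_url, topic_url):
    """Build (but do not send) the POST a publisher would issue to the hub."""
    return urllib.request.Request(
        hub_url,
        data=build_publish_ping(topic_url),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder endpoints, assumed for illustration.
req = make_ping_request(
    "https://hub.example.com/",
    "https://feeds.example.com/nova/usage.atom",
)
```

Passing req to urllib.request.urlopen would send the ping; the hub then fetches the feed and notifies its subscribers, so the publisher's work is the same no matter how many subscribers there are.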

It would be nice to one day support notifications to end users, as I think it would be of great benefit to them. There's work that would need to be done around hub/Yagi auth, and I think there are bigger fish to fry, Nova functionality-wise, at the moment, but it is something to keep in mind.

--
        Monsyne M. Dragon
        OpenStack/Nova
        cell 210-441-0965
        work x 5014190


