openstack team mailing list archive

Re: High availability in openstack?

To: "konetzed@xxxxxxxxxxxxxxxxx" <konetzed@xxxxxxxxxxxxxxxxx>, openstack <openstack@xxxxxxxxxxxxxxxxxxx>
From: Joshua Harlow <harlowja@xxxxxxxxxxxxx>
Date: Thu, 18 Aug 2011 17:38:54 -0700
Accept-language: en-US
In-reply-to: <4E4DA631.5070800@quixoticagony.com>
Thread-index: AcxeBCk4/O231q5XSs+4eWyVvPx8ZQABDYqu
Thread-topic: [Openstack] High availability in openstack?

Thanks,

It was along the lines of what I was thinking.

If messages are made persistent (which I hope is planned), or if persistence becomes a configuration option, what would be the effects of them not being persistent?

Right now, if a message is lost, it seems the DB and other nodes are left in a bad state. Is there any plan to have a "reaper" Python object that will reap this bad data and these instances?
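
To make the "reaper" idea concrete, here is a minimal sketch, assuming instance records are plain dicts with "id", "state", and "updated_at" fields. The state names, the timeout, and the function names are all hypothetical and not taken from Nova; the sketch only shows the shape of the check.

```python
from datetime import datetime, timedelta

# Hypothetical transitional states; the real Nova state names may differ.
TRANSITIONAL_STATES = {"building", "rebooting", "deleting"}


def find_stuck_instances(instances, now, timeout=timedelta(minutes=10)):
    """Return instances that have sat in a transitional state longer than timeout."""
    return [
        inst for inst in instances
        if inst["state"] in TRANSITIONAL_STATES
        and now - inst["updated_at"] > timeout
    ]


def reap(instances, now, timeout=timedelta(minutes=10)):
    """Mark stuck instances as 'error' and return their ids."""
    stuck = find_stuck_instances(instances, now, timeout)
    for inst in stuck:
        # Flag rather than delete, so an operator or a later task can clean up.
        inst["state"] = "error"
    return [inst["id"] for inst in stuck]
```

Something like this would run periodically; whether it should flag, retry, or delete the stuck records is an open design question.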

On 8/18/11 4:54 PM, "Edward "koko" Konetzko" <konetzed@xxxxxxxxxxxxxxxxx> wrote:

On 08/16/2011 04:50 PM, Joshua Harlow wrote:
> Are there any good documentations on making openstack fault tolerant or
> exactly how it will handle failures?
>
> Like say the mq server dies, can another mq server take over. Similar
> with the database (mysql replication?)....
>
> Seems like having that kind of information for corporate users would be
> nice, at least a recommended "guide".
>
> -Josh
>
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

Josh

I have a very bare-bones start of a doc on making parts of Nova HA.  The
problem is that this document is nowhere near ready for release, as I am
probably the only person who can understand it.  I will try to point you
in the right direction on things I have done that work pretty well.

Rabbitmq
http://www.rabbitmq.com/pacemaker.html

In the version of Nova that the team I am working with runs, nothing is
marked 'persistent'.  In this setup, if a node fails, rabbitmq moves over
and all the managers reconnect with no issues, but all in-flight messages
are lost.  Maybe someone here can clarify the direction on this.  We are
using Ubuntu 10.04, and the version of Rabbitmq in that release does not
have the pacemaker scripts, so I pulled the current package from the
rabbitmq.com apt repo; after that the pacemaker setup worked perfectly.
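
To illustrate why in-flight messages are lost on failover, here is a toy model, not RabbitMQ itself. In AMQP, a message published with delivery mode 2 (persistent) to a durable queue is written to disk, while delivery mode 1 (transient) may live only in memory. The ToyBroker class, its journal file, and the assumption that the standby node can read the same journal (for example via shared storage) are all hypothetical.

```python
import json
import os
import tempfile

TRANSIENT = 1   # AMQP delivery mode: message may be kept only in memory
PERSISTENT = 2  # AMQP delivery mode: message is written to disk


class ToyBroker:
    """Hypothetical stand-in for a broker node; not RabbitMQ's implementation."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.queue = []
        # On startup, recover whatever was journaled before the last failure.
        if os.path.exists(journal_path):
            with open(journal_path) as fh:
                self.queue = [json.loads(line) for line in fh if line.strip()]

    def publish(self, body, delivery_mode=TRANSIENT):
        self.queue.append(body)
        if delivery_mode == PERSISTENT:
            # Only persistent messages reach the on-disk journal.
            with open(self.journal_path, "a") as fh:
                fh.write(json.dumps(body) + "\n")
                fh.flush()
                os.fsync(fh.fileno())


def simulate_failover():
    """Publish one transient and one persistent message, then fail over."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.log")
        active = ToyBroker(path)
        active.publish({"method": "run_instance", "id": 1}, TRANSIENT)
        active.publish({"method": "run_instance", "id": 2}, PERSISTENT)
        del active                 # active node dies with both messages queued
        standby = ToyBroker(path)  # standby starts from the shared journal
        return standby.queue
```

In this model only the message with id 2 is still queued after failover, which matches the behaviour described above when nothing is marked persistent.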

MySQL
For MySQL I did a simple setup using DRBD to replicate /var/lib/mysql
and set up corosync/pacemaker to manage all the MySQL resources between
two nodes.  Again, on failover I had no issues with clients reconnecting
to the VIP.
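
Reconnecting to the VIP after a failover usually comes down to a retry loop on the client side. A minimal sketch follows, assuming the caller supplies a connect callable that raises ConnectionError while the VIP is still moving; the function name, the retry count, and the delays are hypothetical and not what Nova or any MySQL driver actually does.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller see the failure
            # Wait 0.5s, 1s, 2s, ... to give pacemaker time to move the VIP.
            sleep(base_delay * (2 ** attempt))
```

The sleep argument is injectable only so the loop can be exercised without real delays.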

I hope this points you in the right direction; I know it's not exactly
what you wanted.  Maybe next week I can clean up my documentation and
send it out to the list.

Edward Konetzko


Follow ups

Re: High availability in openstack?
From: Michael Basnight, 2011-08-19

References

Re: High availability in openstack?
From: Edward "koko" Konetzko, 2011-08-18