
openstack team mailing list archive

Re: High availability in openstack?

 

Thanks,

It was along the lines of what I was thinking.

If messages are made persistent, which I hope is planned, or if persistence becomes a configuration option, what would be the effects of not making them persistent?

Right now, if a message is lost, it seems the DB and other nodes are left in a bad state. Is there any plan to have a "reaper" Python object that will reap this bad data and these instances?
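
To make the "reaper" idea concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: the state names, the InstanceRecord type, and the 10 minute timeout are my own placeholders, not Nova's actual schema or API. It only shows the general shape: find records that have sat in a transitional state for too long and flag them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical transitional states; these are NOT Nova's real state names.
TRANSITIONAL_STATES = {"scheduling", "building", "rebooting"}


@dataclass
class InstanceRecord:
    """Hypothetical stand-in for a row in the instances table."""
    instance_id: str
    state: str
    updated_at: datetime


def find_stuck_instances(records, now, timeout=timedelta(minutes=10)):
    """Return records that have sat in a transitional state longer than timeout."""
    return [
        r for r in records
        if r.state in TRANSITIONAL_STATES and now - r.updated_at > timeout
    ]


def reap(records, now, timeout=timedelta(minutes=10)):
    """Mark stuck records as 'error' and return their ids, in input order."""
    reaped = []
    for record in find_stuck_instances(records, now, timeout):
        record.state = "error"      # assumed terminal state for lost requests
        record.updated_at = now
        reaped.append(record.instance_id)
    return reaped
```

A real version would presumably run periodically and would also have to clean up whatever the lost message left behind on the compute or network nodes, which this sketch does not attempt.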

On 8/18/11 4:54 PM, "Edward "koko" Konetzko" <konetzed@xxxxxxxxxxxxxxxxx> wrote:

On 08/16/2011 04:50 PM, Joshua Harlow wrote:
> Are there any good documentations on making openstack fault tolerant or
> exactly how it will handle failures?
>
> Like say the mq server dies, can another mq server take over. Similar
> with the database (mysql replication?)....
>
> Seems like having that kind of information for corporate users would be
> nice, at least a recommended "guide".
>
> -Josh
>
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

Josh

I have a very bare-bones start of a doc on making parts of Nova HA.  The
problem is that this document is nowhere near ready for release, as I am
probably the only person who can understand it.  I will try to point you
in the right direction on things I have done that work pretty well.

RabbitMQ
http://www.rabbitmq.com/pacemaker.html

In the version of Nova that the team I am working with uses, nothing is
marked 'persistent'.  In this use case, if a node fails, RabbitMQ moves
over and all the managers reconnect with no issues, but all in-flight
messages are lost.  Maybe someone here can clarify the direction on
this.  We are using Ubuntu 10.04, and the version of RabbitMQ in that
release does not have the pacemaker scripts.  I just pulled the current
package from the rabbitmq.com apt repo; after that, the pacemaker setup
worked perfectly.

MySQL
For MySQL I did a simple setup using DRBD to replicate /var/lib/mysql
and set up corosync/pacemaker to manage all the MySQL resources between
two nodes.  Again, on failover I had no issues with clients
reconnecting to the VIP.
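
For anyone whose clients do not reconnect on their own, the client side of a failover like this can be sketched as a plain retry loop. This is only an illustration, not what Nova does: connect_with_retry and its parameters are made-up names, and it catches Python's built-in ConnectionError, whereas a real MySQL driver raises its own exception types that you would need to catch instead.

```python
import time


def connect_with_retry(connect, attempts=5, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or attempts run out.

    connect  -- zero-argument callable that opens a connection to the VIP
                (hypothetical; wrap your driver's connect call in it)
    attempts -- maximum number of tries, must be at least 1
    delay    -- seconds to wait between tries while the VIP moves over
    sleep    -- injectable sleep function, handy for testing
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # give pacemaker time to bring the VIP up
    raise last_error
```

The delay and attempt count should roughly cover the time pacemaker needs to promote the second node and move the VIP, which depends on the setup.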

I hope this points you in the right direction; I know it's not exactly
what you wanted.  Maybe next week I can clean up my documentation and
send it out to the list.

Edward Konetzko


