openstack team mailing list archive

Re: High availability in openstack?

 

On 08/16/2011 04:50 PM, Joshua Harlow wrote:
Is there any good documentation on making openstack fault tolerant, or on
exactly how it will handle failures?

For example, if the mq server dies, can another mq server take over? Similarly
for the database (mysql replication?).

Seems like having that kind of information for corporate users would be
nice, at least a recommended “guide”.

-Josh




Josh,

I have a very bare-bones start of a doc on making parts of Nova HA. The problem is that this document is nowhere near ready for release, as I am probably the only person who can understand it. I will try to point you in the right direction with the things I have done that work pretty well.

RabbitMQ
http://www.rabbitmq.com/pacemaker.html

In the version of Nova my team is working with, nothing is marked 'persistent'. In this setup, if a node fails, RabbitMQ moves over to the other node and all the managers reconnect with no issues, but all in-flight messages are lost. Maybe someone here can clarify the direction on this. We are using Ubuntu 10.04, and the version of RabbitMQ in that release does not include the Pacemaker scripts, so I pulled the current package from the rabbitmq.com apt repo; after that the Pacemaker setup worked perfectly.
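
To make the "in-flight messages are lost" point concrete, here is a toy Python model of a failover. It is only a sketch: ToyBroker and its methods are hypothetical names and not RabbitMQ's or Nova's API, and the "disk" list merely stands in for storage that survives the move to the standby node. The delivery-mode values 1 (transient) and 2 (persistent) are the ones AMQP defines; in a real broker the queue would also have to be declared durable for persistent messages to survive.

```python
from collections import deque

TRANSIENT = 1   # AMQP delivery mode for non-persistent messages
PERSISTENT = 2  # AMQP delivery mode for persistent messages


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker node (not a real API)."""

    def __init__(self):
        self.queue = deque()  # undelivered messages as (body, delivery_mode)
        self.disk = []        # stand-in for storage that survives a failover

    def publish(self, body, delivery_mode=TRANSIENT):
        message = (body, delivery_mode)
        self.queue.append(message)
        if delivery_mode == PERSISTENT:
            # Only persistent messages are written to surviving storage.
            self.disk.append(message)

    def pending(self):
        """Return the bodies of all messages still waiting for delivery."""
        return [body for body, _ in self.queue]

    def failover(self):
        """Simulate the active node dying and a standby node taking over."""
        standby = ToyBroker()
        standby.disk = list(self.disk)
        # The standby can only rebuild its queue from what reached the disk.
        standby.queue = deque(self.disk)
        return standby


broker = ToyBroker()
broker.publish("run_instance")                      # transient, as in our setup
broker.publish("terminate_instance", PERSISTENT)    # what 'persistent' would add
standby = broker.failover()
print(standby.pending())
```

In this model only the message published with delivery mode 2 is still pending after the failover, which matches the behaviour described above for a setup where nothing is marked persistent.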

MySQL
For MySQL I did a simple setup using DRBD to replicate /var/lib/mysql and set up Corosync/Pacemaker to manage all the MySQL resources between two nodes. Again, on failover I had no issues with clients reconnecting to the VIP.
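
Clients reconnecting to the VIP after a failover usually comes down to a retry loop around the connect call. A minimal Python sketch of that idea follows. The function name connect_with_retry, its parameters, and the assumption that a failed connect raises ConnectionError are all illustrative and not taken from Nova or from any MySQL driver; a real driver raises its own exception types, which would need to be caught instead.

```python
import time


def connect_with_retry(connect, attempts=5, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect  -- hypothetical zero-argument callable that opens a connection
                to the VIP and raises ConnectionError while the VIP is moving
    attempts -- maximum number of tries
    delay    -- seconds to wait between tries
    sleep    -- injectable sleep function, so the wait can be skipped in tests
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                # Give Pacemaker time to bring the VIP up on the other node.
                sleep(delay)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

The delay and attempt count should cover the time the cluster needs to promote the standby and move the VIP; the values shown are placeholders, not measured figures.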

I hope this points you in the right direction; I know it's not exactly what you wanted. Maybe next week I can clean up my documentation and send it out to the list.

Edward Konetzko

