maria-discuss team mailing list archive

Thread
Date
Newbie Question - Multi-Master vs Cluster

To: maria-discuss@xxxxxxxxxxxxxxxxxxx
From: Stuart Elvish <stuart.elvish@xxxxxxxxx>
Date: Tue, 6 Jan 2015 14:53:53 +0700
Hi all,

Up until now we have only directly managed stand alone MySQL instances. For
clusters (MySQL) we have used a contractor to build and administrate the
clusters. Our current product development requirements and good business
practice (high availability of critical databases) now dictate that we move
to a multi-server setup.

Our goal is to setup at least two servers which will have the same data at
all times and which are at separate sites so if the link between the sites
goes down, the cell of servers at that site can keep operating. Our
database stores text fields and blobs of audio recordings. We read data
(SELECT with a WHERE) for 90-95% of our queries, some tables can be
populated with hundreds of thousand of rows but normally far fewer. The
software we are running which interacts with the database is smart enough
to figure out if one of the database servers it is trying to reach is not
responding and will hunt through a list of IP's until it successfully
connects.

I have a couple of questions as this will be our first cluster. (I don't
think multi-master will meet our requirements based on the examples
available online so I have chosen MariaDB 10 with Galera.)

How often is complex management (manual intervention) required such as
adding a node back into the cluster after communications between the nodes
fails and then is restored? Is it normal, say 90% of the time, for syncing
to re-establish automatically?

What type of monitoring tools are advised? (Or is it as simple as
automatically parsing logs to find errors?)

The documented recommendation is 3 servers; we don't have the hardware at
the moment so is it ok to start with 2 and add an additional database
server at a later time?

Alternatively, we could use 2 dedicated servers and 1 virtual server but
the documentation states that the cluster only works as quickly as the
slowest node. Is that only if you are running queries against the slowest
node? Or if you are writing to the slowest node? I am assuming we can run
the arbiter (garbd) on one of the existing data nodes.

I am assuming we can use the SSL options in the configuration file to
secure communication between the servers so if it was sniffed it would not
be possible to decode the data. Does anyone have any comments on if this
affects reliability?

Thank you for any input you have on these questions. Please feel free to
ask if you want clarification on anything.