maria-discuss team mailing list archive
Mailing list archive
I'd like to evaluate mariadb-galera on some test databases. Our load is
at least 90% writes coming from a datastream and I would try to set up a
real-world situation and test with real data.
Most of the work load is "insert ... on duplicate key update"
mass-inserts (actually updates) of about 4MB/10000 rows each. 99% of
these run onto 3-4 tables (each around 10GB/7 million rows of size).
Total database size is around 800 GB.
We have a traditional statement-based master-slave replication, but all
these mass-inserts run over a proxy with SET SQL_LOG_BIN=0 in parallel
on all systems - no way to process this amount of data single-threaded
on the slaves. I was hoping galera would make this easier and make our
I have some servers I could use for galera as a first step. I wanted to
do this for 4 months now but never got into it, I read your request to
test the RC and think I should start now hoping to be able to give some
After reading through the documentation and some blogs I have some
doubts, but my idea was to do the following:
I'd clone the database of an existing slave to a new machine with
mariadb-galera (name the host "testgalera1" and rsync the database or
use mysqldump). I'd then set up this one as a "normal" statement based
slave connected to one of our masters? Afterwards, the next host
"testgalera2" would have to be added to the galera-cluster? And after
that, I'd set up one of our older database servers and add that to the
cluster, too. My idea was to take an older, slower system to see how
much this slowes the cluster down and get an idea how much headroom I'd
have when I leave those old machines out (the old machines have only
64GB of RAM and slower CPUs).
I'd then modify my proxy for all the incoming inserts and add a
writer-thread to one of the hosts in the cluster, that data should be
replicated by galera to all other galera hosts. All other data should be
coming in via the "traditional" replication thread of testgalera1. This
should give me a 95-99% real world write situation on the cluster.
Adding a host to a running cluster would mean cloning the whole database
(of course), but this also means the host I copy from would have to
block write access to its tables and buffer all modifications in memory
(at least that's how I read the documentation) during the copy process.
Looking at the database size and amount of inserts, I am afraid adding a
host to the cluster without taking everything down would be impossible -
or even if it works fine now it could cause trouble once the database
gets just a little bit larger or my load increases a bit.
before I start with this - is galera an insane idea regarding my load?
Or is it worth a try?
some numbers of some of the old databases:
innodb_buffer_pool_size between 60-120 GB depending on server memory
db5: Uptime: 8052248 Threads: 44 Questions: 1413881794 Slow queries:
160049 Opens: 683779 Flush tables: 2 Open tables: 1024 Queries per
second avg: 175.588
db6: Uptime: 8070552 Threads: 6 Questions: 873413218 Slow queries:
3208 Opens: 336584 Flush tables: 1 Open tables: 578 Queries per
second avg: 108.222
db7: Uptime: 4175032 Threads: 28 Questions: 639627677 Slow queries:
139337 Opens: 522673 Flush tables: 1 Open tables: 2016 Queries per
second avg: 153.203
db7# mysql -e 'show global status like "%com\_%" ' |grep -v '\s0$'