Re: testing Galera
Hi everybody,
I installed Galera on 2 servers, imported a database dump (it took
almost 2 days to import a 100 GB gzipped SQL dump...) and set up one of
the two servers as a replication slave of our main database.
Now, as I try to understand all the new variables and get an idea of
the do's and don'ts, some questions pop up:
While the traditional replication slave on the first Galera server is
still catching up (still 200000 seconds behind...), I had our main
data proxy start writing to that server.
Most of the data coming into our databases does not make its way
through replication but via our self-written proxy application. This
application connects to all databases, sets SQL_LOG_BIN=0 and then
starts updating just a few tables (up to 15-20 GB of pure SQL
statements per hour, depending on the time of day). This worked fine;
the server was processing the statements no slower than the other
non-Galera servers.
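(Per connection, the proxy essentially does this before streaming its
batches - the rest is just bulk inserts/updates against a handful of
tables:)

    SET SESSION SQL_LOG_BIN = 0;
    -- ... followed by the usual INSERT ... ON DUPLICATE KEY UPDATE batches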
I couldn't see the updates on the second Galera server and figured
that was because of SQL_LOG_BIN=0. Now things got complicated. How do I
resolve this? I gave it a try, hacked the proxy not to set
SQL_LOG_BIN=0 for that specific server and started it again, but this
made the second server crash almost immediately with
HA_ERR_KEY_NOT_FOUND - actually I was expecting something like that to
happen, as the tables were out of sync by then.
Is there something like sql_slave_skip_counter, aka "I know what I'm
doing, skip that update"? I think I have to take a new snapshot to get
the second node up and consistent again. It tried to do that
automatically but stopped because of the hanging rsync-daemon problem.
I'll let it catch up with the replication lag first and then start that
again...
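(What I mean is the equivalent of what I'd do on a normal slave after
an error I know is harmless:

    STOP SLAVE;
    SET GLOBAL sql_slave_skip_counter = 1;
    START SLAVE;

but for a Galera node.)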
If I take a node down or just restart MariaDB after changing
variables, it pulls the changes it missed (IST) from another server,
right? And these are saved in the Galera cache (gcache) on all servers?
How do I figure out what a reasonable gcache size would be for me? The
default size of 128 MB is way too small if I want to take a node down
for a few hours to do backups/upgrades/whatever, but do I need 5, 10 or
50 GB? Is there some information on the number of bytes written to the
gcache that I could look at? Or is "wsrep_replicated_bytes" the number
to look at?
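Right now I would just guess, put something like this into my.cnf (the
size is picked more or less out of thin air; if I read the docs
correctly, the gcache is sized via the provider options):

    wsrep_provider_options = "gcache.size=10G"

and then watch how many bytes a typical hour produces:

    SHOW GLOBAL STATUS LIKE 'wsrep_replicated_bytes';
    SHOW GLOBAL STATUS LIKE 'wsrep_received_bytes';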
Finding all the documentation and understanding the details is a
little bit difficult and time-consuming, but I'm making my way...
btw: is maria-developers a better place to post this, since I'm not
working with production software but with alpha releases?
thanks
Jan
On 28.01.2013 17:18, Jan Kirchhoff wrote:
> Alex,
>
> regarding the transaction rates: we currently have a peak of around
> 400-500/sec at certain times of the day. We'll see how that works.
>
> I just found there are no mariadb-galera debs around for Ubuntu 12.10.
> Can we expect them soon? Otherwise I'd have to reinstall the test
> servers with 12.04 - I couldn't get the dependency problems resolved
> when trying to install the 12.04 debs on 12.10, and I don't want to
> compile it myself :-(
>
> Jan
>
> On 25.01.2013 10:24, Alex Yurchenko wrote:
>> Hi Jan,
>>
>> 1) The load and the database size you described are nothing special
>> for Galera; we have users running it on bigger databases and with
>> higher transaction rates. Specifically, those bulk inserts are
>> perfectly fine.
>>
>> 2) Notice that to get the most out of Galera (e.g. parallel applying)
>> you have to use ROW binlog format, at least on all Galera nodes. Using
>> statement-based replication from the master should be fine, but inside
>> the Galera cluster it should be ROW.
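>>
>> For example, in my.cnf on every Galera node:
>>
>>     binlog_format = ROW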
>>
>> 3) Your plan looks fine, just don't forget to set log_slave_updates
>> when mixing Galera and native MySQL replication.
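>>
>> E.g. on the node that acts as the asynchronous slave (log_slave_updates
>> needs the binlog enabled):
>>
>>     log_bin           = mysql-bin
>>     log_slave_updates = 1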
>>
>> 4) The latest MariaDB-Galera has support for xtrabackup for new node
>> provisioning.
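>>
>> I.e. in my.cnf, roughly (the SST user and password are placeholders):
>>
>>     wsrep_sst_method = xtrabackup
>>     wsrep_sst_auth   = "sstuser:sstpassword"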
>>
>> Regards,
>> Alex
>>
>> On 2013-01-24 00:09, Jan Kirchhoff wrote:
>>> Hi,
>>>
>>> I'd like to evaluate mariadb-galera on some test databases. Our load
>>> is at least 90% writes coming from a data stream, and I would try to
>>> set up a real-world situation and test with real data.
>>> Most of the workload is "insert ... on duplicate key update" mass
>>> inserts (actually updates) of about 4 MB/10000 rows each. 99% of
>>> these go into 3-4 tables (each around 10 GB/7 million rows in size).
>>> Total database size is around 800 GB.
>>> We have traditional statement-based master-slave replication, but
>>> all these mass inserts run through a proxy with SET SQL_LOG_BIN=0 in
>>> parallel on all systems - there is no way to process this amount of
>>> data single-threaded on the slaves. I was hoping Galera would make
>>> this easier and make our proxies obsolete.
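>>> Roughly, each batch looks like this (column names are invented, the
>>> real statements carry ~10000 rows):
>>>
>>>     INSERT INTO quotes (isin, ts, price, volume)
>>>     VALUES ('DE0001', '2013-01-23 21:00:00', 12.34, 1000),
>>>            ('DE0002', '2013-01-23 21:00:00', 56.78, 2500) /* ... */
>>>     ON DUPLICATE KEY UPDATE price = VALUES(price), volume = VALUES(volume);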
>>>
>>> I have some servers I could use for Galera as a first step. I have
>>> wanted to do this for 4 months now but never got around to it; I read
>>> your request to test the RC and think I should start now, hoping to be
>>> able to give some valuable feedback:
>>>
>>> After reading through the documentation and some blogs I have some
>>> doubts, but my idea was to do the following:
>>> I'd clone the database of an existing slave to a new machine with
>>> mariadb-galera (name the host "testgalera1" and rsync the database or
>>> use mysqldump). I'd then set up this one as a "normal" statement-based
>>> slave connected to one of our masters? Afterwards, the next host
>>> "testgalera2" would have to be added to the Galera cluster? And after
>>> that, I'd set up one of our older database servers and add that to the
>>> cluster, too. My idea was to take an older, slower system to see how
>>> much this slows the cluster down and get an idea of how much headroom
>>> I'd have when I leave those old machines out (the old machines have
>>> only 64 GB of RAM and slower CPUs).
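>>>
>>> In config/SQL terms I imagine something like this (host names, paths
>>> and credentials are made up):
>>>
>>>     # my.cnf on testgalera1 (first node, bootstraps the new cluster):
>>>     wsrep_provider        = /usr/lib/galera/libgalera_smm.so
>>>     wsrep_cluster_address = gcomm://
>>>
>>>     -- then point it at one of our masters as a "normal" slave, using
>>>     -- the binlog coordinates recorded when the snapshot was taken:
>>>     CHANGE MASTER TO MASTER_HOST='master1', MASTER_USER='repl',
>>>       MASTER_PASSWORD='...', MASTER_LOG_FILE='mysql-bin.000123',
>>>       MASTER_LOG_POS=4;
>>>     START SLAVE;
>>>
>>>     # my.cnf on testgalera2 (joins the running cluster):
>>>     wsrep_cluster_address = gcomm://testgalera1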
>>>
>>> I'd then modify my proxy for all the incoming inserts and add a
>>> writer thread for one of the hosts in the cluster; that data should
>>> be replicated by Galera to all the other Galera hosts. All other data
>>> should come in via the "traditional" replication thread of
>>> testgalera1. This should give me a 95-99% real-world write situation
>>> on the cluster.
>>>
>>> Adding a host to a running cluster would mean cloning the whole
>>> database (of course), but this also means the host I copy from would
>>> have to block write access to its tables and buffer all modifications
>>> in memory (at least that's how I read the documentation) during the
>>> copy process. Looking at the database size and the amount of inserts,
>>> I am afraid adding a host to the cluster without taking everything
>>> down would be impossible - or even if it works fine now, it could
>>> cause trouble once the database gets just a little bit larger or my
>>> load increases a bit.
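>>>
>>> (I guess the least I can do is watch whether the donor node actually
>>> gets blocked/desynced during the copy, e.g. with:)
>>>
>>>     SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';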
>>>
>>> Before I start with all this - is Galera an insane idea given my
>>> load? Or is it worth a try?
>>>
>>> Jan
>>>
>>> Some numbers from some of the old databases:
>>>
>>> innodb_buffer_pool_size between 60-120 GB depending on server memory
>>> configuration.
>>>
>>> db5: Uptime: 8052248 Threads: 44 Questions: 1413881794 Slow
>>> queries: 160049 Opens: 683779 Flush tables: 2 Open tables: 1024
>>> Queries per second avg: 175.588
>>>
>>> db6: Uptime: 8070552 Threads: 6 Questions: 873413218 Slow queries:
>>> 3208 Opens: 336584 Flush tables: 1 Open tables: 578 Queries per
>>> second avg: 108.222
>>>
>>> db7: Uptime: 4175032 Threads: 28 Questions: 639627677 Slow
>>> queries: 139337 Opens: 522673 Flush tables: 1 Open tables: 2016
>>> Queries per second avg: 153.203
>>>
>>> db7# mysql -e 'show global status like "%com\_%" ' |grep -v '\s0$'
>>> Variable_name Value
>>> Com_admin_commands 289127
>>> Com_alter_table 6269
>>> Com_analyze 725
>>> Com_begin 3712141
>>> Com_change_db 80112
>>> Com_commit 3999309
>>> Com_create_db 1
>>> Com_create_table 253849
>>> Com_create_trigger 10
>>> Com_create_view 68
>>> Com_delete 12365037
>>> Com_delete_multi 7
>>> Com_drop_table 310771
>>> Com_drop_trigger 10
>>> Com_drop_view 65
>>> Com_empty_query 509
>>> Com_flush 48
>>> Com_grant 36
>>> Com_insert 314280963
>>> Com_insert_select 613565
>>> Com_kill 214
>>> Com_lock_tables 85
>>> Com_optimize 1
>>> Com_purge 2320
>>> Com_rename_table 11
>>> Com_repair 1
>>> Com_replace 14240825
>>> Com_replace_select 29920
>>> Com_revoke 8
>>> Com_rollback 13
>>> Com_select 178540996
>>> Com_set_option 6886915
>>> Com_show_binlogs 69960
>>> Com_show_charsets 1542
>>> Com_show_collations 1544
>>> Com_show_create_db 28
>>> Com_show_create_func 195
>>> Com_show_create_table 53282
>>> Com_show_create_trigger 1
>>> Com_show_databases 43584
>>> Com_show_events 3
>>> Com_show_fields 131874
>>> Com_show_function_status 45
>>> Com_show_grants 335
>>> Com_show_keys 2582
>>> Com_show_master_status 142
>>> Com_show_plugins 7224
>>> Com_show_procedure_status 45
>>> Com_show_processlist 116918
>>> Com_show_slave_status 83193
>>> Com_show_status 155732
>>> Com_show_storage_engines 25
>>> Com_show_table_status 7439
>>> Com_show_tables 1973739
>>> Com_show_triggers 538
>>> Com_show_variables 85034
>>> Com_show_warnings 766
>>> Com_stmt_close 363
>>> Com_stmt_execute 363
>>> Com_stmt_prepare 363
>>> Com_truncate 452
>>> Com_unlock_tables 79
>>> Com_update 34496744
>>> Com_update_multi 408212
>>>
>>>