maria-discuss team mailing list archive
Message #03836
Re: Semi-sync replication hangs when changing binlog filename.
-
To:
Pavel Ivanov <pivanof@xxxxxxxxxx>, Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>, Jonas Oreland <jonaso@xxxxxxxxxx>
-
From:
Joseph Glanville <jpg@xxxxxxxxx>
-
Date:
Tue, 16 Aug 2016 01:38:55 +0000
-
Cc:
"maria-discuss@xxxxxxxxxxxxxxxxxxx" <maria-discuss@xxxxxxxxxxxxxxxxxxx>
-
In-reply-to:
<CAAG=WUt55=2xy4umuLw7zmou9_kQAtxAmjh0nX3ZTwDTGF1QrA@mail.gmail.com>
Hi Pavel, Kristian,
This fixes the problem for me. How do we go about getting this into a release?
Joseph.
________________________________
From: Pavel Ivanov <pivanof@xxxxxxxxxx>
Sent: Tuesday, 16 August 2016 2:28 AM
To: Kristian Nielsen; Jonas Oreland
Cc: Joseph Glanville; maria-discuss@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Maria-discuss] Semi-sync replication hangs when changing binlog filename.
Yes, Kristian. Thanks! This indeed looks like the correct patch. In
fact I see that internally we have this changed as part of the same
patch where after_flush was changed (i.e.
https://github.com/MariaDB/server/commit/4d8b346e079a27960dbe49e4d0ec4364bed8d30e).
I'm not sure why Jonas didn't include it in
https://jira.mariadb.org/browse/MDEV-7257.
On Mon, Aug 15, 2016 at 5:05 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
> Pavel Ivanov <pivanof@xxxxxxxxxx> writes:
>
>> binlog ending at position mariadb-bin.000004:2039896, somehow the
>> function ReplSemiSyncMaster::commitTrx() gets trx_wait_binlog_name =
>> 'mariadb-bin.000005' and trx_wait_binlog_pos = 2039896. I.e. the
>> function gets the position of the transaction to wait semi-sync ack
>> for correctly, but the file name is already the one that is current
>> after rotation. Master starts waiting for that position, but the slave
>
>> Kristian, do you have any idea what's going on? Is there an
>> inappropriate lock release/re-acquire somewhere?
>
> Hm. Actually, looking into MYSQL_BIN_LOG::trx_group_commit_leader, this
> looks suspicious:
>
>   RUN_HOOK(binlog_storage, after_flush,
>            (current->thd,
>             current->cache_mngr->last_commit_pos_file,
>             current->cache_mngr->last_commit_pos_offset, synced,
>             first, last))
>
> But
>
>   RUN_HOOK(binlog_storage, after_sync,
>            (current->thd, log_file_name,
>             current->cache_mngr->last_commit_pos_offset,
>             first, last))
>
> I would have expected `log_file_name' here to be
> current->cache_mngr->last_commit_pos_file as well, as in the first call. And
> in fact, with my limited knowledge of semi-sync, it looks like this
> suspicious call is exactly the AFTER_SYNC case that fails, while AFTER_COMMIT
> works...
>
> So maybe try the below patch?
>
> Pavel, what do you think, do you agree that this patch should be better?
>
> - Kristian.
>
>
> diff --git a/sql/log.cc b/sql/log.cc
> index 7efec98..b77a6b3 100644
> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -7712,7 +7712,7 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
>          last= current->next == NULL;
>          if (!current->error &&
>              RUN_HOOK(binlog_storage, after_sync,
> -                     (current->thd, log_file_name,
> +                     (current->thd, current->cache_mngr->last_commit_pos_file,
>                       current->cache_mngr->last_commit_pos_offset,
>                       first, last)))
>          {
>
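For readers following the thread, here is a minimal sketch of the mismatch that the patch removes. It is a toy model, not MariaDB code: BinlogPos, ToyBinlog, ToyCommit and every function below are hypothetical names invented for illustration, and the ack comparison is an assumed simplification of what the semi-sync master does. The only details taken from the thread are the file names, the offset 2039896, and the fact that the pre-patch after_sync call read the file name from the log while the offset came from the commit record.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical stand-in for a binlog coordinate: file name plus byte offset.
struct BinlogPos {
    std::string file;
    std::uint64_t offset;
};

// Order positions by file name first, then by offset. Plain string comparison
// is enough here because the file names carry a zero-padded sequence number.
bool operator<(const BinlogPos& a, const BinlogPos& b) {
    return std::tie(a.file, a.offset) < std::tie(b.file, b.offset);
}

// Hypothetical model of the binary log; only the current file name matters.
struct ToyBinlog {
    std::string log_file_name = "mariadb-bin.000004";
    void rotate() { log_file_name = "mariadb-bin.000005"; }
};

// Hypothetical per-transaction record, filled in when the event group is written.
struct ToyCommit {
    BinlogPos last_commit_pos;
};

// Models the pre-patch after_sync call: the file name is read from the log
// when the hook runs, while the offset comes from the commit record.
BinlogPos wait_pos_at_hook_time(const ToyBinlog& log, const ToyCommit& c) {
    return {log.log_file_name, c.last_commit_pos.offset};
}

// Models the patched call: file name and offset both come from the commit record.
BinlogPos wait_pos_from_commit_record(const ToyCommit& c) {
    return c.last_commit_pos;
}

// Assumed simplification: the master stops waiting once the acknowledged
// position is at or past the position it is waiting for.
bool ack_covers(const BinlogPos& acked, const BinlogPos& wait) {
    return !(acked < wait);
}

// Replays the scenario from the report and checks both outcomes.
void replay_reported_scenario() {
    ToyBinlog log;
    ToyCommit commit{{log.log_file_name, 2039896}};
    log.rotate();  // rotation happens before the after_sync hook runs

    // The slave acknowledges the transaction where it was actually written.
    BinlogPos acked{"mariadb-bin.000004", 2039896};

    // Mixed-up coordinate (new file, old offset): the ack does not cover it.
    assert(!ack_covers(acked, wait_pos_at_hook_time(log, commit)));
    // Consistent coordinate: the ack covers it and the wait can end.
    assert(ack_covers(acked, wait_pos_from_commit_record(commit)));
}
```

Calling replay_reported_scenario() from main runs both checks. In this model the fix is simply to take both halves of the coordinate from the same source, which is what the one-line change to the after_sync call does.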