maria-discuss team mailing list archive
Message #06573
Re: Backup on the replication server getting affected
To: ragul rangarajan <ragulrangarajan@xxxxxxxxx>
From: andrei.elkin@xxxxxxxxxx
Date: Fri, 02 Jun 2023 10:46:37 +0300
Cc: MariaDB discuss <maria-discuss@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CAExRPvEga2QDJpvMgPnjPsTdKG7kbzO+HnXEt=1u1qS=_d+Jkg@mail.gmail.com> (ragul rangarajan's message of "Thu, 1 Jun 2023 12:23:25 +0530")
Organization: Home sweet home
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

Howdy Ragul,
> Hi Andrei,
>
> Do we have any procedures to reproduce the issue MDEV-30780?
Thanks for posting the gdb backtraces. They rule out MDEV-30780, yet they
do not tell me enough about the reason for the hang. This is something new
to me and does deserve an MDEV ticket.
Still, I'd defer filing it until someone has confirmed that the same issue is
seen on the latest 10.6. So if you could run your load against the most recent
slave version, that would at least be the safest use of our time.
It might be that Thread 80 (a slave worker) is spinning inside
#6 0x000055de407a0a3c in log_write_up_to (lsn=<optimized out>, lsn@entry=216757233923297, flush_to_disk=flush_to_disk@entry=false, rotate_key=rotate_key@entry=false,
in a `goto repeat` "loop".
Hopefully you can confirm that the next time the hang reappears.
Could you please check whether frame #6 indeed calls
`group_commit_lock::release()` repeatedly? For example, with
(gdb) br thd_decrement_pending_ops thread 80
(of course, the thread number may change :-)).
All the other slave worker threads may be waiting for Thread 80, but I can't
confirm that until more data is available.
Namely, I need to see the output of
(gdb) thr app all get_about_worker_thread
where the latter is defined as
define get_about_worker_thread
  if $_any_caller_is ("handle_rpl_parallel_thread", 50)
    bt
    p handle_rpl_parallel_thread::rpt
    if (handle_rpl_parallel_thread::rpt->thd->rgi_slave)
      p handle_rpl_parallel_thread::rpt->thd->rgi_slave
      p handle_rpl_parallel_thread::rpt->thd->rgi_slave->current_gtid
      p handle_rpl_parallel_thread::rpt->thd->rgi_slave->gtid_sub_id
      p handle_rpl_parallel_thread::rpt->thd->rgi_slave->worker_error
    end
  end
end
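If it is more convenient to capture that output without an interactive session, something like the following might work. It is only a sketch: it assumes the definition above has been saved to a file named worker_threads.gdb (a name made up here), that the server's process id is known, and that you have permission to attach gdb to it. The helper names are hypothetical too. It uses gdb's standard `-p`, `-batch`, `-x` and `-ex` options, which are processed in the order given, so the macro is loaded before it is applied to all threads.

```python
import subprocess


def build_gdb_command(pid, script_path="worker_threads.gdb"):
    """Build the gdb argv that attaches to `pid`, loads the macro file,
    and runs the macro in every thread.

    `script_path` is an assumed file holding the `define` block above.
    """
    return [
        "gdb",
        "-p", str(pid),      # attach to the running server process
        "-batch",            # run the commands, then detach and exit
        "-x", script_path,   # load the get_about_worker_thread definition
        "-ex", "thread apply all get_about_worker_thread",
    ]


def collect_worker_report(pid, script_path="worker_threads.gdb"):
    """Run gdb in batch mode and return whatever it printed to stdout."""
    result = subprocess.run(
        build_gdb_command(pid, script_path),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; still return the output
    )
    return result.stdout
```

The text returned by `collect_worker_report(<pid of the slave server>)` could then be saved to a file and attached to a reply or to the MDEV ticket. Note that the server is paused while gdb is attached.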
> Unable to reproduce the issue locally but it occurs at random.
That will require some more patience from us.
I believe we can resolve it while you're helping so generously!
Cheers,
Andrei
>
> Regards,
> Ragul
>
> On Mon, May 29, 2023 at 7:06 PM ragul rangarajan <ragulrangarajan@xxxxxxxxx> wrote:
>
> Thanks Andrei,
>
> Hope my issue is more related to the issue MDEV-30780 optimistic parallel slave hangs after hit an error
> Trying to reproduce with a minimal database.