maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #07533
Re: [Commits] Rev 4281: MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail. in http://bazaar.launchpad.net/~maria-captains/maria/10.0
Jan Lindström <jan.lindstrom@xxxxxxxxxx> writes:
> I have mostly InnoDB style comments, but some questions below:
Thanks for comments!
Sorry about style violations, and thanks for pointing then out. I'll try to
fix all of it.
Answers to other questions below:
> Why you need to define these, why #include "mysql/plugin.h" file is not
> enough ?
>
> +extern "C" void thd_report_wait_for(const MYSQL_THD thd, MYSQL_THD
> other_thd);
> +extern "C" int thd_need_wait_for(const MYSQL_THD thd);
> +extern "C"
> +int thd_need_ordering_with(const MYSQL_THD thd, const MYSQL_THD other_thd);
See the discussions here:
https://lists.launchpad.net/maria-developers/msg07494.html
The conclusion with Serg was that doing a proper public API in 10.0 was too
big a change. We will do it in 10.1 (MDEV-6429). plugin.h should not be
modified in a GA version and with an API that has not been thought through
wrt. all engines. Also, the API should be a service, not in plugin.h.
> Why you need to introduce a new global structure for this, can't this be
> part
> of some already existing system structure e.g. lock_sys and why you do not
> use ut_list ?
>
> +/* Buffer to collect THDs to report waits for. */
> +struct thd_wait_reports {
> + struct thd_wait_reports *next;
> + ulint used;
> + trx_t *waitees[64];
> +};
So the idea here is to collect a bunch of trx pointers during the graph
traversal, and pass each of them to thd_report_wait_for(). I wanted to pass
them after completing the graph traversal to avoid issues if
thd_report_wait_for() needs to kill one of them; this goes to
lock_cancel_waiting_and_release() eventually which I assume might modify the
graph being traversed.
This is time-critical code (it runs under global lock_sys->mutex), so it is
important to avoid memory allocations. So I use an array allocated on the
stack, this is really short-lived data, it only lives for the duration of
lock_deadlock_check_and_resolve(). In all normal cases the single array should
be sufficient, but the code needs to handle the overflow also, of course.
So in case of overflow, I malloc further arrays and link them into the
first. I suppose another solution would be to abort in case of overflow and
roll back the transaction like in case of ctx->too_deep==true, however I don't
like to introduce such arbitrary limits when it's not necessary.
I looked into using ib_list, but this seems to always do mallocs, which I want
to avoid. I also looked at ut0vec and dyn0dyn, but they also seem to always
use malloc and thus do not seem appropriate. So I did not find anything
existing I could use, unfortunately, but then I know little of the InnoDB
code.
Do you have a suggestion for an existing data structure to use that can take
an inital buffer on the stack and malloc extra memory only if needed?
Thanks for comments!
- Kristian.