maria-developers team mailing list archive
Message #03562
Re: Architecture review of MWL#132 Transaction coordinator plugin
Hi, Kristian!
Now, MWL#132 - Transaction coordinator plugin
> ============= High-Level Specification
...
> In current MariaDB, we have two different TC implementations (as well
> as a "dummy" empty implementation; I do not know whether it is used).
The code in mysqld.cc is

  tc_log= (total_ha_2pc > 1 ? (opt_bin_log ?
                               (TC_LOG *) &mysql_bin_log :
                               (TC_LOG *) &tc_log_mmap) :
           (TC_LOG *) &tc_log_dummy);

So tc_log_dummy is selected when there is at most one xa-capable engine.
But MySQL does not use 2PC for a transaction unless it has at least two
xa-capable participants. In other words, tc_log_dummy is never used.
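
To make the selection rule easier to check, here is a minimal standalone
sketch. The enum TcKind and the function choose_tc() are hypothetical
stand-ins, not code from mysqld.cc; only the condition itself mirrors the
snippet above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the three TC_LOG objects in the server.
enum class TcKind { Dummy, Mmap, Binlog };

// Mirrors the ternary from mysqld.cc: total_ha_2pc is the number of
// xa-capable participants, bin_log_enabled stands in for opt_bin_log.
TcKind choose_tc(unsigned total_ha_2pc, bool bin_log_enabled)
{
  if (total_ha_2pc > 1)
    return bin_log_enabled ? TcKind::Binlog : TcKind::Mmap;
  return TcKind::Dummy;  // at most one xa-capable participant
}
```

With this rule, Dummy is picked only when total_ha_2pc is 0 or 1, which
is exactly the case where no 2PC is attempted.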
> Binary log
> ----------
>
> The binary log implements also a "fake" storage engine, mainly to hook
> into the commit (and prepare) phase of transaction processing. This is
> mainly used for statements in non-transactional engines, which are
> "committed" and written to the binary log outside of the TC and
> log_xid() framework.
No, this is used to make the number of xa-capable transaction
participants more than one and to force MySQL to use 2PC.
> TC interface subclasses
> -----------------------
>
> The MWL#116 has two different algorithms for handling commit order and
> invoking prepare_ordered() and commit_ordered() handler methods:
>
> - One used with TC_MMAP, which needs no correspondence between
> engines and TC. This uses the existing log_xid() interface.
>
> - One used with the binary log TC, which ensures same commit order in
> engines and binary log, and which uses a new single-threaded
> group_log_xid() TC interface to efficiently do group commit.
>
> In the prototype patch for MWL#116, these two methods are mixed with
> each other in the function ha_commit_trans(), and the logic is quite
> complex. Using the log_and_order() TC generalisation provides a nice
> cleanup of this.
>
> We implement two subclasses of the TC interface:
>
> - One class TC_LOG_unordered for the method used with TC_MMAP. This
> implements the old log_xid() interface.
>
> - One class TC_LOG_group_commit for the method used for the binary
> log. This implements the new group_log_xid() interface.
>
> Each subclass implements the corresponding algorithm for invoking
> prepare_ordered() and commit_ordered(), using the same mechanisms as
> in MWL#116, but implemented in a cleaner way. The ha_commit_trans()
> function then has no details about prepare_ordered() or
> commit_ordered(), it just calls into tc_log->log_and_order(), which
> handles the necessary details.
>
> Thus a simple TC plugin similar to the binary log or TC_MMAP can
> implement one of the simple interfaces log_xid() or group_log_xid(),
> without having to worry about prepare_ordered() and commit_ordered().
> But a plugin like Galera that needs to do more can implement the more
> general interface.
I still see no real value in keeping or supporting the log_xid()
interface. I think we can implement only one interface - group_log_xid() -
and that's enough.
> ============= Low-Level Design
...
> log_and_order()
> Requests a decision to commit (non-zero return) or rollback (zero
> return) of the transaction. At this point, the transaction has
> been successfully prepared in all engines.
>
> The method must call run_prepare_ordered(), in a way so that calls
> in different threads happen in the order that the transactions are
> committed. This call must be protected by the global
> LOCK_prepare_ordered mutex.
>
> The method must then call run_commit_ordered(), protected by
> LOCK_commit_ordered, again so that different threads are called in
> the order that transactions are committed.
>
> The idea with prepare_ordered() is to call it as early as possible
> after commit order has been decided, for example to release locks
> early. In particular, a transaction can still be rolled back after
> prepare_ordered() (for example in case of a crash). In contrast,
> commit_ordered() may only be called after the transaction is
> durably committed in the TC.
>
> If need_prepare_ordered or need_commit_ordered is passed as FALSE,
> then the corresponding call need not be done. It is safe to do it
> anyway, however omitting it avoids the need to take a global
> mutex.
Why would this ever be needed?
(I mean need_prepare_ordered or need_commit_ordered being FALSE.)
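
For reference, the contract quoted above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the Txn
struct, write_durably() and the trace vector are hypothetical, and the
std::mutex objects merely stand in for the server's global mutexes. The
sketch shows the locking and the two flags only; it does not enforce that
threads reach the commit_ordered step in the same order as the
prepare_ordered step, which a real TC would have to guarantee.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical transaction record; the real interface passes THD and xid.
struct Txn {
  unsigned long xid;
  std::vector<std::string> trace;  // records which steps ran
};

// Stand-ins for the global mutexes named in the quoted design.
std::mutex LOCK_prepare_ordered;
std::mutex LOCK_commit_ordered;

void run_prepare_ordered(Txn &t) { t.trace.push_back("prepare_ordered"); }
void run_commit_ordered(Txn &t)  { t.trace.push_back("commit_ordered"); }

// Hypothetical: make the commit decision durable in the TC.
bool write_durably(Txn &t) { t.trace.push_back("log"); return true; }

// Returns non-zero to commit and zero to roll back, as in the quoted text.
int log_and_order(Txn &t, bool need_prepare_ordered, bool need_commit_ordered)
{
  if (need_prepare_ordered) {
    std::lock_guard<std::mutex> guard(LOCK_prepare_ordered);
    run_prepare_ordered(t);
  }
  if (!write_durably(t))
    return 0;
  if (need_commit_ordered) {
    // Only reached after the transaction is durable in the TC.
    std::lock_guard<std::mutex> guard(LOCK_commit_ordered);
    run_commit_ordered(t);
  }
  return 1;
}
```

When both flags are FALSE, neither global mutex is taken, which is the
saving the quoted text refers to.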
...
> A TC based on this interface overrides group_log_xid() and
> xid_log_after() instead of log_and_order(), and again does not need to
> deal with any {prepare,commit}_ordered().
Why do you need xid_log_after() here?
General comment:
Wouldn't it be simpler to create only a group_log_xid() interface, with
no log_and_order() or log_xid()? The TC plugin gets the list in
group_log_xid() - it can reorder the list any way it wants, call
prepare_ordered() and commit_ordered() as needed, and so on.
In this interpretation, group_log_xid() can meet all the use cases.
And there's no need to create a multitude of methods that one
needs to get familiar with before implementing a TC plugin.
Regards,
Sergei
P.S. Minor detail - there could be helper functions like
iterate_the_list_and_call_prepare_ordered(), that the plugin can use.
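
A rough sketch of what such a single-method interface with helpers might
look like is given below. Everything in it is an assumption made for
illustration: the Txn struct, the TcPlugin and SimpleTc classes, the
helper names, and the return convention (non-zero means commit, borrowed
from the quoted log_and_order() description). It is not the actual plugin
API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical queue entry; the real interface would pass THD pointers.
struct Txn {
  unsigned long xid;
  std::vector<std::string> trace;  // records which hooks ran
};

// Hypothetical helpers of the kind suggested in the P.S.
void call_prepare_ordered_for_all(std::vector<Txn*> &list)
{
  for (Txn *t : list) t->trace.push_back("prepare_ordered");
}

void call_commit_ordered_for_all(std::vector<Txn*> &list)
{
  for (Txn *t : list) t->trace.push_back("commit_ordered");
}

// The only method a TC plugin would have to implement in this sketch.
struct TcPlugin {
  virtual ~TcPlugin() = default;
  virtual int group_log_xid(std::vector<Txn*> &list) = 0;
};

// Example plugin: keeps the list order and logs the whole group at once.
struct SimpleTc : TcPlugin {
  std::vector<unsigned long> log;  // stand-in for durable storage

  int group_log_xid(std::vector<Txn*> &list) override
  {
    call_prepare_ordered_for_all(list);
    for (Txn *t : list) log.push_back(t->xid);  // one "write" per group
    call_commit_ordered_for_all(list);
    return 1;  // assumed convention: non-zero means commit
  }
};
```

A plugin that needs a different commit order could reorder the list
before calling the helpers, or skip them and invoke the hooks itself.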