maria-developers team mailing list archive
Re: Transactions behind a failed transaction could commit in parallel replication
"nanyi607rao" <nanyi607rao@xxxxxxxxx> writes:
> if (unlikely(entry->stop_on_error_sub_id <= rgi->wait_commit_sub_id))
>   skip_event_group= true;
> This code can tell later transactions to skip, but it cannot tell them to roll back. If a transaction started committing before an earlier transaction failed (for example, a lock timeout for an unknown reason), the committing transaction will not be affected by stop_on_error_sub_id.
> The failed transaction should then wake up the later committing transactions and tell them to roll back; unfortunately it does not. The code looks like:
> if (!rgi->is_error && !skip_event_group)
>   err= rpt_handle_event(events, rpt);
> err= thd->wait_for_prior_commit();
> ... ...
> finish_event_group(thd, err, event_gtid_sub_id, entry, rgi);
> If the failed transaction did not fail at its end event, the value of err comes from wait_for_prior_commit; err is 0 if the preceding transaction succeeded, so the failed transaction tells later transactions in finish_event_group that it is OK to commit.
Ah, I see, thanks for the detailed analysis!
Right, so I will look into this and get it fixed.
Maybe all that is needed is to remember the error code when rgi->is_error is
set, and use the real error code to pass to finish_event_group() - so that it
will pass the error to wakeup_subsequent_commits() and the following
transactions will roll back.
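
To make the idea concrete, here is a minimal standalone sketch of it, not the actual server code. GroupInfo is a hypothetical stand-in for rgi, and saved_error, mark_error() and the two finish_error_*() helpers are names invented for illustration; only is_error appears in the code quoted above. Letting a non-zero wait result take priority over the saved error is also just an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-event-group state (rgi in the thread).
// Only is_error comes from the quoted code; saved_error is an assumed field.
struct GroupInfo {
    bool is_error = false;
    int saved_error = 0;  // error code remembered when is_error was set
};

// Assumed helper: record the failure together with its real error code.
void mark_error(GroupInfo &rgi, int code) {
    assert(code != 0);  // 0 means "no error" throughout this sketch
    rgi.is_error = true;
    rgi.saved_error = code;
}

// Behaviour described in the report: only the result of waiting for the
// prior commit is forwarded, so an earlier failure in this group is lost.
int finish_error_reported(const GroupInfo &, int wait_err) {
    return wait_err;
}

// Proposed behaviour: if waiting succeeded but this group had already
// failed, forward the remembered error so later transactions roll back.
int finish_error_proposed(const GroupInfo &rgi, int wait_err) {
    if (wait_err != 0)
        return wait_err;
    return rgi.is_error ? rgi.saved_error : 0;
}
```

With a group marked as failed and a wait result of 0, finish_error_reported() returns 0 (later transactions would be told it is OK to commit), while finish_error_proposed() returns the saved code, which is the value that would then be passed on to finish_event_group().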
This error handling code is quite tricky, I hope we can get it right. It is
very helpful to get this kind of report, thanks again!