← Back to team overview

maria-developers team mailing list archive

Re: Questions re MDEV-4736 and MDEV-4739 (was Re: Spider's installation sql file)


Hi, Sergey!

On Oct 04, Sergey Vojtovich wrote:
> Hi Kentoku,
> I just reviewed one of your revisions, specifically
> bzr diff -c3829 lp:~kentokushiba/maria/10.0.4-spider-3.0/
> I believe things are a bit more complex: 2PC protocol doesn't seem to permit
> cohorts to fail during commit phase:
> http://en.wikipedia.org/wiki/Two-phase_commit_protocol#Commit_phase
> <quot>
> If the coordinator received an agreement message from all cohorts during the
> commit-request phase:
>   1. The coordinator sends a commit message to all the cohorts.
>   2. Each cohort completes the operation, and releases all the locks and
>      resources held during the transaction.
>   3. Each cohort sends an acknowledgment to the coordinator.
>   4. The coordinator completes the transaction when all acknowledgments have
>      been received.
> </quot>
> I read the above as: the only problem coordinator may experience is missing
> acknowledgement. What shall coordinator do if some cohorts acknowledged
> commit, but some did not? Probably spider should detect it earlier?
> Sergei, what's your opinion?

Let me see, if I understood the problem correctly.
The crash happens because spider uses my_error() in the 2pc commit step,
and the error status is lost up the stack, so Diagnostic_area::ok()
fires an asserts on redefining the statement status. Is that right?

The server should know that the error has happened on commit and should
not trigger an assert, it should report the error to the user.
The error at the commit step should normally never happen, it means
inconsistent data, because some participants might've already committed
the transaction and they cannot roll it back anymore. Still, the commit
method *might* return an error status and we shouldn't ignore it.
Hardware failures are a good example of what can cause a commit error.

Anyway, Spider should be fixed to not error out in 2pc commits, because
such a commit means inconsistent data, it's a bad error, it breaks ACID.
An engine is expected to check all preconditions during prepare, and if
prepare succeeds, it is basically a guarantee that the commit will
succeed, it is not allowed to fail anymore.


Follow ups