
maria-developers team mailing list archive

Re: e92037989f7: MDEV-21117: refine the server binlog-based recovery for semisync

 

Hi, Andrei!

On Jun 01, Andrei Elkin wrote:
> >>>> 
> >>>> It actually turns out not to be easy. A sequence of execution events
> >>>> 
> >>>>   E1.prepare(trx), E2.prepare(trx), E1.commit(trx), *crash*
> >>>> 
> >>>> does not make E1, such as InnoDB, commit persistently to its file.
> >>>> At recovery, trx may be found in the prepared state in E1, unless
> >>>> before the crash I'd do something like a Binlog CheckPoint (BCP)
> >>>> notification request and wait for it.
> >>>
> The committed state is done asynchronously (by virtue of MDEV-232 that
> implements a part of BCP).

Oh. Indeed. Sorry, I've missed that.

Still a prepare of another transaction can force this commit to disk, as
far as I understand. So what about this:

  you stop the thread on a debug sync after the first commit

  then another connection starts committing in InnoDB, it stops on a
  debug sync after the prepare - before anything is written into binlog

  then you crash.

looks like it'll create the case you need
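
To make the idea concrete, here is a toy Python model of that sequence.
It is only a sketch, not server code: the class and function names are
made up, and it assumes that a commit only buffers its redo record while
a prepare flushes everything buffered so far, which is how I understand
it but have not verified.

```python
# Toy model only: all names are hypothetical, this is not InnoDB code.
class ToyEngine:
    def __init__(self):
        self.log_buffer = []   # redo records not yet on disk
        self.durable_log = []  # redo records that reached the disk

    def prepare(self, trx):
        # Assumption: prepare must be durable, so it flushes the whole
        # buffer, including earlier commit records still sitting in it.
        self.log_buffer.append((trx, "prepared"))
        self.flush()

    def commit(self, trx):
        # Assumption: commit is lazy, the record is only buffered.
        self.log_buffer.append((trx, "committed"))

    def flush(self):
        self.durable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def state_after_crash(self, trx):
        # Recovery sees only the durable log; the last record wins.
        state = None
        for logged_trx, logged_state in self.durable_log:
            if logged_trx == trx:
                state = logged_state
        return state


def crash_without_second_trx():
    # E1.prepare(trx), E2.prepare(trx), E1.commit(trx), *crash*
    e1, e2 = ToyEngine(), ToyEngine()
    e1.prepare("trx1")
    e2.prepare("trx1")
    e1.commit("trx1")
    return e1.state_after_crash("trx1"), e2.state_after_crash("trx1")


def crash_after_second_prepare():
    # Same sequence, but another connection prepares trx2 in E1
    # before the crash, which flushes the commit record of trx1.
    e1, e2 = ToyEngine(), ToyEngine()
    e1.prepare("trx1")
    e2.prepare("trx1")
    e1.commit("trx1")
    e1.prepare("trx2")
    return e1.state_after_crash("trx1"), e2.state_after_crash("trx1")


if __name__ == "__main__":
    print(crash_without_second_trx())
    print(crash_after_second_prepare())
```

In this model the first function leaves trx1 prepared in both engines,
while the second one leaves it committed in E1 and prepared in E2, the
partially committed state the test is after.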

> >> No doubt it's technically possible to wait for the flushing and
> >> crash, but still does not look easy.
> >> Consider if we defer this sort of mtr test to MDEV-18959?
> 
> Sujatha just contributed `74a13b4de2` with an mtr test that crashes
> after the asynchronous fsync is done to the partially (2 engine case)
> committed state.
...
> Also, now that `74a13b4de2` provides a method to crash after the
> "commit" fsync, we might also employ it for RQG.

No, this is very different. This is a crash happening where Sujatha
expects it to happen. There's no value in that now; it's needed for
regression testing, to make sure the recovery won't be broken in the
future.

But it makes no sense to keep testing now how recovery works in case of
a crash happening at exactly the place where Sujatha wanted it to happen,
after she verified that a crash at exactly that line does not break
the recovery. The whole point of RQG tests is to try crashes where you
did not expect them to happen. Adding DBUG_EXECUTE_IF() doesn't help
this goal at all.

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx

