
maria-developers team mailing list archive

Re: a question about group commit

 

"nanyi607rao" <nanyi607rao@xxxxxxxxx> writes:

> as we know there are 3 steps in XA transaction committing
> 1, prepare step
> 2, write binary log
> 3, commit step in engines
>
> All of these steps need an fsync(). The group commit strategy can make a group of transactions durable with one fsync() at step 2 and step 3, which can lead to a dramatic performance enhancement.
>
> But in step 1, each transaction still does its own fsync(). So why not make several transactions durable with one fsync() in the prepare step, just like steps 2 and 3, which I think could improve performance even further?

Actually, this is already implemented.

Further, in MariaDB 10.0, there is no fsync() needed in step 3. This is
because in case of a crash, XA crash recovery can repeat step 3 using the
information saved in steps 1 and 2. So in 10.0, we only need one shared
fsync() in step 1 plus one shared fsync() in step 2.
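
To illustrate why step 3 does not need to be durable, here is a minimal
sketch of the recovery decision. It is not MariaDB's actual code: the
function name and the plain lists of transaction ids are hypothetical, and
the rollback of prepared transactions that are missing from the binary log is
my assumption about the usual two-phase recovery rule, not something spelled
out above.

```python
def recover_prepared(prepared_xids, binlog_xids):
    """Decide the fate of every transaction found in the prepared state.

    prepared_xids: ids the engine made durable in step 1 (hypothetical input).
    binlog_xids:   ids found in the binary log written in step 2.
    """
    in_binlog = set(binlog_xids)
    # Prepared and logged: step 3 was (or may have been) lost, so repeat it.
    to_commit = [xid for xid in prepared_xids if xid in in_binlog]
    # Prepared but never logged: assumed to be rolled back.
    to_rollback = [xid for xid in prepared_xids if xid not in in_binlog]
    return to_commit, to_rollback


to_commit, to_rollback = recover_prepared(["t1", "t2", "t3"], ["t1", "t3"])
print("commit again:", to_commit)
print("roll back:", to_rollback)
```

Since everything needed for this decision is already on disk after steps 1
and 2, losing the commit record of step 3 in a crash costs nothing.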

If you look in the InnoDB/XtraDB code, you can see this. The prepare step
calls trx_prepare_for_mysql() in trx/trx0trx.cc. This calls trx_prepare()
which goes to trx_flush_log_if_needed_low() and calls log_write_up_to() in
log/log0log.cc. And in log_write_up_to(), you will see the group commit
logic. The transaction will wait for any previous fsync() to complete; then
if it still needs the fsync(), it will fsync() not just for itself, but also
for any other transactions that are waiting for fsync().
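
The wait-then-flush-for-everyone logic described above can be sketched
roughly as follows. This is a toy model, not the code in log/log0log.cc: the
class GroupCommitLog, its fields, and the counter standing in for a real
fsync() are all made up for illustration.

```python
import threading


class GroupCommitLog:
    """Toy redo log with group commit (illustrative only, not InnoDB code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._flush_done = threading.Condition(self._lock)
        self._flush_in_progress = False
        self.written_lsn = 0   # end of the log records appended so far
        self.flushed_lsn = 0   # everything up to here is durable
        self.fsync_calls = 0   # stand-in for real fsync() calls

    def append(self, nbytes):
        """Append a log record and return the LSN its owner must wait for."""
        with self._lock:
            self.written_lsn += nbytes
            return self.written_lsn

    def write_up_to(self, lsn):
        """Make the log durable up to lsn; return True if we did the fsync."""
        with self._lock:
            # Wait for any previous fsync to complete.
            while self._flush_in_progress:
                self._flush_done.wait()
            # That fsync may already have covered our records.
            if self.flushed_lsn >= lsn:
                return False
            # Become the flusher, and cover everyone appended so far.
            self._flush_in_progress = True
            target = self.written_lsn
        self.fsync_calls += 1  # the expensive call happens outside the lock
        with self._lock:
            self.flushed_lsn = target
            self._flush_in_progress = False
            self._flush_done.notify_all()
        return True


log = GroupCommitLog()
lsns = [log.append(n) for n in (100, 50, 25)]   # three prepared transactions
did_fsync = [log.write_up_to(lsn) for lsn in lsns]
print(did_fsync, log.fsync_calls)
```

In this single-threaded run the first transaction flushes on behalf of all
three, so only one fsync is counted. With real threads, the waiters that wake
up after a flush see that flushed_lsn already covers them and return without
their own fsync.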

There is some description of the removal of fsync() in step 3 here:

    http://kristiannielsen.livejournal.com/16382.html

However, the group commit in step 1 has been in the InnoDB code for many
years, as far as I know.

Hope this helps,

 - Kristian.

