← Back to team overview

maria-developers team mailing list archive

Re: MDEV-232: Remove one fsync() from commit phase

 

Sergei Golubchik <serg@xxxxxxxxxxxx> writes:

> On Aug 17, Kristian Nielsen wrote:

>> Do you want to see the changes for when I implement async
>> commit_checkpoint_request() in InnoDB/XtraDB, before I push it to 10.0?

> Yes, but it can be a separate patch.

Ok, will do.

>>   "Controls the durability/speed trade-off for commits."
>>   " Set to 0 (write and flush redo log to disk only once per second),"
>>   " 1 (flush to disk at each commit),"
>>   " 2 (write to kernel at commit but flush to disk only once per second)"
>
> I find "write to kernel" very confusing.
> May be "write to log" ?

Ok.

>>   " or 3 (flush to disk twice per commit, usually redundant)."
>
> I'd like "flush at prepare and at commit, usually redundant",
> because otherwise one may ask why to flush twice at commit. It's easy to
> overlook the prepare step and miss the difference between "twice PER
> commit" and "twice AT commit" (especially as you write "AT commit"
> above, for =1).

Ok. Thanks for the suggestions!

>> As you say, this should be easy. I need to store a list of cookies and their
>> associated LSN (not just a single cookie), but this is infrequently executed
>> so just a simple linked list with malloc/free of elements will work fine.
>
> I thought you don't need a list. But simply
>
>   log_sys->cookie= cookie;
>
> in innobase_checkpoint_request() and
>
>   if (log_sys->cookie)
>   {
>     ha_commit_checkpoint_notify(log_sys->cookie);
>     log_sys->cookie= 0;
>   }
>
> in log_buffer_sync_in_background().
>
> There can never be two cookies active, right?

Yes, there can. Why not?

Checkpointing is now asynchronous, there is nothing that prevents a new log
rotate and a new checkpoint request to happen before the old one finished. In
fact, this case is tested in binlog_checkpoint.test.

Of course in practise, this is unlikely to happen. It requires that the time
between binlog rotations is smaller than the time to complete a checkpoint.

>> We cannot call mark_xid_done() here, as we are holding LOCK_log - and
>> mark_xid_done() also takes (sometimes) LOCK_log.
>
> Ok, so here you only decrement the counter, and mark_xid_done()
> additionally might log a checkpoint event. It looks like it might sense
> to extend mark_xid_done() with a second argument
>
> -TC_LOG_BINLOG::mark_xid_done(ulong cookie)
> +TC_LOG_BINLOG::mark_xid_done(ulong cookie, bool write_checkpoint)
>
> then you'll be able to use it here and avoid code duplication.

Ok, done.

Thanks!

 - Kristian.


Follow ups

References