maria-developers team mailing list archive

Thread
Date

Re: MDEV-232: Remove one fsync() from commit phase

To: Sergei Golubchik <serg@xxxxxxxxxxxx>
From: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
Date: Mon, 03 Sep 2012 14:40:10 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20120825054003.GA30257@meddwl.fritz.box> (Sergei Golubchik's message of "Sat, 25 Aug 2012 07:40:03 +0200")
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)

Sergei Golubchik <serg@xxxxxxxxxxxx> writes:

> On Aug 17, Kristian Nielsen wrote:

>> Do you want to see the changes for when I implement async
>> commit_checkpoint_request() in InnoDB/XtraDB, before I push it to 10.0?

> Yes, but it can be a separate patch.

Ok, will do.

>>   "Controls the durability/speed trade-off for commits."
>>   " Set to 0 (write and flush redo log to disk only once per second),"
>>   " 1 (flush to disk at each commit),"
>>   " 2 (write to kernel at commit but flush to disk only once per second)"
>
> I find "write to kernel" very confusing.
> May be "write to log" ?

Ok.

>>   " or 3 (flush to disk twice per commit, usually redundant)."
>
> I'd like "flush at prepare and at commit, usually redundant",
> because otherwise one may ask why to flush twice at commit. It's easy to
> overlook the prepare step and miss the difference between "twice PER
> commit" and "twice AT commit" (especially as you write "AT commit"
> above, for =1).

Ok. Thanks for the suggestions!

>> As you say, this should be easy. I need to store a list of cookies and their
>> associated LSN (not just a single cookie), but this is infrequently executed
>> so just a simple linked list with malloc/free of elements will work fine.
>
> I thought you don't need a list. But simply
>
>   log_sys->cookie= cookie;
>
> in innobase_checkpoint_request() and
>
>   if (log_sys->cookie)
>   {
>     ha_commit_checkpoint_notify(log_sys->cookie);
>     log_sys->cookie= 0;
>   }
>
> in log_buffer_sync_in_background().
>
> There can never be two cookies active, right?

Yes, there can. Why not?

Checkpointing is now asynchronous, there is nothing that prevents a new log
rotate and a new checkpoint request to happen before the old one finished. In
fact, this case is tested in binlog_checkpoint.test.

Of course in practise, this is unlikely to happen. It requires that the time
between binlog rotations is smaller than the time to complete a checkpoint.

>> We cannot call mark_xid_done() here, as we are holding LOCK_log - and
>> mark_xid_done() also takes (sometimes) LOCK_log.
>
> Ok, so here you only decrement the counter, and mark_xid_done()
> additionally might log a checkpoint event. It looks like it might sense
> to extend mark_xid_done() with a second argument
>
> -TC_LOG_BINLOG::mark_xid_done(ulong cookie)
> +TC_LOG_BINLOG::mark_xid_done(ulong cookie, bool write_checkpoint)
>
> then you'll be able to use it here and avoid code duplication.

Ok, done.

Thanks!

 - Kristian.

Follow ups

Aria Memory Only Tables
From: md@xxxxxxxxxxxxx, 2012-09-08
IDE debugging Mariadb linux build
From: md@xxxxxxxxxxxxx, 2012-09-07

References

Re: [knielsen@xxxxxxxxxxxxxxx] [Commits] Rev 3459: MDEV-232: Remove one fsync() from commit phase. in http://bazaar.launchpad.net/~maria-captains/maria/5.5
From: Kristian Nielsen, 2012-08-17
Re: MDEV-232: Remove one fsync() from commit phase
From: Sergei Golubchik, 2012-08-25