maria-developers team mailing list archive
Message #03573
Re: Architecture review of MWL#116 "Efficient group commit for binary log"
Sergei Golubchik <serg@xxxxxxxxxxxx> writes:
>> So now the algorithm is something like this:
> Where in this algorithm you call ht->commit_ordered() ?
Oops, I forgot, sorry!
It should be at the start of "for thd2 in <queue>" (before the wakeup).
>>   thd->ready= false
>>   lock(LOCK_prepare_ordered)
>>   old_queue= group_commit_queue
>>   thd->next= old_queue
>>   group_commit_queue= thd
>>   ht->prepare_ordered()
>>   unlock(LOCK_prepare_ordered)
>>
>>   if (old_queue == NULL) // leader?
>>     lock(LOCK_group_commit)
>>
>>     lock(LOCK_prepare_ordered)
>>     queue= reverse(group_commit_queue)
>>     group_commit_queue= NULL
>>     unlock(LOCK_prepare_ordered)
>>
>>     group_log_xid(queue)
>>
>>     lock(LOCK_commit_ordered) // but see below
>>     unlock(LOCK_group_commit)
>>     for thd2 in <queue>
Here: ht->commit_ordered(thd2)
>>       lock(thd2->LOCK_wakeup)
>>       thd2->ready= true
>>       signal(thd2->COND_wakeup)
>>       unlock(thd2->LOCK_wakeup)
>>     unlock(LOCK_commit_ordered)
>>   else
>>     lock (thd->LOCK_wakeup)
>>     while (!thd->ready)
>>       wait(COND_wakeup, LOCK_wakeup)
>>     unlock (thd->LOCK_wakeup)
>>
>>   cookie= xid_log_after()
>
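To make the control flow above easier to check, here is a rough, runnable
sketch of it in Python. It is only a model under assumptions: the Thd and
GroupCommit classes, the prepare_order/commit_order/groups lists, and
run_demo() are hypothetical names made up for illustration, and the engine
and binlog calls are replaced by appending to lists. It is not the server
code.

```python
import threading

class Thd:
    """Hypothetical stand-in for the per-transaction THD state."""
    def __init__(self, name):
        self.name = name
        self.next = None
        self.ready = False
        self.wakeup = threading.Condition()  # models LOCK_wakeup + COND_wakeup

class GroupCommit:
    def __init__(self):
        self.lock_prepare_ordered = threading.Lock()
        self.lock_group_commit = threading.Lock()
        self.lock_commit_ordered = threading.Lock()
        self.group_commit_queue = None  # head of linked list, newest first
        self.prepare_order = []         # order of prepare_ordered() calls
        self.commit_order = []          # order of commit_ordered() calls
        self.groups = []                # one list per group_log_xid() call

    @staticmethod
    def _reverse(head):
        # Turn the newest-first linked list into an oldest-first Python list.
        out = []
        while head is not None:
            out.append(head)
            head = head.next
        out.reverse()
        return out

    def commit(self, thd):
        thd.ready = False
        with self.lock_prepare_ordered:
            old_queue = self.group_commit_queue
            thd.next = old_queue
            self.group_commit_queue = thd
            self.prepare_order.append(thd.name)  # stands in for ht->prepare_ordered()

        if old_queue is None:  # leader
            self.lock_group_commit.acquire()
            with self.lock_prepare_ordered:
                queue = self._reverse(self.group_commit_queue)
                self.group_commit_queue = None
            self.groups.append([t.name for t in queue])  # stands in for group_log_xid(queue)
            self.lock_commit_ordered.acquire()
            self.lock_group_commit.release()
            for thd2 in queue:
                self.commit_order.append(thd2.name)  # stands in for ht->commit_ordered(thd2)
                with thd2.wakeup:
                    thd2.ready = True
                    thd2.wakeup.notify()
            self.lock_commit_ordered.release()
        else:  # follower: wait for the leader to wake us up
            with thd.wakeup:
                while not thd.ready:
                    thd.wakeup.wait()
        # cookie = xid_log_after() would follow here; omitted in this model.

def run_demo(n):
    """Commit n transactions concurrently and return the recorded orders."""
    gc = GroupCommit()
    threads = [threading.Thread(target=gc.commit, args=(Thd(f"t{i}"),))
               for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gc
```

In this model the order of the commit_ordered() stand-ins always matches the
order of the prepare_ordered() stand-ins, however the threads happen to be
split into groups; run_demo(8) can be used to check that.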
>> On the other hand, the algorithm I suggested earlier for START
>> TRANSACTION WITH CONSISTENT SNAPSHOT used the LOCK_commit_ordered, and
>> there might be other uses...
> START TRANSACTION WITH CONSISTENT SNAPSHOT is a good reason to keep the
> mutex.
Yes, probably.
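As a rough illustration of why the mutex helps here, the following Python
sketch assumes that the snapshot is taken in every engine while holding
LOCK_commit_ordered, so that no commit_ordered() call can slip in between
two engines. That is my reading of the idea and not a description of actual
code; the two "engines" are just lists, all names are hypothetical, and for
brevity each commit takes the lock itself instead of the group leader
holding it for the whole group.

```python
import threading

def snapshot_demo(n_commits=50, n_snapshots=20):
    lock_commit_ordered = threading.Lock()
    # Hypothetical engines: each one is just the list of committed txn ids.
    engines = {"engine_a": [], "engine_b": []}
    snapshots = []

    def commit_ordered(txn):
        # Make the transaction visible in every engine under the mutex.
        with lock_commit_ordered:
            for committed in engines.values():
                committed.append(txn)

    def consistent_snapshot():
        # Taking the same mutex means no commit can become visible in one
        # engine but not yet in the other while we copy their state.
        with lock_commit_ordered:
            snapshots.append({name: tuple(c) for name, c in engines.items()})

    threads = [threading.Thread(target=commit_ordered, args=(i,))
               for i in range(n_commits)]
    threads += [threading.Thread(target=consistent_snapshot)
                for _ in range(n_snapshots)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return snapshots
```

Every snapshot returned by snapshot_demo() sees the same set of committed
transactions in both engines, which is the property the mutex is kept for.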
>> But I choose to do it earlier, as soon as the transaction is put in
>> the queue and commit order thereby defined.
>>
>> There can be quite a "long" time interval between these two events:
>> the time it takes for the previous group_log_xid() (eg. an fsync()),
>> plus sometimes one wants to add extra sleeps in group commit to group
>> more transactions together.
>
> No.
> The long interval is *inside* the group_log_xid(), while you call
> prepare_ordered() *before* it.
Right, that is what I meant. One group of transactions executes the long
interval inside group_log_xid(). While this happens, new transactions that
want to commit queue up waiting for the first group to finish. The first
waiting transaction (the new leader) blocks on LOCK_group_commit; all the
others wait for the new leader to wake them up. So I want to call
prepare_ordered() before blocking on LOCK_group_commit, as that mutex is held
for the duration of group_log_xid().
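The timing argument can be demonstrated with a small Python model. This is
only a sketch under assumptions: pipelining_demo() and everything inside it
are hypothetical names, the slow fsync of the first group is modelled by an
Event that the test driver releases, the queue is a plain list, and the
follower wakeup is left out because it does not matter for this point.

```python
import threading

def pipelining_demo():
    lock_prepare_ordered = threading.Lock()
    lock_group_commit = threading.Lock()
    group_commit_queue = []                  # plain list instead of linked list
    prepared = []                            # prepare_ordered() call order
    logged_groups = []                       # one entry per group_log_xid() call
    first_group_logging = threading.Event()  # first leader is inside group_log_xid()
    slow_fsync_done = threading.Event()      # models the long fsync finishing
    prepare_calls = threading.Semaphore(0)   # counts prepare_ordered() calls

    def commit(name):
        with lock_prepare_ordered:
            is_leader = not group_commit_queue
            group_commit_queue.append(name)
            prepared.append(name)            # stands in for ht->prepare_ordered()
        prepare_calls.release()
        if is_leader:
            with lock_group_commit:          # a new leader blocks here
                with lock_prepare_ordered:
                    queue = list(group_commit_queue)
                    group_commit_queue.clear()
                if not logged_groups:        # only the first group is "slow"
                    first_group_logging.set()
                    slow_fsync_done.wait()
                logged_groups.append(queue)  # stands in for group_log_xid(queue)

    first = threading.Thread(target=commit, args=("t1",))
    first.start()
    first_group_logging.wait()               # t1 now holds LOCK_group_commit
    others = [threading.Thread(target=commit, args=(n,))
              for n in ("t2", "t3", "t4")]
    for t in others:
        t.start()
    for _ in range(4):                       # t1 plus the three newcomers
        prepare_calls.acquire()
    # Both copies are taken while the first group is still in its "fsync".
    prepared_during_fsync = list(prepared)
    groups_during_fsync = list(logged_groups)
    slow_fsync_done.set()
    for t in [first] + others:
        t.join()
    return prepared_during_fsync, groups_during_fsync, logged_groups
```

In this model all three later transactions have already had their
prepare_ordered() stand-in called while the first group is still logging,
and they then go into the binlog together as the second group.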
> But anyway, the LOCK_prepare_ordered mutex is not going to be contended,
> so removing it by using a lock-free queue (that's what this second
> approach is about) will not bring any noticeable benefits.
Very true.
> It's reasonable to say that if an engine does not implement
> commit_ordered() then it needs to take care of its own recovery and
> fsync both in prepare and commit.
Yes, sounds reasonable.
Thanks!
- Kristian.