maria-developers team mailing list archive

Re: Implementing new "group commit" API in PBXT?

 

Paul McCullagh <paul.mccullagh@xxxxxxxxxxxxx> writes:

> The easiest way to do this would be to add a parameter to
> xn_end_xact() that indicates that the log should not be written or
> flushed.

Ok, I gave it a shot, but I had some problems due to not knowing the PBXT code
sufficiently ...

> In xn_end_xact(), the last parameter to the call to xt_xlog_log_data()
> determines what should happen:
>
> #define XT_XLOG_NO_WRITE_NO_FLUSH	0
> #define XT_XLOG_WRITE_AND_FLUSH		1
> #define XT_XLOG_WRITE_AND_NO_FLUSH	2
>
> Without write or flush, this is a very fast operation. But the
> transaction is still committed and ordered, it is just not durable.

I notice that xn_end_xact() does a number of things. I am wondering if all of
these should be in the "fast" part in commit_ordered(), or if some should be
done in the "slow" part along with the log flush?

In particular this, flushing the data log (is this flush to disk?):

    if (!thread->st_dlog_buf.dlb_flush_log(TRUE, thread)) {
            ok = FALSE;
            status = XT_LOG_ENT_ABORT;
    }

and this, at the end concerning the "sweeper":

    if (db->db_sw_faster)
            xt_wakeup_sweeper(db);

    /* Don't get too far ahead of the sweeper! */
    if (writer) {
        ...

Can you help suggest if these should be done in the "fast" part, or in the
"slow" part?

Also, I guess this statement needs to be postponed to the "slow" part:

    thread->st_xact_data = NULL;

> Then when actual commit is called, we check the current log flush
> position against the flush position we need. If it is passed our
> position then this is a NOP.

I think I can do this with a condition like this:

    if (xt_comp_log_pos(self->commit_fastpart_log_id, self->commit_fastpart_log_offset, xl_flush_log_id, xl_flush_log_offset) <= 0)

But I am wondering if I need to take any locks around reading xl_flush_log_id
and xl_flush_log_offset? Or can one argue that a dirty read could be ok (as
long as it's atomic) as the values are probably monotonic?
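
For illustration, the check above can be sketched in isolation. This is not PBXT
code: log_pos_t, comp_log_pos() and commit_already_durable() are hypothetical
names, and the three-way result (negative, zero, positive) is only an assumption
about how xt_comp_log_pos() orders (log id, offset) pairs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a log position: (log file id, byte offset). */
typedef struct {
    uint32_t log_id;
    uint64_t log_offset;
} log_pos_t;

/* Three-way compare: <0 if a is before b, 0 if equal, >0 if a is after b.
 * The log id is the major key, the offset within the log the minor key. */
static int comp_log_pos(log_pos_t a, log_pos_t b)
{
    if (a.log_id != b.log_id)
        return a.log_id < b.log_id ? -1 : 1;
    if (a.log_offset != b.log_offset)
        return a.log_offset < b.log_offset ? -1 : 1;
    return 0;
}

/* True when the log is flushed at least up to 'needed', in which case
 * the slow part of the commit has nothing left to do. */
static bool commit_already_durable(log_pos_t needed, log_pos_t flushed)
{
    return comp_log_pos(needed, flushed) <= 0;
}
```

One thing the sketch makes visible: even if each field is read atomically, the
pair is not. If the flush position moves from (1, 900) to (2, 10) between the
two reads, a reader could combine the new id with the old offset and see
(2, 900), which is ahead of the real flush position. So monotonic values alone
may not be enough to justify an unlocked read of both fields.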

> If not, then we need to call xlog_append() with no data. This will do
> a group commit on the log.

Is it safe to call xlog_append() with no data even if the log has been flushed
past the current position already? (else some locking seems definitely needed).
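
To make the intended split concrete, here is a toy model of the "fast" and
"slow" parts. It is a sketch under stated assumptions, not the PBXT
implementation: toy_log_t, toy_commit_fast() and toy_commit_slow() are invented
names, positions are plain byte counts, the "flush" only updates a counter, and
there is no locking at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy log state (hypothetical, single-threaded). */
typedef struct {
    uint64_t end_pos;     /* end of everything appended so far (in memory) */
    uint64_t flushed_pos; /* end of everything assumed to be on disk */
    unsigned flush_count; /* number of simulated disk flushes */
} toy_log_t;

/* Fast part: append the commit record without writing or flushing.
 * Returns the position that must be flushed for this commit to be durable. */
static uint64_t toy_commit_fast(toy_log_t *log, uint64_t record_len)
{
    log->end_pos += record_len;
    return log->end_pos;
}

/* Slow part: make the commit durable. Returns true if a flush was done,
 * false if the log was already flushed past 'needed_pos' (a no-op). */
static bool toy_commit_slow(toy_log_t *log, uint64_t needed_pos)
{
    if (needed_pos <= log->flushed_pos)
        return false;
    /* One flush covers every commit appended so far (group commit). */
    log->flushed_pos = log->end_pos;
    log->flush_count++;
    return true;
}
```

In this model, two transactions that both run the fast part before either runs
the slow part share a single flush: the first call to toy_commit_slow() flushes
for both, and the second returns without touching the disk.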

> It was a bit difficult to explain, so please ask if anything is not
> clear.

Hopefully you can help with some of the above points, then I can give it
another go with fresh eyes and maybe show you a patch.

(If I get to that point, I will probably also need some advice on the proper
error handling)...

Anyway, from what you wrote and from what I see in the code, it seems the API
I propose is general enough to fit well with PBXT, which is good and what I
wanted to check (even if xn_end_xact() may need to be taken apart a bit to
properly split into a "fast" and a "slow" part).

 - Kristian.


