
maria-developers team mailing list archive

Summary of the storage engine API changes for group commit


Hi Serg,

As promised, here is a summary of the changes to the storage engine API that I
made as part of group commit.

Two new handlerton methods are added:

    void (*prepare_ordered)(handlerton *hton, THD *thd, bool all);
    void (*commit_ordered)(handlerton *hton, THD *thd, bool all);

commit_ordered() will be called just before commit(). It should commit the
transaction in memory; calls to commit_ordered() will be serialised, so order
of commit is determined by order of calls to commit_ordered(). MVCC engines
should make the transaction visible. No time-consuming operations like
flushing logs to disk should be done; this should happen in commit().
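
To illustrate the split between the two calls, here is a minimal sketch of how an engine might divide its work. THD and handlerton below are tiny stand-ins for the real server types, the demo_* functions and the two global vectors are hypothetical, and the commit() signature is only modelled on the existing handlerton method, so treat this as an illustration under those assumptions rather than actual engine code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins for the server types; the real THD and handlerton are much larger.
struct THD { std::uint64_t trx_id; bool visible; bool durable; };
struct handlerton {
  int  (*commit)(handlerton *hton, THD *thd, bool all);
  void (*prepare_ordered)(handlerton *hton, THD *thd, bool all);
  void (*commit_ordered)(handlerton *hton, THD *thd, bool all);
};

// Hypothetical engine state for the demo.
static std::vector<std::uint64_t> g_commit_order;  // order fixed by commit_ordered()
static std::vector<std::uint64_t> g_flushed;       // stands in for the on-disk log

// Fast, in-memory part. The server serialises these calls, so the order of
// entries in g_commit_order is the commit order.
static void demo_commit_ordered(handlerton *, THD *thd, bool /*all*/) {
  g_commit_order.push_back(thd->trx_id);
  thd->visible = true;  // an MVCC engine would publish the transaction here
}

// Slow part. Anything like a log flush belongs here, not in commit_ordered().
// A real engine would need its own synchronisation around shared state here;
// this single-threaded demo has none.
static int demo_commit(handlerton *, THD *thd, bool /*all*/) {
  assert(thd->visible);              // commit_ordered() must already have run
  g_flushed.push_back(thd->trx_id);  // placeholder for the expensive flush
  thd->durable = true;
  return 0;
}

// Build a handlerton that sets commit and commit_ordered only.
static handlerton make_demo_hton() {
  handlerton hton = { demo_commit, nullptr, demo_commit_ordered };
  return hton;
}
```

In this sketch the commit order is whatever order commit_ordered() was called in, regardless of the order in which the later commit() calls finish.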

prepare_ordered() will be called just after prepare(). Calls will again be
serialised and happen in commit order, so an engine can do any part of the
prepare step that depends on commit order here. Again, no time-consuming
operations like I/O should be done.
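
As one possible example of an order-dependent prepare step, an engine could hand out a sequence number in prepare_ordered(). This is only a sketch: the commit_seq field, the g_next_seq counter and demo_prepare_ordered() are hypothetical, and THD here is a stand-in for the real server type.

```cpp
#include <cstdint>

struct handlerton;                         // opaque stand-in for the server type
struct THD { std::uint64_t commit_seq; };  // stand-in with a hypothetical field

static std::uint64_t g_next_seq = 1;       // hypothetical engine-wide counter

// Calls are serialised by the server and arrive in commit order, so a plain
// counter is enough here; the sequence numbers follow the commit order.
static void demo_prepare_ordered(handlerton *, THD *thd, bool /*all*/) {
  thd->commit_seq = g_next_seq++;
}
```

Nothing here touches disk, which keeps the serialised section short.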

Order of calls to the new methods is guaranteed to be consistent between
engines and with commit order in the binlog. Locking is done so that START
TRANSACTION WITH CONSISTENT SNAPSHOT becomes consistent between storage
engines and the binlog.

The methods are optional; a storage engine is free to not set them. Group
commit will still work, but there will be no guarantee that commits happen in
the same order in engine and in binlog, nor that START TRANSACTION WITH
CONSISTENT SNAPSHOT will be consistent between engines/binlog.
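
The optional nature of the methods can be pictured as a null check on the caller's side. The following is a sketch only, not the server's actual code: run_commit_ordered(), g_order_mutex and counting_commit_ordered() are hypothetical names, and THD and handlerton are stand-ins reduced to the two new methods.

```cpp
#include <mutex>
#include <vector>

struct THD { int calls; };  // stand-in; counts hook invocations for the demo
struct handlerton {
  void (*prepare_ordered)(handlerton *hton, THD *thd, bool all);
  void (*commit_ordered)(handlerton *hton, THD *thd, bool all);
};

static std::mutex g_order_mutex;  // hypothetical lock that serialises the calls

// Call commit_ordered() on every participating engine that provides it.
// Engines that leave the pointer unset are skipped.
static void run_commit_ordered(const std::vector<handlerton *> &engines,
                               THD *thd, bool all) {
  std::lock_guard<std::mutex> guard(g_order_mutex);
  for (handlerton *hton : engines) {
    if (hton->commit_ordered)
      hton->commit_ordered(hton, thd, all);
  }
}

// Example hook used to observe which engines were called.
static void counting_commit_ordered(handlerton *, THD *thd, bool /*all*/) {
  ++thd->calls;
}
```

An engine that leaves both pointers unset is skipped entirely, so it gets no ordering guarantee.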

(The prepare_commit_mutex in built-in InnoDB will still ensure consistent
commit order, and will still destroy group commit).

I made no other changes to the storage engine API.

Hope this helps,

 - Kristian.
