
drizzle-discuss team mailing list archive

Re: Improving the Engine API

 

Paul McCullagh wrote:
On Dec 8, 2009, at 3:48 PM, Jay Pipes wrote:
<snip>


Whatcha say? :)

Absolutely, it will work. And maybe it is the best solution.

I would have suggested that Drizzle assumes that DDL works with transactions, and that the engine handles the situation by doing one of the following:

- Execute the DDL normally, because it supports DDL in transactions,
- Return an error because it does not handle DDL in a transaction, or
- Silently commit the transaction, and begin it again at the end of the DDL statement, without Drizzle even knowing about it.
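The three engine-side options above can be sketched as a small decision function. This is only an illustration: the enum, its values and planDdl() are hypothetical names invented for this sketch, not part of Drizzle's engine API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names for illustration only; none of this is Drizzle API.
enum class DdlInTransactionPolicy {
  ExecuteInTransaction,    // engine supports DDL inside a transaction
  RejectWithError,         // engine refuses DDL inside a transaction
  SilentCommitAndRestart   // engine commits, runs the DDL, then begins again
};

// Returns the actions an engine would take for one DDL statement that
// arrives while a transaction is open.
std::vector<std::string> planDdl(DdlInTransactionPolicy policy)
{
  switch (policy) {
    case DdlInTransactionPolicy::ExecuteInTransaction:
      return {"DDL"};
    case DdlInTransactionPolicy::RejectWithError:
      return {"ERROR"};
    case DdlInTransactionPolicy::SilentCommitAndRestart:
      return {"COMMIT", "DDL", "BEGIN"};
  }
  return {};  // unreachable for valid enum values
}
```

Note that in the third case the COMMIT and BEGIN happen inside the engine, so the kernel (and therefore replication) never sees them.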

But, I guess the problem is replication. If Drizzle is not aware that the transaction ended, it would replicate the statements as a single transaction, and we may end up with a different result on the slave.

The problem of DDL in a transaction only occurs when auto-commit is disabled, or an explicit BEGIN is used, so let's look at a quick example:

BEGIN;
INSERT t1 VALUES (1, 1);
INSERT t1 VALUES (2, 2);
ALTER TABLE t1 ADD COLUMN c3 INT;
INSERT t1 VALUES (3, 3, 3);
INSERT t1 VALUES (4, 4, 4);
COMMIT;

A silent COMMIT on DDL will lead to the following:

BEGIN;
INSERT t1 VALUES (1, 1);
INSERT t1 VALUES (2, 2);
COMMIT;
ALTER TABLE t1 ADD COLUMN c3 INT;
BEGIN;
INSERT t1 VALUES (3, 3, 3);
INSERT t1 VALUES (4, 4, 4);
COMMIT;

While I am no fan of a silent COMMIT, it may be the best solution, because at least this sequence of statements will be compatible with engines that support DDL in transactions and those that don't (much like MyISAM happily ignores BEGIN TRANSACTION).

The alternative would be to return an error. This would prevent the surprise effect that I get when the server crashes and I discover my transaction was not atomic after all.

Well, I'm not a huge fan of implicit anything, as you know, but in this case, since engines do have a certain leeway in how they advise the kernel that they will handle a statement, I'm OK with continuing the existing MySQL behaviour of implicitly committing transactions before DDL statements are executed -- but in Drizzle's case, only if the engine advises it is unable to include the DDL in the current transaction.
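A minimal sketch of that kernel-side rule, assuming the engine's advice can be reduced to a single boolean. The Stmt struct and replicatedStream() are hypothetical names made up for this illustration. Because the kernel itself issues the COMMIT and BEGIN, they appear in the statement stream that replication sees.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed statement; not a Drizzle type.
struct Stmt {
  std::string text;
  bool is_ddl;
};

// Returns the statement stream the kernel would execute (and replicate)
// for the body of one explicit transaction. The kernel commits implicitly
// around a DDL statement only when the engine has advised that it cannot
// include DDL in the current transaction.
std::vector<std::string> replicatedStream(const std::vector<Stmt> &body,
                                          bool engine_supports_ddl_in_txn)
{
  std::vector<std::string> out;
  out.push_back("BEGIN");
  for (const Stmt &s : body) {
    if (s.is_ddl && !engine_supports_ddl_in_txn) {
      out.push_back("COMMIT");  // implicit commit, visible to replication
      out.push_back(s.text);
      out.push_back("BEGIN");   // resume a new transaction after the DDL
    } else {
      out.push_back(s.text);
    }
  }
  out.push_back("COMMIT");
  return out;
}
```

Fed the five statements from the example earlier in this message, the function returns them unchanged between BEGIN and COMMIT for an engine that supports DDL in transactions, and returns the split sequence (COMMIT, ALTER TABLE, BEGIN in the middle) for one that does not.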

I guess this is a question for the DBAs on the list ... input please! :)

++

Also, one other thing we need to discuss is the following, which you alluded to in an earlier email:

Suppose PBXT can handle ADD INDEX in an optimized fashion, but PBXT does not implement the remainder of the ALTER TABLE statement and prefers Drizzle's kernel to handle the other operations. In this particular case, we need a way of allowing the engine to communicate that it would like to handle *some part* of a statement internally, and let the kernel handle other parts. This is an interesting problem, and I can see at least three possible solutions. Let me know what you think of any of these:

1) Establish two more flags for the StatementExecutionIntent:

INTENT_INTERNAL_AFTER_KERNEL
INTENT_KERNEL_AFTER_INTERNAL

In the first flag, the engine is telling the kernel that it wishes to execute some part of the Statement *after* the kernel has finished executing the statement. In the second flag, the engine is telling the kernel it wants first crack at the statement.
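To make the ordering implied by the two flags concrete, here is a rough sketch. Only the two flag names come from this message. The other enum values (INTENT_KERNEL_ONLY, INTENT_INTERNAL_ONLY), the shape of the enum and executionOrder() are assumptions made for the illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch only. The last two values are the flags proposed above; the first
// two are hypothetical placeholders for whatever intents already exist.
enum StatementExecutionIntent {
  INTENT_KERNEL_ONLY,            // hypothetical: kernel does everything
  INTENT_INTERNAL_ONLY,          // hypothetical: engine does everything
  INTENT_INTERNAL_AFTER_KERNEL,  // engine runs its part after the kernel
  INTENT_KERNEL_AFTER_INTERNAL   // engine gets first crack, then the kernel
};

// Returns who executes the statement, in order.
std::vector<std::string> executionOrder(StatementExecutionIntent intent)
{
  switch (intent) {
    case INTENT_KERNEL_ONLY:
      return {"kernel"};
    case INTENT_INTERNAL_ONLY:
      return {"engine"};
    case INTENT_INTERNAL_AFTER_KERNEL:
      return {"kernel", "engine"};
    case INTENT_KERNEL_AFTER_INTERNAL:
      return {"engine", "kernel"};
  }
  return {};  // unreachable for valid enum values
}
```

For the PBXT case, INTENT_INTERNAL_AFTER_KERNEL would let the kernel perform the generic ALTER TABLE work first and leave the ADD INDEX part to the engine.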

I can see this solution being of medium-difficulty to implement, as lots of edge cases would have to be tested...

2) Don't add new flags to StatementExecutionIntent, but instead have the engine "do its thing" (e.g. optimally implement the ADD INDEX part of an ALTER TABLE) in the call to StorageEngine::endStatement().

3) Create a new plugin type: plugin::PostKernelStatementExecute:

namespace drizzled {
namespace plugin {

/**
 * Modules implement a subclass of this class and register
 * an instance of the class as a "listener" for when the
 * kernel has completed execution of a Statement.
 *
 * For example, a storage engine might implement a subclass
 * called OptimizedAddIndex which would listen for the kernel's
 * completed execution of an ALTER TABLE statement and execute
 * an optimal ADD INDEX clause for the ALTER TABLE statement.
 */
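class PostKernelStatementExecute
{
public:
  /* Virtual so that subclasses can be destroyed through a base pointer. */
  virtual ~PostKernelStatementExecute() {}

  /* Called after the kernel has finished executing the statement. */
  virtual bool operator()
    (Session &session, const message::Statement &statement) = 0;
};

} /* namespace plugin */
} /* namespace drizzled */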

Have the engine register a PostKernelStatementExecute trigger/hook. This PostKernelStatementExecute would be a subclass of the above plugin interface class and would allow the engine to react to certain types of statements that it asks the kernel to execute "normally" but wants to add some optimized path for...
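A rough sketch of how such a hook might be registered and fired follows. Everything here lives in a separate sketch namespace with simplified stand-in types: Session, Statement, ListenerRegistry and the meaning of the bool return value (true when the listener acted on the statement) are assumptions for illustration, not Drizzle's actual types or conventions.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

struct Session {};                          // stand-in for drizzled::Session
struct Statement { std::string type; };     // stand-in for message::Statement

class PostKernelStatementExecute {
public:
  virtual ~PostKernelStatementExecute() {}
  // Assumed convention: returns true if the listener acted on the statement.
  virtual bool operator()(Session &session, const Statement &statement) = 0;
};

// An engine-provided listener that only reacts to ALTER TABLE.
class OptimizedAddIndex : public PostKernelStatementExecute {
public:
  int indexes_built = 0;
  bool operator()(Session &, const Statement &statement) override {
    if (statement.type != "ALTER_TABLE")
      return false;                          // not interested in this one
    ++indexes_built;                         // the optimized ADD INDEX would run here
    return true;
  }
};

// Hypothetical registry the kernel would consult after executing a statement.
class ListenerRegistry {
  std::vector<PostKernelStatementExecute *> listeners_;
public:
  void add(PostKernelStatementExecute *listener) {
    listeners_.push_back(listener);
  }
  // Invokes every listener and returns how many of them acted.
  int notify(Session &session, const Statement &statement) {
    int handled = 0;
    for (PostKernelStatementExecute *l : listeners_)
      if ((*l)(session, statement))
        ++handled;
    return handled;
  }
};

}  // namespace sketch
```

The kernel would call notify() once it has finished its own part of the statement, so the engine's optimized path always runs after the generic one.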

Thoughts?

Jay


