← Back to team overview

drizzle-discuss team mailing list archive

Re: Improving the Engine API (was Re: New PBXT Drizzle-specific storage engine...)


Paul McCullagh wrote:
On Dec 16, 2009, at 11:58 PM, Jay Pipes wrote:
Paul McCullagh wrote:
Hi Toru,
On Dec 8, 2009, at 6:37 PM, Toru Maesaka wrote:
Hi Paul,

- If I have a update type statement (i.e. whether the statement modifies
- Whether I need a table lock (examples: ALTER TABLE, TRUNCATE, CHECK).
- If we have a SELECT FOR UPDATE.

Agreed! Especially for the third point. For me this is what I need:

- Whether the statement will change the table state (so, updates in general)
- Whether the entire table needs to be locked.
- Whether the statement only performs READ operations.

The third point I don't think needs to be explicitly defined but it
would be relieving for the engine to know (or be guaranteed) that the
table state will not be changed.

Whether the storage engine should obey the statement characteristics
or not is up to the engine developer I guess... Nonetheless it would
be brilliant as a "hint" for everyone I think.
The question here is whether the engine should lock the tables in the startStatement() call, or when the cursor is used?
Lets look at an UPDATE statement:
UPDATE t1, t2 SET t1.c1=50 WHERE t1.id = t2.id and t2.c3='abc';
In this statement, t1 is being read and updated, and t2 is just being read. Both tables are being scanned (lets assume there are no indexes).

Drizzle does not support multi-table UPDATEs in a single statement. For Drizzle, the above statement would have to be written like so:

UPDATE t1 SET t1.c1=50
WHERE t1.id IN (
 SELECT t2.id FROM t2 WHERE t2.c3='abc'

Yes, OK. I think this makes it even more obvious that t2 is being read, and t1 is being read and updated.

Here is some pseudo code for the execution of this statement:
a = engine->getCursor("t1", WILL_UPDATE)
b = engine->getCursor("t2", READ_ONLY)

Actually, the above is not correct. Because the update to t1 is dependent on rows in a consistent snapshot of t2, you would actually have to do (again, in pseudocode):

b= engine->getCursor("t2", REPEATABLE_READ);
a= engine->getCursor("t1", WILL_UPDATE);

Yes, generally I agree.

In fact, WILL_UPDATE may be: COMMITTED_READ | FOR_UPDATE (at least I believe this is the case with InnoDB).

Unless the engine uses "optimistic locking" (which is the pure MVCC approach), and then it would be: REPEATABLE_READ | FOR_UPDATE.

So this raises the question of whether the use of REPEATABLE_READ and COMMITTED_READ is not telling the engine too much about how these things should be implemented...?

No, I don't view it as implementation, really. The REPEATABLE_READ or COMMITTED_READ is something the kernel has to communicate to the engine because it is what the *user* has asked for -- either explicitly or via a configuration default. So, it's not really the kernel saying to the engine "do it this way". Instead, it's the kernel saying "this is what the user is requiring." So, in that sense, I believe it absolutely belongs in the API.

I want to flesh out the example code we've been working on in this thread some more. I will do this over the holidays and post the code to the wiki I've been using. I'll take the approach of demonstrating how the kernel would call the engine for various SQL scenarios. Hopefully, that will give us a rounded look at the way the kernel can interact with the engine and the engine's cursors.



Follow ups