drizzle-discuss team mailing list archive

Re: Improvements to Storage Engine API

 

Hi Jay,

On Dec 20, 2009, at 6:13 PM, Jay Pipes wrote:

Paul McCullagh wrote:
Hi Jay,
The new specs (http://drizzle.org/wiki/StorageEngineAPI_ProposedRefactoring ) look very good so far. :) Important point here: the API handles DML and DDL essentially the same way :)

Yes, this is the idea. :)

One question:
virtual bool passthruStatement(Session &session, message::Statement &statement);

Would it actually be possible for the engine to entirely execute a SELECT statement in this method? If so, passthruStatement() would need a callback to return the rows produced by the SELECT. Maybe it would make sense to just call a returnRow()-type method on session.

Hmm, this is a very good point. This is kind of what the IO_CACHE and the READ_RECORD structures do right now, but internal to the kernel. Could you give some sample code that demonstrates what you are thinking about?

What about if passthruStatement() also returned a set of flags (or std::bitset for us C++ folks ;) ) which would indicate to the kernel what to do after the passthru statement has been executed? For a SELECT, passthru execution would mean what exactly? That the engine has "setup" the ability for the kernel to read data? That the engine has materialized a dataset to pass to the Session/kernel? Or something else? Again, I feel example code is easier to discuss with :)

OK, here's an example of what I mean:

Let's assume we have an engine that can do fast aggregates.

We have the following select:

SELECT c1, SUM(c2) FROM t1 GROUP BY c1;

The kernel calls:

engine->startStatement(session, gpb("SELECT c1, SUM(c2) FROM t1 GROUP BY c1"))

The engine recognizes that it can execute the entire statement internally (no need for any optimization or execution by the kernel) so it returns StorageEngine::INTERNAL.

As a result, the kernel calls:

engine->passthruStatement(session, gpb("SELECT c1, SUM(c2) FROM t1 GROUP BY c1"))

The engine implements passthruStatement(), as follows:

bool passthruStatement(Session &session, message::Statement &statement)
{
  switch (statement.type()) {
  ...
  case message::Statement::SELECT:
  {
    // The engine builds an execution plan
    Plan plan(statement);
    // Return each result row to the kernel via the session
    while (!plan.eof()) {
      session.returnRow(plan.getRow());
    }
    break;
  }
  ...
  }
  return true;
}

So in passthruStatement() the engine actually executes the entire SELECT statement, and returns the result rows.
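
A minimal self-contained sketch of the same idea, for anyone who wants something that compiles. Everything here is a stand-in and not part of Drizzle: Row, Statement, Session and Plan are simplified illustrative types, and the extra engine_result parameter is an assumption that replaces real plan execution with precomputed rows. Only the returnRow() callback on the session reflects the proposal above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// All types below are illustrative stand-ins, not Drizzle classes.
typedef std::vector<std::string> Row;

struct Statement {
  enum Type { SELECT, OTHER };
  Type type;
};

// Stand-in for the kernel session; returnRow() is the proposed callback.
class Session {
public:
  void returnRow(const Row &row) { sent_.push_back(row); }
  const std::vector<Row> &sentRows() const { return sent_; }
private:
  std::vector<Row> sent_;
};

// Stand-in for an engine-side plan that yields precomputed rows.
class Plan {
public:
  explicit Plan(const std::vector<Row> &rows) : rows_(rows), next_(0) {}
  bool eof() const { return next_ >= rows_.size(); }
  const Row &getRow() {
    assert(!eof());  // caller must check eof() first
    return rows_[next_++];
  }
private:
  std::vector<Row> rows_;
  std::size_t next_;
};

// Engine entry point: run the whole SELECT and push every row to the
// session. engine_result is a hypothetical shortcut for real execution.
bool passthruStatement(Session &session, const Statement &statement,
                       const std::vector<Row> &engine_result)
{
  if (statement.type != Statement::SELECT)
    return false;  // this sketch handles SELECT only
  Plan plan(engine_result);  // stack allocation, so nothing to delete
  while (!plan.eof())
    session.returnRow(plan.getRow());
  return true;
}
```

In this sketch the kernel never iterates over the result itself: by the time passthruStatement() returns true, the session has already received every row.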

This code would only be used if the engine executes the entire SELECT. If the engine executes only parts of the statement, then the cursor interface will be used.

The alternative would be to use the cursor interface for this case as well. This would look like this (from the kernel's point of view):

intent = engine->startStatement(session, *statement);

if (likely(intent == StorageEngine::DEFAULT))
{
  // do the "normal", unoptimized set of operations
}
else if (intent == StorageEngine::INTERNAL)
{
  // Let the engine do it internally
  if (statement->type() == message::Statement::SELECT)
  {
    // Call a special version of getCursor() that handles an entire statement:
    // The engine has said it can handle this:
    cursor = engine->getCursor(*statement);
    if (cursor->readFirst(row_out))
    {
      session->returnRow(row_out);
      while (cursor->readNext(row_out))
      {
        session->returnRow(row_out);
      }
    }
    delete cursor;
  }
  else
  {
    if (engine->passthruStatement(session, *statement) == false)
    {
       // Ask engine for errors to report to the user...
    }
  }
}
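
The readFirst()/readNext() loop above can be tried out in isolation with a sketch like the following. The Cursor interface, the in-memory VectorCursor and the drainCursor() helper are all hypothetical names invented for illustration; the sink vector stands in for session->returnRow().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Row;

// Hypothetical cursor interface mirroring readFirst()/readNext() above.
class Cursor {
public:
  virtual ~Cursor() {}
  virtual bool readFirst(Row &row_out) = 0;
  virtual bool readNext(Row &row_out) = 0;
};

// Illustrative in-memory cursor over an already materialized result.
class VectorCursor : public Cursor {
public:
  explicit VectorCursor(const std::vector<Row> &rows) : rows_(rows), pos_(0) {}
  bool readFirst(Row &row_out) { pos_ = 0; return readNext(row_out); }
  bool readNext(Row &row_out) {
    assert(pos_ <= rows_.size());  // position never runs past the end
    if (pos_ == rows_.size()) return false;
    row_out = rows_[pos_++];
    return true;
  }
private:
  std::vector<Row> rows_;
  std::size_t pos_;
};

// Kernel-side loop: pull every row from the cursor and hand it to a sink.
// Returns the number of rows forwarded.
std::size_t drainCursor(Cursor &cursor, std::vector<Row> &sink)
{
  Row row_out;
  std::size_t count = 0;
  for (bool ok = cursor.readFirst(row_out); ok; ok = cursor.readNext(row_out)) {
    sink.push_back(row_out);
    ++count;
  }
  return count;
}
```

Compared with the returnRow() callback, this keeps control of the loop in the kernel, so the kernel decides when each row is fetched.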

What I have not mentioned above is the need for the engine to specify the metadata of the returned rowset. In other words, there needs to be a way for the engine to specify the names and types of the returned columns.
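
One possible shape for such metadata, purely as a sketch: ColumnType, ColumnInfo, ResultMetadata and describeAggregateResult() are hypothetical names, and the column types chosen for c1 and SUM(c2) are assumptions, since the definition of t1 is not given here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical column types; a real API would reuse the kernel's type system.
enum ColumnType { TYPE_INTEGER, TYPE_DOUBLE, TYPE_STRING };

struct ColumnInfo {
  std::string name;
  ColumnType type;
  ColumnInfo(const std::string &n, ColumnType t) : name(n), type(t) {}
};

// Describes the names and types of the columns the engine will return.
class ResultMetadata {
public:
  void addColumn(const std::string &name, ColumnType type) {
    columns_.push_back(ColumnInfo(name, type));
  }
  std::size_t columnCount() const { return columns_.size(); }
  const ColumnInfo &column(std::size_t i) const {
    assert(i < columns_.size());  // index must refer to a declared column
    return columns_[i];
  }
private:
  std::vector<ColumnInfo> columns_;
};

// What the aggregate engine might declare for:
//   SELECT c1, SUM(c2) FROM t1 GROUP BY c1;
// The column types are assumed for illustration.
ResultMetadata describeAggregateResult()
{
  ResultMetadata meta;
  meta.addColumn("c1", TYPE_STRING);
  meta.addColumn("SUM(c2)", TYPE_DOUBLE);
  return meta;
}
```

The engine would hand a description like this to the kernel before the first row is returned, so the kernel can send column headers to the client.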

I have been in many engine developer meetings, and it has often come up that engine developers would love to have a way to execute an entire statement. This would allow for that.


Cheers!

jay


Best regards,
Paul
On Dec 16, 2009, at 10:27 PM, Jay Pipes wrote:
Hey guys,

I've been working on a wiki page explaining the proposed changes we've been discussing on the mailing list:

http://drizzle.org/wiki/StorageEngineAPI_ProposedRefactoring

I'm going to keep working on it and post it tomorrow, but in the meantime, if you could look it over and edit it if you'd like, that would be great :)

Thanks!

Jay
--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com


