maria-developers team mailing list archive

Re: Can there be a better storage engine API?


Hi Mark,

TokuMX is quite a different beast from TokuDB.  First of all, we already
had the experience of integrating our engine into one database product
before we started.  So many kinks in the TokuKV layer had already been
worked out.

But more importantly, TokuMX/MongoDB doesn't have a storage engine API.  I
think some people thought we were going to add a storage engine API to
MongoDB and then plug ourselves into it.  That wasn't the goal of TokuMX;
the goal was simply to get our engine inside MongoDB as fast as possible,
and the way to do that was to avoid thinking about what would be a good
interface and instead to just do it.  As I'm sure everyone here knows,
making a good storage engine API is /really/ hard.

Probably the hardest things in the TokuMX integration were learning how to
deal with DDL (everything in MongoDB seems to use "lazy
initialization"---for DDL operations at least), finding the right model
within the MongoDB code to represent transactions, and reorganizing the
locking.  All these things were tightly coupled with the way the MongoDB
storage system works (except transactions, well, because they didn't
exist), but now in TokuMX they're pretty tightly coupled with the way
TokuKV does things.

In a way, we've created a storage API, but the API is defined by our
version of db.h and nothing else implements that with the same assumptions
we have, so it's probably not useful to compare the "TokuMX storage engine
API" with the one in MySQL.

In short, I'd say yes, it was easier, not because MongoDB has a better
API (it doesn't have one) but because we had a bit of experience and
because we didn't try to create or conform to a generic API.

On Mon, Aug 19, 2013 at 11:36 AM, MARK CALLAGHAN <mdcallag@xxxxxxxxx> wrote:

> Thanks for your response.
> On Fri, Aug 16, 2013 at 11:23 AM, Zardosht Kasheff <zardosht@xxxxxxxxx> wrote:
>> I've worked on the TokuDB storage engine for quite a while now. I have
>> had many experiences over the years, so I guess it's hard to know
>> where to begin. I guess I will start small, and if the conversation
>> evolves, I can contribute more thoughts. I think the current API is
>> really good, as evidenced by the fact that many storage engines have
>> used it to plug into MySQL. The two areas that I see we can really
>> benefit from are the following:
> Many were written in the long-ago past. Besides TokuDB how many new
> storage engines have reached GA in the past decade? I worked on a custom
> storage engine and I am sure others have done the same, but there hasn't
> been much innovation in the public. Aria is also GA, but that was written
> by people who know and wrote parts of the API, so it isn't a sign that the
> API is something people want to use.
> Was TokuMX easier to implement than TokuDB?
> --
> Mark Callaghan
> mdcallag@xxxxxxxxx
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-developers
> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-developers
> More help   : https://help.launchpad.net/ListHelp