pbxt-discuss team mailing list archive

Re: goal 0 of embedded pbxt reached || (was: Re: PBXT: Embedded Database (library))

 

Hi again,

On Thu, Feb 11, 2010 at 11:45 PM, Paul McCullagh <
paul.mccullagh@xxxxxxxxxxxxx> wrote:

>> Embedded InnoDB's API might be a good start and reference for an API
>> sketch: http://www.innodb.com/doc/embedded_innodb-1.0/
>>
>
> Yes, I read that through again. Most of it could be taken over 1 to 1.

Think so, too.


>
>
>> IMHO what is open and where I would really appreciate your feedback /
>> comments:
>> - what language should the API be in? C or C++?
>>
>
> Well, C has the advantage that it is easy to put a C++ wrapper around if
> you want to, the other way around is tricky. So unless there is a good
> reason, I would recommend a C API.

That's right. I will do a sketch in C, as long as it does not complicate
things too much.
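
To make the C-first idea concrete, here is a rough sketch of what the entry points could look like. Every name in it (`embpbxt_db`, `embpbxt_open`, `embpbxt_close`, `embpbxt_path`, `embpbxt_strerror`, the error codes) is hypothetical and made up for illustration; nothing like this exists in PBXT yet, and the bodies are in-memory stubs that never touch the engine. The point is only the shape: an opaque handle plus plain error codes, which a C++ wrapper could later own in a class with a destructor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes; not part of any existing PBXT header. */
typedef enum {
    EMBPBXT_OK = 0,
    EMBPBXT_ERR_NOMEM,
    EMBPBXT_ERR_INVALID
} embpbxt_err;

/* Opaque handle: in a real header only this typedef would be public. */
typedef struct embpbxt_db embpbxt_db;

/* Private definition (would live in the .c file). Stub: stores the path only. */
struct embpbxt_db {
    char *path;
};

/* Allocate a handle for the database at `path`. Stub: no disk access. */
embpbxt_err embpbxt_open(const char *path, embpbxt_db **out)
{
    if (path == NULL || out == NULL)
        return EMBPBXT_ERR_INVALID;

    embpbxt_db *db = malloc(sizeof *db);
    if (db == NULL)
        return EMBPBXT_ERR_NOMEM;

    size_t n = strlen(path) + 1;
    db->path = malloc(n);
    if (db->path == NULL) {
        free(db);
        return EMBPBXT_ERR_NOMEM;
    }
    memcpy(db->path, path, n);

    *out = db;
    return EMBPBXT_OK;
}

/* Accessor, so callers never need to see the struct layout. */
const char *embpbxt_path(const embpbxt_db *db)
{
    assert(db != NULL);
    return db->path;
}

/* Release a handle; passing NULL is a no-op, like free(). */
void embpbxt_close(embpbxt_db *db)
{
    if (db != NULL) {
        free(db->path);
        free(db);
    }
}

/* Map an error code to a short message. */
const char *embpbxt_strerror(embpbxt_err err)
{
    switch (err) {
    case EMBPBXT_OK:          return "ok";
    case EMBPBXT_ERR_NOMEM:   return "out of memory";
    case EMBPBXT_ERR_INVALID: return "invalid argument";
    }
    return "unknown error";
}
```

A caller would pair `embpbxt_open(path, &db)` with `embpbxt_close(db)`. Since the struct layout stays private, it can change later without breaking users of the library.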


>
>> - In which format shall we store the table/db definitions? protobuf
>> maybe? AFAIR Drizzle does so, so we could borrow some code there...
>>
>
> protobuf may be overkill for the initial implementation.
>
> How are you planning to do create table? The innodb API does it by building
> a create table structure with various API calls.
>
> By submitting a CREATE TABLE statement as text, you can save a lot of API
> routines.
>
> PBXT already has a parser for CREATE (and ALTER) table statements. So, you
> could accept the text and feed the parser.
>
> Then, you could actually store the table definition as a CREATE TABLE
> statement. When the table is loaded you just invoke the parser. The CREATE
> TABLE text could be stored in a separate file, like the .frm file, for each
> table.
>

> Alternatively the text could be stored in the header of the .xtd file,
> where I already store the foreign key information (the foreign key
> information is actually stored as SQL text).
>
> However, this may be going too far with the integration of the embedded
> code and PBXT itself.
>
I really like this idea. It's a nice separation and a self-hosting
feature as well.
I'd guess the lexer code of MySQL would have to be ported over then? If I
remember correctly, the lexer and THD are coupled, which would complicate
things even further.
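
As a minimal sketch of the "store the CREATE TABLE text in a per-table file" variant: the helpers below write the statement text to a stream and read it back unchanged. The function names and the `.def` extension are assumptions made up for this example; the parsing step is deliberately left out, since a real implementation would hand the loaded text to PBXT's existing CREATE TABLE parser.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<dir>/<table>.def" into `out`. The ".def" extension is a
 * hypothetical choice. Returns 0 on success, -1 if the buffer is too small. */
int table_def_path(char *out, size_t out_size, const char *dir, const char *table)
{
    int n = snprintf(out, out_size, "%s/%s.def", dir, table);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Write the CREATE TABLE text to `f`. Returns 0 on success, -1 on error. */
int table_def_write(FILE *f, const char *create_sql)
{
    assert(f != NULL && create_sql != NULL);
    size_t len = strlen(create_sql);
    if (fwrite(create_sql, 1, len, f) != len)
        return -1;
    return fflush(f) == 0 ? 0 : -1;
}

/* Read the whole stream (opened in binary mode) from its start into a
 * malloc'd, NUL-terminated string. The caller frees the result.
 * Returns NULL on error. */
char *table_def_read(FILE *f)
{
    assert(f != NULL);
    if (fseek(f, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(f);
    if (size < 0 || fseek(f, 0, SEEK_SET) != 0)
        return NULL;

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL)
        return NULL;
    if (fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        return NULL;
    }
    buf[size] = '\0';
    /* A real loader would now pass `buf` to the CREATE TABLE parser. */
    return buf;
}
```

Keeping the on-disk form as plain SQL text means the file can be inspected and edited by hand, at the cost of re-parsing on every table load.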


>
> Basically, what would be cool is if the embedded wrapper code controls the
> following:
>
> 1. The types of data stored.
>   - We can start with a few very basic types.
> 2. The format of a record in RAM
>   - This is the same format that PBXT uses on disk, as long as the records
> are fixed length
>   - For variable length records it uses a simple serialization method (as I
> mentioned before)
> 3. The format of index records
>   - with an interface to get and set data in a row, the engine does not
> need to know the actual format
> 4. The comparison of data types
>   - the wrapper provides routines to compare data types.
>   - These are mostly methods which are part of the data dictionary in RAM
> 5. The format of the data dictionary on disk, and in RAM
>   - the wrapper reads and writes this data.
>
> This will give us great flexibility to add data types and other
> complexities later.
>
Sounds great, especially with a data type registry for supplying
custom data types at runtime.
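
A rough sketch of what such a registry might look like, assuming the wrapper owns the comparison routines as described in point 4 above. All names here (`emb_type`, `emb_register_type`, `emb_find_type`, the two comparators) and the fixed table size are invented for illustration and do not correspond to existing PBXT code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Comparator: <0 if a sorts before b, 0 if equal, >0 if a sorts after b. */
typedef int (*emb_compare_fn)(const void *a, const void *b, size_t len);

/* One registered data type (hypothetical layout). */
typedef struct {
    const char    *name;    /* must outlive the registry, e.g. a literal */
    size_t         size;    /* fixed length of a value in bytes */
    emb_compare_fn compare;
} emb_type;

#define EMB_MAX_TYPES 16
static emb_type emb_registry[EMB_MAX_TYPES];
static size_t   emb_type_count = 0;

/* Look a type up by name; returns NULL if it is not registered. */
const emb_type *emb_find_type(const char *name)
{
    for (size_t i = 0; i < emb_type_count; i++) {
        if (strcmp(emb_registry[i].name, name) == 0)
            return &emb_registry[i];
    }
    return NULL;
}

/* Register a type at runtime. Returns 0 on success, -1 on bad arguments,
 * a duplicate name, or a full registry. */
int emb_register_type(const char *name, size_t size, emb_compare_fn compare)
{
    if (name == NULL || compare == NULL || size == 0)
        return -1;
    if (emb_type_count == EMB_MAX_TYPES || emb_find_type(name) != NULL)
        return -1;

    emb_registry[emb_type_count].name = name;
    emb_registry[emb_type_count].size = size;
    emb_registry[emb_type_count].compare = compare;
    emb_type_count++;
    return 0;
}

/* Comparator for 32-bit signed integers stored in host byte order. */
int emb_compare_int32(const void *a, const void *b, size_t len)
{
    int32_t x, y;
    assert(len == sizeof(int32_t));
    memcpy(&x, a, sizeof x);   /* memcpy avoids unaligned reads */
    memcpy(&y, b, sizeof y);
    return (x > y) - (x < y);
}

/* Comparator for fixed-length byte strings. */
int emb_compare_bytes(const void *a, const void *b, size_t len)
{
    return memcmp(a, b, len);
}
```

With this shape the engine would only call `type->compare(a, b, type->size)` when ordering index entries and would never need to interpret the bytes itself.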


>
> It is also pretty much the division of work between MySQL code and PBXT
> today. However, the division is not so clear in the code.
>
>
>> - else, should table serialization / deserialization be pluggable or even
>> be purely programmatic? I am fine with this, too, as it is an _embedded_
>> library and I'd guess most people will control pbxt programmatically anyway
>>
>
> Although I spoke mainly about the textual interface above, I am really
> flexible on this. I think both solutions have their advantages.
>
> Use whichever is best and easiest for you at the moment, which may be
> simply writing your own stuff! :)

I guess hardcoding table definitions will be the easiest way for now.
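
For illustration, a hardcoded definition could be as simple as a static array of column descriptors, restricted to fixed-length types so that the record size and column offsets follow directly. The struct names, the `customer` table, and its columns are all made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* A few very basic fixed-length column kinds (hypothetical). */
typedef enum { COL_INT32, COL_INT64, COL_CHAR } col_kind;

typedef struct {
    const char *name;
    col_kind    kind;
    size_t      length;   /* bytes occupied in a fixed-length record */
} column_def;

typedef struct {
    const char       *name;
    const column_def *columns;
    size_t            column_count;
} table_def;

/* Example table, defined entirely in code. */
static const column_def customer_columns[] = {
    { "id",   COL_INT64, 8  },
    { "age",  COL_INT32, 4  },
    { "name", COL_CHAR,  32 },
};

static const table_def customer_table = {
    "customer",
    customer_columns,
    sizeof customer_columns / sizeof customer_columns[0],
};

/* Byte offset of column `index` within a packed fixed-length record. */
size_t column_offset(const table_def *t, size_t index)
{
    assert(t != NULL && index < t->column_count);
    size_t offset = 0;
    for (size_t i = 0; i < index; i++)
        offset += t->columns[i].length;
    return offset;
}

/* Total size in bytes of one packed fixed-length record. */
size_t table_record_size(const table_def *t)
{
    assert(t != NULL);
    size_t total = 0;
    for (size_t i = 0; i < t->column_count; i++)
        total += t->columns[i].length;
    return total;
}
```

Variable-length columns would need the serialization step mentioned earlier in the thread, so this only covers the fixed-length case.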


>
>
>> - library naming: are you fine with libembpbxt?
>>
>
> Yup, that sounds good.

Cool!


Gn8,
Martin
