pbxt-discuss team mailing list archive
Message #00018
Re: goal 0 of embedde pbxt reached || (was: Re: PBXT: Embedded Database (library))
Hi Martin,
On Feb 11, 2010, at 5:12 PM, Martin Scholl wrote:
As this is my first posting, I'd like to say "hello" to you all.
On Thu, Feb 11, 2010 at 4:19 PM, Paul McCullagh <paul.mccullagh@xxxxxxxxxxxxx> wrote:
[snip]
I guess the next step would be to define an interface, and begin
the implementation.
Besides the missing bloody THD- and codeset / charset-related
stuff... :-)
Even more, you will notice a lot of assert(0)s in the current
version...
To be clear: the current state of the code is solely to "make it
compile" instead of "be maintainable" or even "be clean". Please keep
this in mind when reading my "code".
Understood, but it is a good start...
This raises the question of whether to use the MySQL handler
interface, or to go in and replace ha_pbxt.cc altogether.
I'd propose to skip ha_pbxt.cc altogether and stick with a dedicated
(and probably simplified) embedded API. ha_pbxt.cc is not part of the
current build-set anyway. :-)
Yes, OK.
One of the things I like the most about libraries like Tokyo Cabinet
is their straightforward API. I would love to see embedded PBXT be
as easy as this as well.
Absolutely agree. As few API calls as possible, and they should be
easy to understand.
Embedded InnoDB's API might be a good start and reference for an API
sketch: http://www.innodb.com/doc/embedded_innodb-1.0/
Yes, I read that through again. Most of it could be taken over 1 to 1.
IMHO what is open and where I would really appreciate your
feedback / comments:
- what language should the API be in? C or C++?
Well, C has the advantage that it is easy to put a C++ wrapper around
if you want to, the other way around is tricky. So unless there is a
good reason, I would recommend a C API.
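A minimal sketch of what such a C-first interface could look like, assuming an opaque handle and status codes. Every name here (embpbxt_db, embpbxt_open, embpbxt_path, embpbxt_close) is hypothetical and does not exist in PBXT; the body only stores the path so that the example is complete:

```c
/* Hypothetical sketch only: none of these names exist in PBXT. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef __cplusplus
extern "C" {            /* lets a C++ wrapper include the same header */
#endif

typedef struct embpbxt_db embpbxt_db;   /* opaque to callers */

typedef enum {
    EMBPBXT_OK = 0,
    EMBPBXT_ERR_NOMEM,
    EMBPBXT_ERR_ARG
} embpbxt_status;

embpbxt_status embpbxt_open(const char *path, embpbxt_db **out);
const char *embpbxt_path(const embpbxt_db *db);
void embpbxt_close(embpbxt_db *db);

#ifdef __cplusplus
}
#endif

/* Private part: callers never see the layout of the handle. */
struct embpbxt_db {
    char *path;
};

embpbxt_status embpbxt_open(const char *path, embpbxt_db **out)
{
    if (path == NULL || out == NULL)
        return EMBPBXT_ERR_ARG;

    embpbxt_db *db = malloc(sizeof *db);
    if (db == NULL)
        return EMBPBXT_ERR_NOMEM;

    size_t n = strlen(path) + 1;
    db->path = malloc(n);
    if (db->path == NULL) {
        free(db);
        return EMBPBXT_ERR_NOMEM;
    }
    memcpy(db->path, path, n);
    *out = db;
    return EMBPBXT_OK;
}

const char *embpbxt_path(const embpbxt_db *db)
{
    assert(db != NULL);
    return db->path;
}

void embpbxt_close(embpbxt_db *db)
{
    if (db == NULL)
        return;          /* closing NULL is a no-op, like free() */
    free(db->path);
    free(db);
}
```

Because the struct layout stays private, a C++ class could wrap the handle in a constructor/destructor pair without the C side changing.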
- In which format shall we store the table/db definitions? protobuf
maybe? AFAIR Drizzle does so, so we could borrow some code there...
protobuf may be overkill for the initial implementation.
How are you planning to do CREATE TABLE? The InnoDB API does it by
building a create table structure with various API calls.
By submitting a CREATE TABLE statement as text, you can save a lot of
API routines.
PBXT already has a parser for CREATE (and ALTER) table statements. So,
you could accept the text and feed the parser.
Then, you could actually store the table definition as a CREATE TABLE
statement. When the table is loaded you just invoke the parser. The
CREATE TABLE text could be stored in a separate file, like the .frm
file, for each table.
Alternatively the text could be stored in the header of the .xtd file,
where I already store the foreign key information (the foreign key
information is actually stored as SQL text).
However, this may be going too far with the integration of the
embedded code and PBXT itself.
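A minimal sketch of the "one definition file per table" idea, assuming the CREATE TABLE text is written verbatim and read back before being handed to a parser. The ".def" extension and the function names save_table_def / load_table_def are hypothetical; PBXT's own parser and the .xtd header are not touched here:

```c
/* Hypothetical sketch: store and reload a table definition as SQL text. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<dir>/<table>.def"; returns 0 on success, -1 if it does not fit. */
static int def_path(char *buf, size_t size, const char *dir, const char *table)
{
    assert(dir != NULL && table != NULL);
    int n = snprintf(buf, size, "%s/%s.def", dir, table);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Write the CREATE TABLE text as is; returns 0 on success, -1 on error. */
int save_table_def(const char *dir, const char *table, const char *sql)
{
    char path[512];
    if (def_path(path, sizeof path, dir, table) != 0)
        return -1;

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    size_t len = strlen(sql);
    int ok = fwrite(sql, 1, len, f) == len;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the text back; the caller frees the result. NULL on any error. */
char *load_table_def(const char *dir, const char *table)
{
    char path[512];
    if (def_path(path, sizeof path, dir, table) != 0)
        return NULL;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }

    char *sql = malloc((size_t)size + 1);
    if (sql == NULL) { fclose(f); return NULL; }

    size_t got = fread(sql, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) { free(sql); return NULL; }

    sql[size] = '\0';   /* terminate so it can be fed to a text parser */
    return sql;
}
```

On table load, the returned string would be what gets passed to the CREATE TABLE parser.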
Basically, what would be cool is if the embedded wrapper code controls
the following:
1. The types of data stored.
- We can start with a few very basic types.
2. The format of a record in RAM
- This is the same format that PBXT uses on disk, as long as the
records are fixed length
- For variable length records it uses a simple serialization
method (as I mentioned before)
3. The format of index records
- with an interface to get and set data in a row, the engine does
not need to know the actual format
4. The comparison of data types
- the wrapper provides routines to compare data types.
- These are mostly methods which are part of the data dictionary
in RAM
5. The format of the data dictionary on disk, and in RAM
- the wrapper reads and writes this data.
This will give us great flexibility to add data types and other
complexities later.
It is also pretty much the division of work between MySQL code and
PBXT today. However, the division is not so clear in the code.
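A minimal sketch of point 4, assuming the wrapper registers a comparison routine per data type in its in-RAM dictionary and the engine only calls through that pointer. The names emb_type, emb_compare_fn and emb_compare_keys, and the two example types, are hypothetical:

```c
/* Hypothetical sketch: the wrapper owns the types, the engine only compares. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int (*emb_compare_fn)(const void *a, const void *b, size_t len);

/* One data dictionary entry in RAM, supplied by the wrapper. */
typedef struct {
    const char    *name;
    size_t         size;      /* fixed width in bytes */
    emb_compare_fn compare;
} emb_type;

static int cmp_int32(const void *a, const void *b, size_t len)
{
    int32_t x, y;
    (void)len;                /* width is implied by the type */
    memcpy(&x, a, sizeof x);  /* memcpy avoids unaligned reads */
    memcpy(&y, b, sizeof y);
    return (x > y) - (x < y);
}

static int cmp_bytes(const void *a, const void *b, size_t len)
{
    int r = memcmp(a, b, len);
    return (r > 0) - (r < 0); /* normalize to -1, 0 or 1 */
}

static const emb_type EMB_INT32 = { "int32", sizeof(int32_t), cmp_int32 };
static const emb_type EMB_CHAR8 = { "char8", 8, cmp_bytes };

/* Engine side: orders two keys without knowing what they contain. */
int emb_compare_keys(const emb_type *type, const void *a, const void *b)
{
    assert(type != NULL && type->compare != NULL);
    return type->compare(a, b, type->size);
}
```

Adding a new data type later would then mean adding one more emb_type entry on the wrapper side, with no change to the code that walks the index.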
- else, should table serialization / deserialization be pluggable or
even be purely programmatic? I am fine with this, too, as it is an
_embedded_ library and I'd guess most people will control PBXT
programmatically anyway
Although I spoke mainly about the textual interface above, I am really
flexible on this. I think both solutions have their advantages.
Use whichever is best and easiest for you at the moment, which may be
simply writing your own stuff! :)
- library naming: are you fine with libembpbxt?
Yup, that sounds good.
If you use the handler interface, then you will have to continue to
simulate MySQL, which may not suit the API (you will have to call
the handler functions in the same order that MySQL does).
If you replace ha_pbxt, then you will nevertheless have to include
some of the functionality in this code. For example, you should
take over the init and shutdown code.
What you need to keep is the "cursor" type paradigm.
What I mean is, to do an index or table scan you do the following:
- open a cursor for a table
* which means grab an XTOpenTable from the table pool
- call init
* Initialize the scan.
- Call search and next in a loop.
- call exit
* Free resources
- close the cursor
* which means return the open table to the pool
All such actions need to be enclosed in a:
- begin transaction
...
- commit/rollback transaction
The transaction is per thread, and all relevant information is
stored in the XTThread structure.
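A minimal sketch of the call order described above, using a toy in-memory table of integers. All toy_* names are hypothetical stand-ins (toy_cursor for the open table taken from the pool, toy_thread for the per-thread transaction state); none of this is PBXT's real API, and the "search" step is left out so that the loop only calls next:

```c
/* Hypothetical sketch of the cursor paradigm on a toy in-memory table. */
#include <assert.h>
#include <stddef.h>

typedef struct { const int *rows; size_t count; } toy_table;
typedef struct { int in_txn; } toy_thread;   /* per-thread transaction state */
typedef struct {
    const toy_table *table;
    size_t           pos;
    int              scanning;
} toy_cursor;

void toy_begin(toy_thread *t)  { assert(!t->in_txn); t->in_txn = 1; }
void toy_commit(toy_thread *t) { assert(t->in_txn);  t->in_txn = 0; }

/* open/close: take the table from, and return it to, the "pool" */
void toy_cursor_open(toy_cursor *c, const toy_table *tab)
{
    c->table = tab;
    c->pos = 0;
    c->scanning = 0;
}
void toy_cursor_close(toy_cursor *c) { c->table = NULL; }

/* init/exit: set up and tear down one scan on an open cursor */
void toy_scan_init(toy_cursor *c) { c->pos = 0; c->scanning = 1; }
void toy_scan_exit(toy_cursor *c) { c->scanning = 0; }

/* next: returns 1 and stores a row, or 0 at the end of the table */
int toy_scan_next(toy_cursor *c, int *out)
{
    assert(c->scanning);
    if (c->pos >= c->table->count)
        return 0;
    *out = c->table->rows[c->pos++];
    return 1;
}

/* A full table scan, enclosed in a transaction, in the order listed above. */
long toy_sum_table(toy_thread *thr, const toy_table *tab)
{
    toy_cursor cur;
    long sum = 0;
    int value;

    toy_begin(thr);
    toy_cursor_open(&cur, tab);
    toy_scan_init(&cur);
    while (toy_scan_next(&cur, &value))
        sum += value;
    toy_scan_exit(&cur);
    toy_cursor_close(&cur);
    toy_commit(thr);
    return sum;
}
```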
Ok, a lot of open questions are answered by this. Thank you, Paul!
[snip]
Martin
P.S.: I will set up a TODO file to make it easier to track embedded
PBXT's progress
OK, great.
--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com