
pbxt-discuss team mailing list archive

Re: Row buffers and Objects

 


On May 27, 2010, at 4:51 PM, Stewart Smith wrote:

On Thu, 27 May 2010 12:36:46 +0200, Paul McCullagh <paul.mccullagh@xxxxxxxxxxxxx > wrote:
On May 26, 2010, at 5:26 AM, Stewart Smith wrote:
class Tuple
{
public:
  Tuple(const Tuple &t)
  {
    for (int i= 0; i < t.nr_columns(); i++)
      set_column(i, t.column(i));
  }
};

Currently, I can do this with one memcpy, in the cases when PBXT is
using a fixed length record structure.

But, I guess this is a small price to pay for proper encapsulation.

You could probably continue to do so, but the above would be the generic
solution.

Well, if I could, that would be great. I am thinking of the case where you have a row consisting of 100 integers.

That would mean the difference between one memcpy of 400 bytes and 100 memcpys of 4 bytes each. A big difference in this case.

I would need a method that returns a pointer to the tuple buffer, and the length of the buffer. The method could return NULL if the tuple has no buffer, and then I could use the standard copy method.
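
A minimal sketch of such an accessor, to make the idea concrete. Everything in it is an assumption for illustration: the names raw_buffer(), copy_from() and set_contiguous() are hypothetical, and the tuple is simplified to hold only 4-byte integers. It is not an existing Drizzle or PBXT interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical Tuple holding only 4-byte integer columns.
class Tuple
{
public:
  explicit Tuple(size_t nr_columns) : cols_(nr_columns, 0), contiguous_(true) {}

  size_t nr_columns() const { return cols_.size(); }
  int32_t column(size_t i) const { return cols_[i]; }
  void set_column(size_t i, int32_t v) { cols_[i] = v; }

  // Test hook (hypothetical): pretend this tuple has no single row buffer.
  void set_contiguous(bool c) { contiguous_ = c; }

  // Returns a pointer to the row buffer and stores its length in *len,
  // or returns nullptr when the tuple has no contiguous buffer.
  const void *raw_buffer(size_t *len) const
  {
    if (!contiguous_)
      return nullptr;
    *len = cols_.size() * sizeof(int32_t);
    return cols_.data();
  }

  // Copies src into this tuple. Uses one memcpy when both sides have a
  // buffer, otherwise falls back to the generic column-by-column copy.
  // Returns true if the single-memcpy path was taken.
  bool copy_from(const Tuple &src)
  {
    assert(nr_columns() == src.nr_columns());
    size_t src_len = 0;
    const void *from = src.raw_buffer(&src_len);
    if (from != nullptr && contiguous_)
    {
      std::memcpy(cols_.data(), from, src_len);
      return true;
    }
    for (size_t i = 0; i < src.nr_columns(); i++)
      set_column(i, src.column(i));
    return false;
  }

private:
  std::vector<int32_t> cols_;
  bool contiguous_;
};
```

With a row of 100 integers, copy_from() moves 400 bytes in one call on the fast path, and the result is the same on the slow path.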

(although some thought needs to be given to the endian problem -
currently byte order of data in the MySQL row buffer is in a processor
independent format, and can therefore be stored on disk without
further conversion).

Hrrm... probably just the same as today... Although I certainly would
not be placing bets on the endian independence currently properly working in all
cases (in MySQL or Drizzle).

I think there are ways to do it properly without too much hassle though.

For example:

int Tuple::column_endian(int colnr)

Based on the result, the engine would know whether to convert or not.
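
A rough sketch of how an engine might act on that result. The ColumnEndian enum and the helpers host_endian(), swap32() and read_uint32_column() are assumptions made up for this example. The thread only proposes that column_endian() returns an int, and says nothing about its values.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical values that a call like Tuple::column_endian(colnr) could return.
enum ColumnEndian { COLUMN_LITTLE_ENDIAN, COLUMN_BIG_ENDIAN };

// Detect the byte order of the machine at run time.
inline ColumnEndian host_endian()
{
  const uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1 ? COLUMN_LITTLE_ENDIAN : COLUMN_BIG_ENDIAN;
}

// Reverse the four bytes of a 32-bit value.
inline uint32_t swap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Read a 4-byte integer column whose bytes are stored in stored_as order.
// The bytes are swapped only when the stored order differs from the host.
inline uint32_t read_uint32_column(const unsigned char *buf, ColumnEndian stored_as)
{
  assert(buf != nullptr);
  uint32_t v;
  std::memcpy(&v, buf, sizeof(v));
  return stored_as == host_endian() ? v : swap32(v);
}
```

When the stored order matches the host, the engine can use the bytes as they are. The conversion cost is paid only for columns where the orders differ.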


All I then need is for Drizzle to provide comparison routines for each
column.

In this way the engine does not need to know anything about the data,
and the interpretation of the data is always in sync with the server.
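
To illustrate, here is a small sketch of an engine comparing rows purely through server-supplied routines. The ColumnCompare signature, the ColumnInfo struct and the two sample comparators are hypothetical and exist only for this example. Nothing here is an actual Drizzle interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical signature of a server-supplied comparison routine.
// Returns <0, 0 or >0, like memcmp.
typedef int (*ColumnCompare)(const unsigned char *a, const unsigned char *b,
                             size_t len);

// Example comparator the server might supply for a signed 4-byte integer
// stored in host byte order.
inline int compare_int32(const unsigned char *a, const unsigned char *b, size_t len)
{
  assert(len == sizeof(int32_t));
  int32_t x, y;
  std::memcpy(&x, a, sizeof(x));
  std::memcpy(&y, b, sizeof(y));
  return (x > y) - (x < y);
}

// Example comparator for plain binary data.
inline int compare_binary(const unsigned char *a, const unsigned char *b, size_t len)
{
  return std::memcmp(a, b, len);
}

// What the engine would hold per column: where the bytes are and how to
// compare them. It never looks at the column type itself.
struct ColumnInfo
{
  size_t offset;
  size_t length;
  ColumnCompare compare;
};

// Engine side: compare two rows column by column.
inline int compare_rows(const unsigned char *r1, const unsigned char *r2,
                        const ColumnInfo *cols, size_t nr_columns)
{
  for (size_t i = 0; i < nr_columns; i++)
  {
    int c = cols[i].compare(r1 + cols[i].offset, r2 + cols[i].offset,
                            cols[i].length);
    if (c != 0)
      return c;
  }
  return 0;
}
```

Because the ordering logic lives in the routines the server hands over, a change in how the server interprets a column is picked up by the engine without any change to the engine code.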

yep.


This is good. And it fits into what Brian is suggesting: turning the
record array into something like this:

Tuple record[2]

Personally, I'd like this to end up just being a memory pool instead, as
a lot of the time you don't actually need 2 records and it's just a
waste of memory (especially for large tables).

Yup, but a pool introduces another semaphore. And, there is already a Handler pool.

but yeah, as an intermediate step, it's probably what will happen.

Makes sense.

Although this may present a problem with regard to the scope of
validity of Tuples returned by the engine.

The best for the moment would be that the Tuple is valid until the
next call to the engine on the Cursor that returned the Tuple.

agreed.

I also wouldn't mind a mechanism that could implement either:
a) copying of the tuple
b) retain/release

Yes, this would be useful. And easy, as long as Tuples are not shared between threads.

so that if the upper layer did need it for longer it could ask the
engine to keep it around and you'd either get a copy (by default) or if
the engine is clever, just a reference to the same bit of memory.

Yes, the engine may be able to do this without making a copy. That would be an important optimization.
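
A possible shape for this, sketched under stated assumptions: the retain(), release() and ref_count() names and the SharedTuple class are hypothetical, and the reference count is a plain int because the tuples are assumed never to be shared between threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Tuple with retain/release. Not thread-safe by design.
class Tuple
{
public:
  explicit Tuple(const std::vector<int> &values) : values_(values), refs_(1) {}

  int column(size_t i) const { return values_[i]; }
  int ref_count() const { return refs_; }

  // Default behaviour: hand back an independent copy that the caller
  // must release() when done.
  virtual Tuple *retain() { return new Tuple(values_); }

  // Drop one reference and free the tuple when the last one goes away.
  void release()
  {
    assert(refs_ > 0);
    if (--refs_ == 0)
      delete this;
  }

protected:
  // Protected so that tuples can only be destroyed through release().
  virtual ~Tuple() {}

  std::vector<int> values_;
  int refs_;
};

// A "clever" engine tuple: retain() makes no copy and returns a reference
// to the same memory with the count bumped.
class SharedTuple : public Tuple
{
public:
  explicit SharedTuple(const std::vector<int> &values) : Tuple(values) {}

  Tuple *retain() override
  {
    ++refs_;
    return this;
  }
};
```

The upper layer calls retain() and release() the same way in both cases, so an engine can switch from copying to sharing without the caller noticing.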


--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com





