pbxt-discuss team mailing list archive

Re: Row buffers and Object (was Re: free_table_share() != drizzle)

To: Paul McCullagh <paul.mccullagh@xxxxxxxxxxxxx>
From: Stewart Smith <stewart@xxxxxxxxxxxxxxxx>
Date: Wed, 26 May 2010 13:26:26 +1000
Cc: PBXT Discuss <pbxt-discuss@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <E528CBD9-A7A0-492C-AB15-66CD520DD75D@primebase.org>
User-agent: Notmuch/0.3.1-17-gc50524e (http://notmuchmail.org) Emacs/23.1.1 (x86_64-pc-linux-gnu)

On Tue, 25 May 2010 23:41:55 +0200, Paul McCullagh <paul.mccullagh@xxxxxxxxxxxxx> wrote:
> > I've been thinking of a Tuple object instead, so we don't always have
> > to have the entire row buffer around if only a subset of rows are
> > operated on (this could be more advantageous for column based engines).
> 
> I guess you mean a "subset of columns".

yep.

> I guess by "tuple" you just mean an object where I can get and set  
> field values but don't know how they are stored. Or do you mean  
> something else?

yes. the code manipulating the tuple doesn't care how they are stored.

> > I'm also thinking that Tuple would just be an interface, and engines
> > could provide their own implementation.
> >
> > With accessor methods, then the main server code could examine rows in
> > their native format instead of always having to convert from engine to
> > server formats. We would then only convert rows to over-the-wire
> > (server) format when they go over the wire.
> 
> This would mean the engine would supply the storage for the row. This  
> may indeed be more efficient.

It could do. e.g. if it provides an implementation that reads straight
from the buffer pool we would avoid a copy.

We would need pretty clear semantics about who owns what bit of memory
though :)
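
To make the "avoid a copy" case a bit more concrete, here is a rough
sketch of a tuple that is only a view onto engine-owned memory. All the
names (BufferPoolPage, RowView, PageCursor) and the fixed-width row
layout are made up for illustration; none of this is PBXT or Drizzle
code. The ownership rule assumed here is that the engine owns the page
and the view is only good until the next call to next().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: fixed-width rows stored back to back in engine-owned memory.
struct BufferPoolPage
{
  std::size_t row_size;
  std::vector<unsigned char> bytes;
};

// Non-owning view: it never copies or frees the row it points at.
class RowView
{
public:
  RowView() : row(0), size(0) {}
  RowView(const unsigned char *r, std::size_t s) : row(r), size(s) {}
  const unsigned char *data() const { return row; }
  std::size_t length() const { return size; }
private:
  const unsigned char *row;
  std::size_t size;
};

// The cursor owns nothing either; the page must outlive it.
// Assumed rule: a view from next() is valid only until the following next().
class PageCursor
{
public:
  explicit PageCursor(const BufferPoolPage &p) : page(p), pos(0)
  {
    assert(page.row_size > 0);
  }
  bool next(RowView &out)
  {
    if (pos + page.row_size > page.bytes.size())
      return false;
    out = RowView(&page.bytes[pos], page.row_size);
    pos += page.row_size;
    return true;
  }
private:
  const BufferPoolPage &page;
  std::size_t pos;
};

// Walk the page and check that every view points into the page itself.
std::size_t count_rows_without_copy(const BufferPoolPage &page)
{
  PageCursor cursor(page);
  RowView view;
  std::size_t n = 0;
  while (cursor.next(view))
  {
    assert(view.data() == &page.bytes[n * page.row_size]);
    n++;
  }
  return n;
}
```

Anything that wants to keep a row past the next fetch would have to copy
it out explicitly, which is the part that needs the clear rules.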

> The engine already provides storage for the BLOBs, so I guess it would  
> not be a problem if the tuple has the same scope as BLOBs today (which  
> means the tuple is valid until the next row is retrieved on the
> cursor).

We'll have to have some interesting rules no doubt... but 

> But, how would you then handle doInsertRow()?

Server provides something using its format. e.g. when we have magical
(and not sucky) prepared statements, we could send rows over the wire
in a binary format and pass a Tuple to the engine that is natively in
this format. Then the conversion happens straight to whatever the engine
wants. No intermediary server row format.

> The natural way to handle this would be for the front-end to provide  
> the tuple to be inserted.

yep.

> Would you add a call: getTuple(), which returns an empty tuple, and  
> the call setField() for each column on the tuple and then call  
> doInsertRow(tuple)?
> 
> Seems rather awkward to me. Or do you have a different idea?

I'd imagine something like this:

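class Tuple
{
public:
        virtual ~Tuple() {}
        virtual int nr_columns() const = 0;
        virtual Value column(int colnr) const = 0;
        virtual int set_column(int colnr, Value v) = 0;

protected:
        /* Copy every column from another tuple, whatever its format.
           Not done in Tuple's constructor: virtual calls made there
           would not reach the derived set_column(). */
        void copy_from(const Tuple &t)
        { for (int i= 0; i < t.nr_columns(); i++) { set_column(i, t.column(i)); } }
};

class OverTheWireTuple : public Tuple { /* ... */ };

class PBXTTuple : public Tuple
{
public:
        PBXTTuple(const Tuple &t) { copy_from(t); }

        int set_column(int colnr, Value v)
        {
          /* convert value into pbxt format and store in this PBXTTuple */
          return 0;
        }
        /* nr_columns() and column() omitted */
};

int PBXTCursor::doInsertRecord(Tuple &tuple)
{
        PBXTTuple pbxt_tuple(tuple);
        pbxt_write_row(pbxt_tuple);
        return 0;
}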


Where the upper layer has read a tuple off the wire and constructed an
OverTheWireTuple, which then gets handed to PBXT. Because PBXT doesn't
want it in that format, it converts it to its format.

For reading a row, PBXT hands back a PBXTTuple, so for WHERE conditions
and the like the upper layer just checks the value in the
PBXTTuple. Only if the row is going back to the user over the wire
does it need to be converted into an OverTheWireTuple.

A temp only engine could just use the OverTheWire format and *never* do
a conversion.
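
A slightly fuller, compilable sketch of the same idea, in case it helps.
Everything beyond the Tuple and OverTheWireTuple names is an assumption
for illustration: Value is just a std::string here, EngineTuple and its
length-prefixed buffer stand in for whatever an engine's native format
would really be, and column_equals() plays the part of the upper layer
checking a WHERE condition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a column value; a real server needs typed values.
typedef std::string Value;

// Interface only: callers get and set columns without knowing the storage.
class Tuple
{
public:
  virtual ~Tuple() {}
  virtual std::size_t nr_columns() const = 0;
  virtual Value column(std::size_t colnr) const = 0;
  virtual void set_column(std::size_t colnr, const Value &v) = 0;
};

// "Server" format: here simply one string per column.
class OverTheWireTuple : public Tuple
{
public:
  explicit OverTheWireTuple(std::size_t n) : cols(n) {}
  std::size_t nr_columns() const { return cols.size(); }
  Value column(std::size_t colnr) const { return cols[colnr]; }
  void set_column(std::size_t colnr, const Value &v) { cols[colnr] = v; }
private:
  std::vector<Value> cols;
};

// Invented "engine" format: one byte buffer of length-prefixed columns.
class EngineTuple : public Tuple
{
public:
  // Converting constructor: copy column by column from any other Tuple.
  explicit EngineTuple(const Tuple &t)
  {
    for (std::size_t i = 0; i < t.nr_columns(); i++)
      append(t.column(i));
  }
  std::size_t nr_columns() const { return offsets.size(); }
  Value column(std::size_t colnr) const
  {
    std::size_t off = offsets[colnr];
    std::size_t len = static_cast<unsigned char>(buf[off]);
    return buf.substr(off + 1, len);
  }
  void set_column(std::size_t colnr, const Value &v)
  {
    // Simple but slow: rebuild the buffer with the one column replaced.
    std::vector<Value> vals;
    for (std::size_t i = 0; i < nr_columns(); i++)
      vals.push_back(i == colnr ? v : column(i));
    buf.clear();
    offsets.clear();
    for (std::size_t i = 0; i < vals.size(); i++)
      append(vals[i]);
  }
private:
  void append(const Value &v)
  {
    assert(v.size() < 256);  // one-byte length prefix in this toy format
    offsets.push_back(buf.size());
    buf.push_back(static_cast<char>(v.size()));
    buf += v;
  }
  std::string buf;
  std::vector<std::size_t> offsets;
};

// The upper layer evaluates a condition through the interface only.
bool column_equals(const Tuple &t, std::size_t colnr, const Value &expected)
{
  return t.column(colnr) == expected;
}
```

The conversion happens exactly once, in the EngineTuple constructor, and
column_equals() works on either format without knowing which one it got.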


> >> So the Row object is a "per thread" object. You call
> >> row_object->setBuffer(u_char *buf) to set the row buffer and then you
> >> use an array of fields to get and set the data in the buffer.
> >
> > We could do RowBuffer(row_buffer, ptr) instead, so that it's
> > impossible to ever forget to set the buffer back to something.
> 
> I am not sure what you mean here. Is RowBuffer() a constructor? What  
> is row_buffer, and what is ptr in this case?

yeah, a class that is just an accessor to a row buffer - you point it at
a bit of memory.
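
One possible reading of that, as a sketch only: the accessor is bound to
its buffer in the constructor and has no setBuffer() at all. RowLayout,
RowAccessor and column_changed() are hypothetical names, and the layout
(32-bit columns at fixed byte offsets) is assumed purely to keep the
example small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout description: byte offset of each 32-bit column.
struct RowLayout
{
  std::vector<std::size_t> offsets;
  std::size_t row_size;
};

// Accessor bound to one buffer for its whole lifetime. There is no
// setBuffer(), so there is nothing to remember to set back afterwards.
class RowAccessor
{
public:
  RowAccessor(const RowLayout &l, unsigned char *buf) : layout(l), ptr(buf) {}
  std::int32_t get(std::size_t colnr) const
  {
    check(colnr);
    std::int32_t v;
    std::memcpy(&v, ptr + layout.offsets[colnr], sizeof(v));
    return v;
  }
  void set(std::size_t colnr, std::int32_t v)
  {
    check(colnr);
    std::memcpy(ptr + layout.offsets[colnr], &v, sizeof(v));
  }
private:
  // Guard against reading or writing outside the row.
  void check(std::size_t colnr) const
  {
    assert(colnr < layout.offsets.size());
    assert(layout.offsets[colnr] + sizeof(std::int32_t) <= layout.row_size);
  }
  const RowLayout &layout;
  unsigned char *ptr;
};

// Compare one column of an "old" and a "new" row image, as an update might.
bool column_changed(const RowLayout &layout, unsigned char *old_row,
                    unsigned char *new_row, std::size_t colnr)
{
  RowAccessor before(layout, old_row);
  RowAccessor after(layout, new_row);
  return before.get(colnr) != after.get(colnr);
}
```

Two accessors over two buffers can be alive at once, so comparing row
images needs no switching back and forth.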

> Actually I see Field as part of record[] today by the fact that 'ptr'  
> references the record[0] buffer.

> Getting a field to reference any other buffer is a fudge at the
> moment.

Yes, but one you have to do if you're not going to manipulate things
directly :(

> But otherwise, you are right, my suggestion would mean that this
> binding between row buffer and Field becomes even more close.
> 
> Although, with setBuffer() as I propose above, all fields can be set  
> to reference a different buffer.
> 
> This is the same as move_field_offset(), but instead of changing one  
> field, you change all fields in the row at once.

Yes, this would be better. It more than casually sucks to have all these
extra CPU instructions :)

> So there are 2 possibilities:
> 
> 1. Keep one set of fields, as it is at the moment, and switch from  
> record[0] to record[1] as required, or

this is the simplest and easiest change.

> 2. Turn record[] into an array of Row objects where each row references
> a different row buffer (then no switching is required).

> There is a 3rd possibility, which will probably require a lot of code  
> change:
> 
> - Provide the row buffer, on each call to a Field method. In this  
> case, the Row object would be buffer independent, and could be used by  
> multiple threads.

This would likely be the most efficient. Keeping 2 copies of Field
objects is rather wasteful.
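
Roughly what I picture for that third option, as a sketch under
assumptions: FieldDef and column_delta() are invented names, and a single
32-bit column type is assumed. The field object holds only an offset, so
it is immutable after construction and the same instance can be used
against any buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical field descriptor: it knows where a 32-bit column lives in
// a row, but holds no pointer to any particular row buffer.
class FieldDef
{
public:
  FieldDef(std::size_t off, std::size_t row_size) : offset(off)
  {
    // Check once, at definition time, that the column fits in the row.
    assert(off + sizeof(std::int32_t) <= row_size);
  }
  std::int32_t get(const unsigned char *row) const
  {
    std::int32_t v;
    std::memcpy(&v, row + offset, sizeof(v));
    return v;
  }
  void set(unsigned char *row, std::int32_t v) const
  {
    std::memcpy(row + offset, &v, sizeof(v));
  }
private:
  const std::size_t offset;  // never changes after construction
};

// One descriptor serves both row images; nothing is switched or duplicated.
std::int32_t column_delta(const FieldDef &f, const unsigned char *old_row,
                          const unsigned char *new_row)
{
  return f.get(new_row) - f.get(old_row);
}
```

Since a FieldDef has no mutable state, sharing it between threads needs
no locking; only the row buffers themselves are per thread.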

> - Get the collation sequence of a column.
> - Do string comparison operations with field data, and the collation  
> sequence
> - Do comparison of decimal encoded values, and other special data types

Yep. I'd hope for these too. Seems several engines really want this to
work properly.

-- 
Stewart Smith
