
maria-developers team mailing list archive

Re: Sachin weekly report


Hi, Sachin!

On Aug 22, Sachin Setia wrote:
> Hi Sergei!
> Actually I have completed the work on update and delete: they now use
> an index for looking up records.
> But I think I have made a lot of changes in the optimizer which may
> break it, and there are also lots of queries where my code does not
> work; fixing all of this might take a long time. So I am thinking of
> a change to my existing code:
> Suppose a table t1
>     create table t1 (a blob, b blob, c blob, unique(a,b,c));
> In the current code, for a table like this there will be a KEY with
> only one keypart, which points to the field DB_ROW_HASH_1.
> That was fine for normal inserts, updates and deletes, but for WHERE
> optimization I have to do a lot of work: first match the field
> (like in add_key_part), then check whether all the fields in hash_str
> are present in the WHERE clause, then create keys by calculating the
> hash. I do this by checking the HA_UNIQUE_HASH flag in KEY, but I
> think this also makes the optimizer code worse because of too much
> dependence. I also need to patch the get_mm_parts and get_mm_leaf
> functions, which I think should not be patched.
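To make the dependence problem concrete, here is a minimal, hypothetical sketch (simplified structs, not the real MariaDB ones) of why a single hidden-hash keypart forces the optimizer to special-case HA_UNIQUE_HASH keys: the key can only be used when *every* hashed column is bound in the WHERE clause, since a partial set of values cannot produce the hash:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of the current scheme: the unique key has a
// single keypart pointing at the hidden DB_ROW_HASH_1 field, so the optimizer
// never sees fields a, b, c directly.
struct KeyPart { std::string field_name; };
struct Key {
  unsigned flags;                          // e.g. HA_UNIQUE_HASH
  std::vector<KeyPart> key_parts;          // only DB_ROW_HASH_1 here
  std::vector<std::string> hashed_fields;  // columns the hash actually covers
};

const unsigned HA_UNIQUE_HASH = 1u << 0;   // illustrative flag value

// To use the index for "WHERE a=... AND b=... AND c=...", the optimizer must
// verify that *all* hashed fields are bound; a partial match is useless
// because the hash cannot be computed from a subset of the columns.
bool can_use_hash_key(const Key &key, const std::vector<std::string> &bound) {
  if (!(key.flags & HA_UNIQUE_HASH))
    return false;
  for (const auto &f : key.hashed_fields) {
    bool found = false;
    for (const auto &b : bound)
      if (b == f) { found = true; break; }
    if (!found)
      return false;  // partial match: hash key cannot be used
  }
  return true;
}
```

This check (and the key construction that follows it) is exactly the kind of logic that otherwise has to be threaded through add_key_part, get_mm_parts and friends.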

Later today I'll know exactly what you mean, when I finish reviewing
your optimizer changes.

But for now, let's say I agree on a general principle :)
The optimizer is kinda complex and fragile, so it's good to avoid making
many changes in it - the effect might be difficult to predict.

> I am thinking of another approach to this problem at the server
> level: instead of having just one keypart we can have 1+3 keyparts.
> The last three keyparts will be for fields a, b and c, and the first
> one for DB_ROW_HASH_1. These will exist only at the server level, not
> at the storage level.
> key_info->key_part will point at the keypart containing field a,
> while the key_part holding field DB_ROW_HASH_1 will be at index -1.
> This way I do not have to patch much of the optimizer code. But there
> is one problem: what should the length of each key_part be? I thought
> of making it equal to field->pack_length(), but this does not work
> because while creating keys the optimizer calls get_key_image()
> (which copies the real data, so it can exceed pack_length() in the
> case of a blob). To make this work I have to patch the optimizer
> where it calls get_key_image() and check whether the key is
> HA_UNIQUE_HASH. If it is, then instead of get_key_image() just use
>    memcpy(key, field->ptr(), field->pack_length());
> This won't copy the actual data, but we do not need the actual data
> there. I will patch handler methods like ha_index_read,
> ha_index_idx_read and multi_range_read_info_const - basically the
> handler methods related to index or range search. In these methods I
> need to calculate the hash, which I can compute from key_ptr, but
> key_ptr does not hold the actual data (in the case of blobs etc.). So
> to get the data for the hash, I will make clones of the fields (a, b,
> c etc.) whose ptr points into key_ptr. Then the field->val_str()
> method will simply work and I can calculate the hash. I can also
> compare the returned result with the actual key in the handler method
> itself.
> What do you think of this approach?
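The "copy the image, not the data" idea above can be sketched as follows. This is a hypothetical stand-in (illustrative field layout and an FNV-1a hash, not the server's real structures or hash function): each blob's fixed-size image (length + data pointer) is memcpy'd into the key buffer, and the handler later walks the buffer, dereferences each pointer, and hashes the real data:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical sketch of the proposed "1 + N keyparts" scheme. For a blob,
// pack_length() covers only a length word plus a pointer to the data, so
// copying the image is cheap and fixed-size; the handler follows the pointer
// when it actually needs the bytes to hash.
struct BlobField {
  std::string value;  // the actual blob data
  // pack_length() analogue: 4-byte length followed by a data pointer
  static size_t pack_length() { return sizeof(uint32_t) + sizeof(const char *); }
  // Analogue of memcpy(key, field->ptr, field->pack_length()): store the
  // length and a pointer to the data, NOT the data itself.
  void make_image(unsigned char *to) const {
    uint32_t len = (uint32_t)value.size();
    std::memcpy(to, &len, sizeof(len));
    const char *p = value.data();
    std::memcpy(to + sizeof(len), &p, sizeof(p));
  }
};

// FNV-1a, standing in for the server's hash function.
uint64_t hash_bytes(uint64_t h, const void *data, size_t len) {
  const unsigned char *p = (const unsigned char *)data;
  for (size_t i = 0; i < len; i++) { h ^= p[i]; h *= 1099511628211ULL; }
  return h;
}

// Handler-side (the ha_index_read-style patch): walk the key buffer keypart
// by keypart, dereference each blob pointer, and fold the real data into the
// hash used to probe the DB_ROW_HASH_1 index.
uint64_t hash_from_key_ptr(const unsigned char *key_ptr, size_t n_parts) {
  uint64_t h = 1469598103934665603ULL;  // FNV offset basis
  for (size_t i = 0; i < n_parts; i++) {
    uint32_t len;
    std::memcpy(&len, key_ptr, sizeof(len));
    const char *data;
    std::memcpy(&data, key_ptr + sizeof(len), sizeof(data));
    h = hash_bytes(h, data, len);  // hash the actual data, not the pointer
    key_ptr += BlobField::pack_length();
  }
  return h;
}
```

In the real server the dereferencing step is what the cloned fields would do: a clone of field a whose ptr points into key_ptr lets existing methods like val_str() read the data back out without any new buffer-walking code.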

Looks simpler, I agree. The length of the keypart should not matter,
because it should never be used. Maybe it would be good to set it to -1,
as that might help to catch errors (where it is erroneously used).
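The -1 sentinel suggestion could look roughly like this (hypothetical names; the real server stores keypart lengths in a uint16 field, so -1 becomes 0xFFFF):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative sketch: give the hidden hash keypart an impossible length so
// that any code path which consumes it as a real length trips an assertion.
struct KeyPartInfo { uint16_t length; };

const uint16_t UNUSED_KEYPART_LENGTH = (uint16_t)-1;  // 0xFFFF sentinel

size_t use_keypart_length(const KeyPartInfo &kp) {
  // Reaching here with the sentinel means some code erroneously used the
  // length of the DB_ROW_HASH_1 keypart - exactly the bug we want to catch.
  assert(kp.length != UNUSED_KEYPART_LENGTH);
  return kp.length;
}
```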

I didn't understand why you need to clone fields though :(

Chief Architect MariaDB
and security@xxxxxxxxxxx
