maria-developers team mailing list archive

Thread
Date

Re: Sachin weekly report

To: Sergei Golubchik <serg@xxxxxxxxxxx>, maria-developers@xxxxxxxxxxxxxxxxxxx
From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
Date: Tue, 23 Aug 2016 16:54:41 +0530
In-reply-to: <20160823110555.GA11945@meddwl.fritz.box>

Hi Sergei!

On Tue, Aug 23, 2016 at 4:35 PM, Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
>
> Hi, Sachin!
>
> On Aug 22, Sachin Setia wrote:
> > Hi Sergei!
> >
> > Actually I completed the work on update and delete. Now they will use
> > index for looking up records.
> >
> > But I am thinking I have done a lot of changes in optimizer which may
> > break it, and also there are lots of queries where my code does not
> > work, fixing this might take a long amount of time.  I am thinking of
> > a change in my existing code :-
> > Suppose a table t1
> >     create table t1 (a blob, b blob, c blob, unique(a,b,c));
> > In current code, for query like there will a KEY with only one
> > keypart which points to field DB_ROW_HASH_1.
> > It was okay for normal updates, insert and delete, but in the case
> > of where optimization  I have do a lot of stuff, first to match field
> > (like in add_key_part), then see whether all the fields in hash_str
> > are present in where or not, then create keys by calculating hash. I
> > do this by checking  the HA_UNIQUE_HASH flag in KEY, but this also
> > makes (I think) optimizer code bad because of too much dependence.
> > Also  I need to patch get_mm_parts and get_mm_leaf function, which I
> > think should not be patched.
>
> Later today I'll know exactly what you mean, when I'll finish
> reviewing your optimizer changes.
>
> But for now, let's say I agree on a general principle :)
> Optimizer is kinda complex and fragile, so it's good to avoid doing many
> changes in it - the effect might be difficult to predict.
>
> > I am thinking of a another approach to this problem at server level
> > instead of having just one keypart we can have 1+3 keypart. Last three
> > keyparts will be for field a, b, c and first one for
> > DB_ROW_HASH_1.These will be only at server level not at storage level.
> > key_info->key_part will point at keypart containing field a, while
> > key_part having field DB_ROW_HASH_1 will -1 index. By this way I do
> > not have to patch more of optimizer code. But there is one problem,
> > what should be the length of key_part? I am thinking of it equal to
> > field->pack_length(), this would not work because while creating keys
> > optimizer calls get_key_image() (which is real data so can exceed
> > pack_lenght() in case of blob), so to get this work I have to patch
> > optimizer  where it calls  get_key_image() and see if key is
> > HA_UNIQUE_HASH. If yes then instead of get_key_image just use
> >    memcpy(key, field->ptr(), field->pack_length());
> > this wont copy the actual data, but we do not need actual data. I will
> > patch handler methods like ha_index_read, ha_index_idx_read,
> > multi_range_read_info_const basically handler methods which are
> > related to index or range search.  In these methods i  need to
> > calculate hash, which I can calculate from key_ptr but key_ptr doe
> > not have actual data(in case of blobs etc).So to get the data for
> > hash, I will make a field clone of (a,b,c etc) but there ptr will
> > point in key_ptr. Then field->val_str() method will work simply and i
> > can calculate hash. And also I can compare returned  result with
> > actual key in handler method itself.
> > What do you think of this approach ?
>
> Looks simpler, agree. The length of the keypart should not matter,
> because it should never be used. May be it would be good to set it to -1
> as it might help to catch errors (where it is erroneously used).

I think it should , because we make buffer size according to key_ptr->length.
For example  , this code at test_quick_select

(param.min_key= (uchar*)alloc_root(&alloc,max_key_len)

here max_key_length is sum of lengths of all key_part->store_length
and also in function get_mm_leaf we use

  field->get_key_image(str+maybe_null, key_part->length,
                      key_part->image_type);

So I think length will matter.

>
> I didn't understand why you need to clone fields though :(

Ohh , this is actually just for reading data from key_ptr, because
key_ptr does not have direct
data  , so i thought that i have to make a clone of field and then set
its ptr. But I guess this
function would work.

inline void move_field(uchar *ptr_arg,uchar *null_ptr_arg,uchar null_bit_arg)
{
    ptr=ptr_arg; null_ptr=null_ptr_arg; null_bit=null_bit_arg;
  }

>
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
Regards
sachin

Follow ups

Re: Sachin weekly report
From: Sergei Golubchik, 2016-08-24

References

Re: Sachin weekly report
From: Sachin Setia, 2016-06-26
Re: Sachin weekly report
From: Sachin Setia, 2016-07-05
Re: Sachin weekly report
From: Sergei Golubchik, 2016-07-11
Re: Sachin weekly report
From: sachin setiya, 2016-08-13
Re: Sachin weekly report
From: Sergei Golubchik, 2016-08-14
Re: Sachin weekly report
From: Sachin Setia, 2016-08-18
Re: Sachin weekly report
From: Sergei Golubchik, 2016-08-19
Re: Sachin weekly report
From: Sachin Setia, 2016-08-22
Re: Sachin weekly report
From: Sergei Golubchik, 2016-08-23