maria-developers team mailing list archive

Re: GSoC weekly reports (Unique indexes for blobs)

To: Jan Lindström <jan.lindstrom@xxxxxxxxxxx>, maria-developers@xxxxxxxxxxxxxxxxxxx, Sergei Golubchik <serg@xxxxxxxxxxx>
From: Shubham Barai <shubhambaraiss@xxxxxxxxx>
Date: Thu, 4 Aug 2016 00:14:38 +0530
In-reply-to: <CALxAEPsmqbaGinSELC4aw3uhi_zNK_7yvsWD=BfCzrwSHxK6oQ@mail.gmail.com>

GSoC (week 10)

1. Solved the problem that was causing a crash in the optimized version.

2. Modified some functions for renaming a column using alter table.

3. At first, the server crashed for queries like "select column_name from
table_name" if the column was present in the hash index. The problem was
solved by modifying ha_innobase::index_flags().

4. For insert queries with "on duplicate key update col_name.....", if
there is a duplicate key, write_record() always calls
ha_index_read_idx_map() to get the row which needs to be updated. As the
hash index doesn't store actual key values, it is not possible to get the
required row by scanning the hash index. So I modified write_record() to use
a sequential scan if the duplicate key error is in a hash index.
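
The idea in item 4 can be sketched in Python as a toy model. This is not the
server code: the class HashUniqueTable, its methods, and the column names
"blob_col" and "val" are all hypothetical, and CRC32 only stands in for
whatever hash function the real index uses. The index keeps the hash value
and a row id, never the key itself, so the row to update is located by
comparing actual column values in a sequential scan.

```python
import zlib


class HashUniqueTable:
    """Toy table whose unique constraint is backed by a hash-only index.

    Hypothetical illustration: the index maps hash value -> row ids and
    never stores the key bytes themselves.
    """

    def __init__(self):
        self.rows = []        # row storage; list position acts as row id
        self.hash_index = {}  # hash value -> list of row ids

    @staticmethod
    def _hash(key: bytes) -> int:
        # Stand-in hash function (assumption, not the real one).
        return zlib.crc32(key)

    def _is_duplicate(self, key: bytes) -> bool:
        # A matching hash is only a candidate: different keys can collide,
        # so the actual column values must be compared.
        candidates = self.hash_index.get(self._hash(key), [])
        return any(self.rows[r]["blob_col"] == key for r in candidates)

    def _find_by_scan(self, key: bytes):
        # Sequential scan: the key cannot be looked up in the hash index
        # because the index holds no key values.
        for row_id, row in enumerate(self.rows):
            if row["blob_col"] == key:
                return row_id
        return None

    def insert_on_duplicate_update(self, key: bytes, value: int) -> str:
        if self._is_duplicate(key):
            # Duplicate key in the hash index: fall back to a scan to
            # fetch the row that needs to be updated.
            row_id = self._find_by_scan(key)
            self.rows[row_id]["val"] = value
            return "updated"
        self.rows.append({"blob_col": key, "val": value})
        row_id = len(self.rows) - 1
        self.hash_index.setdefault(self._hash(key), []).append(row_id)
        return "inserted"


table = HashUniqueTable()
big_value = b"x" * 100_000          # a large blob-like key
table.insert_on_duplicate_update(big_value, 1)
table.insert_on_duplicate_update(big_value, 2)  # same key: row is updated
```

The scan costs time proportional to the table size, which is the price paid
in this sketch for not storing key values in the index.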


Regards,
Shubham

On 28 July 2016 at 00:26, Shubham Barai <shubhambaraiss@xxxxxxxxx> wrote:

> GSoC (week 9)
>
> InnoDB
>
> 1. After modifying row_merge_tuple_cmp to compare actual fields for the
> hash index, adding a unique index through alter table was not working.
> This was because tuples were inserted in the order of the user-defined
> fields, but for the hash index they should have been inserted in the order
> of the hash field. I solved the problem and it is working now.
>
> 2. Implemented comparison of externally stored columns. Now unique
> constraints work for blobs with really large values.
>
> 3. Fixed an issue with the InnoDB index translation table.
>
> 4. Modified some functions in row0row.cc and row0merge.cc to compare
> externally stored columns for the hash index.
>
> So far, insert, update and delete work for almost all cases. For alter
> table, renaming a column remains to be done.
>
>
> Regards,
> Shubham
>
> On 21 July 2016 at 00:00, Shubham Barai <shubhambaraiss@xxxxxxxxx> wrote:
>
>> GSoC (week 8)
>>
>> MyISAM
>>
>> 1. Finally solved a problem that was causing the server to crash in an
>> update query. Now the test results for the update operation look promising.
>>
>> 2. Changed the order of keys in MyISAM. Now the order of keys will be the
>> same in the SQL layer and MyISAM.
>>
>> 3. Modified check_definition.
>>
>> 4. Added tests for alter table and the update operation.
>>
>> Current branch for MyISAM: Myisam
>>
>> InnoDB
>>
>> 1. Fixed a bug in the update operation.
>>
>> 2. Added tests for alter table.
>>
>> Current branch for InnoDB: temp
>>
>> On 13 July 2016 at 13:47, Shubham Barai <shubhambaraiss@xxxxxxxxx> wrote:
>>
>>> GSoC (week 7)
>>>
>>> Hello everyone,
>>>
>>> After implementing actual row comparison, there was a problem with
>>> retrieving multiple clustered records with the same hash value. The
>>> problem was solved by putting the search function for the clustered
>>> record in the mini-transaction. The next thing I implemented this week
>>> is the alter table operation. Most alter table operations are working
>>> fine except renaming a column. I modified the function which renames the
>>> column, and the names of the fields in an index containing that column,
>>> in the dictionary cache. It works fine while the server is running, but
>>> after restarting the server, an error is generated that the table
>>> doesn't exist in the storage engine. After debugging I found out that
>>> the changes are not getting written to disk for the hash index. This
>>> might be because I have shifted the index->fields pointer.
>>>     Another problem I was trying to solve was the duplicate-key error
>>> message shown when a unique index is added through an alter table
>>> operation and there are already duplicate key entries in the table. The
>>> error message contained null values at first, but I have solved the
>>> problem now and it displays the correct error message.
>>>
>>> The following alter table operations are working:
>>> 1. alter table add column.
>>> 2. alter table drop column (if the column was present in the hash index,
>>> the hash value is recalculated).
>>> 3. alter table add index / drop index (the table is rebuilt if the hash
>>> index is dropped, as the hash column in the table also has to be dropped).
>>> 4. alter ignore table add index.
>>>
>>> On 5 July 2016 at 00:33, Shubham Barai <shubhambaraiss@xxxxxxxxx> wrote:
>>>
>>>> GSoC (week 6)
>>>>
>>>> Hello everyone,
>>>>
>>>> 1. Defined some new functions to get the clustered record from a
>>>> secondary index record and extract its key value to compare with the
>>>> secondary index key. It works for a single clustered record. Currently
>>>> trying to solve the problem with multiple records with the same hash
>>>> value.
>>>>
>>>> 2. Implemented some new functions for the update operation.
>>>>
>>>>     2.1 a function which checks if hash columns in a table need to be
>>>> updated.
>>>>     2.2 a function to add hash fields to the update vector.
>>>>
>>>> 3. When updating a row, the SQL layer calls the index_read function for
>>>> faster retrieval of a row if the column used in the where clause is one
>>>> of the keys or a part of a key. So I modified the index_read function to
>>>> convert the MySQL search key to InnoDB format and then create a new
>>>> search key with a hash value. As the hash index stores only the hash
>>>> value, it is only possible to search for a row with a hash index if all
>>>> of the key parts are present in the search key.
>>>>
>>>> Current branch for InnoDB: temp
>>>>
>>>> On 27 June 2016 at 19:23, Shubham Barai <shubhambaraiss@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> GSoC (week 5)
>>>>>
>>>>>
>>>>> Hello everyone,
>>>>>
>>>>> Here is the list of things I have done in the 5th week of GSoC.
>>>>>
>>>>> 1. Implemented reporting of a unique key violation on a hash collision
>>>>> (actual row comparison is remaining).
>>>>>
>>>>> 2. Modified the hash function for data types like varchar and binary
>>>>> data types.
>>>>>
>>>>> 3. Fixed a bug which was causing the server to crash for complex unique
>>>>> keys.
>>>>>
>>>>> 4. Added support for allowing any number of nulls, which will not cause
>>>>> any unique key violation.
>>>>>
>>>>> 5. Added test cases for the above features.
>>>>>
>>>>> On 22 June 2016 at 16:36, Shubham Barai <shubhambaraiss@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Can we discuss on IRC first?
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Shubham
>>>>>>
>>>>>> On 22 June 2016 at 13:21, Jan Lindström <jan.lindstrom@xxxxxxxxxxx>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please commit and push these changes to your git branch; I have not
>>>>>>> yet seen them. In my opinion, as this is only a working branch, you
>>>>>>> can push often. I still fail to see any test cases on the InnoDB
>>>>>>> branch. Do you have more than one branch, and if so, why? Depending on
>>>>>>> the extent of these changes, my estimate is that you are behind
>>>>>>> schedule to complete the project in time. Based on your progress
>>>>>>> report, you are still missing update and delete and redo-logging. For
>>>>>>> alter table you should start by forcing the copy method and then, if
>>>>>>> time permits, develop the on-line method. This, naturally, only after
>>>>>>> everything else has been completed and tested.
>>>>>>>
>>>>>>> R: Jan
>>>>>>>
>>>>>>> On Tue, Jun 21, 2016 at 11:06 PM, Shubham Barai <
>>>>>>> shubhambaraiss@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> GSoC (week 4)
>>>>>>>>
>>>>>>>> Hello everyone,
>>>>>>>>
>>>>>>>> After working on the create table operation, the next thing I had to
>>>>>>>> work on was insert operations. So I explored some functions like
>>>>>>>> row_ins_scan_index_for_duplicates and btr_pcur_get_rec to get a clear
>>>>>>>> understanding of how to implement duplicate search on the hash index.
>>>>>>>> There was a problem in the hash function that I wrote. It would
>>>>>>>> calculate the same hash value for two different keys if the prefix
>>>>>>>> length of the blob key part was zero. Now it seems to be working after
>>>>>>>> I checked it in the debugger. I still have to modify it for data types
>>>>>>>> like varchar etc.
>>>>>>>> I have added test cases for insert operations in MyISAM.
>>>>>>>> In MyISAM, I found one problem in the update operation. When updating
>>>>>>>> a row, if the key conflicts then the server crashes because some
>>>>>>>> pointer becomes invalid in compare_record. I haven't fixed this issue
>>>>>>>> yet.
>>>>>>>>
>>>>>>>> I also modified some functions in dict0load.cc to adjust some
>>>>>>>> members of dict_index_t for the new index type. The main problem is
>>>>>>>> that an index entry for a hash-based index contains only two fields
>>>>>>>> (hash value and row id), while dict_index_t contains the hash field
>>>>>>>> and the other user-defined fields which are used to calculate the hash
>>>>>>>> value. Some operations like alter table (e.g. rename column) need
>>>>>>>> access to all fields, while other functions like rec_get_offsets and
>>>>>>>> row_build_index_entry_low need access to only the hash field and row
>>>>>>>> id. I am still working on this to find an efficient solution to this
>>>>>>>> problem.
>>>>>>>>
>>>>>>>> On 16 June 2016 at 23:29, Sergei Golubchik <vuvova@xxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi, Shubham!
>>>>>>>>>
>>>>>>>>> What I wanted to say on IRC was:
>>>>>>>>>
>>>>>>>>> here's what the comment of cmp_dtuple_rec_with_match_low() says:
>>>>>>>>>
>>>>>>>>>   ...............   If rec has an externally stored field we do not
>>>>>>>>>   compare it but return with value 0 if such a comparison should be
>>>>>>>>>   made.
>>>>>>>>>
>>>>>>>>> Note that blobs are externally stored fields in InnoDB, so, I think,
>>>>>>>>> this means that you cannot use cmp_dtuple_rec() to compare blobs.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sergei
>>>>>>>>> Chief Architect MariaDB
>>>>>>>>> and security@xxxxxxxxxxx
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: https://launchpad.net/~maria-developers
>>>>>>>> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
>>>>>>>> Unsubscribe : https://launchpad.net/~maria-developers
>>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

References

GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-05-30
Re: GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-06-21
Re: GSoC weekly reports (Unique indexes for blobs)
From: Jan Lindström, 2016-06-22
Re: GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-06-27
Re: GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-07-04
Re: GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-07-13
Re: GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-07-20
Re: GSoC weekly reports (Unique indexes for blobs)
From: Shubham Barai, 2016-07-27