maria-developers team mailing list archive
Message #09510
Re: InnoDB blob for primary key
Hello Sergei!

I am sorry, I pressed Ctrl+Enter, so the first message went out incomplete.

Actually, my first thought was to store the hash in a hash table instead of a
B+-tree. But then I realized that we may want range queries on blob fields,
and with a hash table a range query would take O(n) time. While reading the
MySQL manual I came across the adaptive hash index: it monitors queries, and
if it finds many queries using IN or equality comparisons, it builds a hash
index on top of the B+-tree index. I want to implement the same thing in
MyISAM, but I am not sure it is a good idea. What do you think?

Currently I am reading about virtual columns; if I make some progress I will
let you know.

Regards,
Sachin
On Tue, Apr 19, 2016 at 7:10 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
wrote:
> Hello Sergei!
> By adaptive hashing I mean something like the InnoDB adaptive hash index:
> http://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash.html
>
>
> On Tue, Apr 19, 2016 at 6:55 PM, sachin setiya <sachinsetia1001@xxxxxxxxx>
> wrote:
>
>> Hello
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Mon, 11 Apr 2016 15:03:24 +0530
>> Subject: InnoDB blob for primary key
>> Hello Developers,
>> This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>> There are three main problems:
>> 1. Unique blob
>> 2. Blob foreign key
>> 3. Blob primary key
>>
>> 1. For a unique blob we can simply store the hash in a non-clustered index.
>> 2. Blob foreign key: I am currently working on it.
>> 3. Blob primary key: for this I thought we could create a 4-byte column
>> which stores the hash of the blob primary key. Internally this column
>> would act as the primary key and as the key of the clustered index. I have
>> already tested this successfully; here is my output:
>>
>> MariaDB [sachin]> create table t4 (abc blob primary key);
>> Query OK, 0 rows affected (0.10 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> Query OK, 1 row affected (0.01 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> Query OK, 1 row affected (0.00 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>
>> MariaDB [sachin]> desc t4;
>> +-------+------+------+-----+---------+-------+
>> | Field | Type | Null | Key | Default | Extra |
>> +-------+------+------+-----+---------+-------+
>> | abc | blob | NO | PRI | NULL | |
>> +-------+------+------+-----+---------+-------+
>> 1 row in set (0.01 sec)
>>
>> MariaDB [sachin]> select * from t4;
>> +---------------+
>> | abc |
>> +---------------+
>> | sachin |
>> | sachin setiya |
>> +---------------+
>> 2 rows in set (0.01 sec)
>>
>> @Sergei: I have already invested around two months in MariaDB, so it no
>> longer matters to me whether I am selected for GSoC; I want to work on
>> InnoDB blob uniques starting today. Please allow me to do this.
>>
>> I am including the patch file and the t4.ibd and t4.frm files.
>> Regards,
>> sachin
>>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: "Jan Lindström" <jan.lindstrom@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: "maria-developers@xxxxxxxxxxxxxxxxxxx" <
>> maria-developers@xxxxxxxxxxxxxxxxxxx>
>> Date: Tue, 12 Apr 2016 14:22:29 +0300
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi,
>>
>> Sachin, naturally you may continue; this is open source. Please read
>> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
>> From the InnoDB point of view there is an issue if you add a new
>> system-generated column to the row. Firstly, existing tables will not have
>> that column, and secondly it will affect e.g. the maximum row size and
>> external tools like backup.
>>
>> R: Jan
>>
>> On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx
>> > wrote:
>>
>>> Hello Developers,
>>> This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>>> There are three main problems:
>>> 1. Unique blob
>>> 2. Blob foreign key
>>> 3. Blob primary key
>>>
>>> 1. For a unique blob we can simply store the hash in a non-clustered index.
>>> 2. Blob foreign key: I am currently working on it.
>>> 3. Blob primary key: for this I thought we could create a 4-byte column
>>> which stores the hash of the blob primary key. Internally this column
>>> would act as the primary key and as the key of the clustered index. I
>>> have already tested this successfully; here is my output:
>>>
>>> MariaDB [sachin]> create table t4 (abc blob primary key);
>>> Query OK, 0 rows affected (0.10 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin');
>>> Query OK, 1 row affected (0.01 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin');
>>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>>> Query OK, 1 row affected (0.00 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>>
>>> MariaDB [sachin]> desc t4;
>>> +-------+------+------+-----+---------+-------+
>>> | Field | Type | Null | Key | Default | Extra |
>>> +-------+------+------+-----+---------+-------+
>>> | abc | blob | NO | PRI | NULL | |
>>> +-------+------+------+-----+---------+-------+
>>> 1 row in set (0.01 sec)
>>>
>>> MariaDB [sachin]> select * from t4;
>>> +---------------+
>>> | abc |
>>> +---------------+
>>> | sachin |
>>> | sachin setiya |
>>> +---------------+
>>> 2 rows in set (0.01 sec)
>>>
>>> @Sergei: I have already invested around two months in MariaDB, so it no
>>> longer matters to me whether I am selected for GSoC; I want to work on
>>> InnoDB blob uniques starting today. Please allow me to do this.
>>>
>>> I am including the patch file and the t4.ibd and t4.frm files.
>>> Regards,
>>> sachin
>>>
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~maria-developers
>>> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~maria-developers
>>> More help : https://help.launchpad.net/ListHelp
>>>
>>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <vuvova@xxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc:
>> Date: Wed, 13 Apr 2016 00:04:22 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> Just a very quick comment now. We're having a company meeting
>> (and a meetup -
>> http://www.meetup.com/MariaDB-Developers-Berlin-Meetup/events/230026151/)
>> in Berlin, I'll send you a longer reply when this is all over (in a
>> couple of days).
>>
>> It might be an interesting option not to hack InnoDB, but to add
>> this feature (long UNIQUE keys) on the upper level. That is, in sql/
>> directory. Automatically add a new BIGINT column, but hide it from
>> the user (not show it in SHOW COLUMNS and in SELECT *). Create
>> an index for it (also hidden). Put the blob's hash into this column.
>> And on collisions retrieve both rows and compare blobs. Just like MyISAM
>> is doing, but not on the storage engine level, but in the
>> engine-independent fashion, something that works for all engines.
>>
>> Want to look at this approach?
>>
>> MySQL 5.7 has something similar (long unique constraints on the SQL
>> layer), but they've done it only for internal temporary tables (just as
>> MyISAM's MI_UNIQUE is only used for internal temporary tables), and it's
>> not directly available to the user. Still, it might give you some hints
>> about how to implement this feature on the SQL layer.
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>> On Apr 11, Sachin Setia wrote:
>> > Hello Developers,
>> > This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>> > There are three main problems:
>> > 1. Unique blob
>> > 2. Blob foreign key
>> > 3. Blob primary key
>> >
>> > 1. For a unique blob we can simply store the hash in a non-clustered index.
>> > 2. Blob foreign key: I am currently working on it.
>> > 3. Blob primary key: for this I thought we could create a 4-byte column
>> > which stores the hash of the blob primary key. Internally this column
>> > would act as the primary key and as the key of the clustered index. I
>> > have already tested this successfully; here is my output:
>> >
>> > MariaDB [sachin]> create table t4 (abc blob primary key);
>> > Query OK, 0 rows affected (0.10 sec)
>> >
>> > MariaDB [sachin]> insert into t4 values('sachin');
>> > Query OK, 1 row affected (0.01 sec)
>> >
>> > MariaDB [sachin]> insert into t4 values('sachin');
>> > ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> > MariaDB [sachin]> insert into t4 values('sachin setiya');
>> > Query OK, 1 row affected (0.00 sec)
>> >
>> > MariaDB [sachin]> insert into t4 values('sachin setiya');
>> > ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>> >
>> > MariaDB [sachin]> desc t4;
>> > +-------+------+------+-----+---------+-------+
>> > | Field | Type | Null | Key | Default | Extra |
>> > +-------+------+------+-----+---------+-------+
>> > | abc | blob | NO | PRI | NULL | |
>> > +-------+------+------+-----+---------+-------+
>> > 1 row in set (0.01 sec)
>> >
>> > MariaDB [sachin]> select * from t4;
>> > +---------------+
>> > | abc |
>> > +---------------+
>> > | sachin |
>> > | sachin setiya |
>> > +---------------+
>> > 2 rows in set (0.01 sec)
>> >
>> > @Sergei: I have already invested around two months in MariaDB, so it no
>> > longer matters to me whether I am selected for GSoC; I want to work on
>> > InnoDB blob uniques starting today. Please allow me to do this.
>> >
>> > I am including the patch file and the t4.ibd and t4.frm files.
>> > Regards,
>> > sachin
>>
>> > diff --git a/sql/sql_table.cc b/sql/sql_table.cc
>> > index dfce503..efd6f22 100644
>> > --- a/sql/sql_table.cc
>> > +++ b/sql/sql_table.cc
>> > @@ -3244,7 +3244,7 @@ mysql_prepare_create_table(THD *thd,
>> HA_CREATE_INFO *create_info,
>> > !(sql_field->charset= find_bin_collation(sql_field->charset)))
>> > DBUG_RETURN(TRUE);
>> >
>> > - /*
>> > + /*
>> > Convert the default value from client character
>> > set into the column character set if necessary.
>> > */
>> > @@ -3874,11 +3874,11 @@ mysql_prepare_create_table(THD *thd,
>> HA_CREATE_INFO *create_info,
>> > if (f_is_geom(sql_field->pack_flag) && sql_field->geom_type
>> ==
>> > Field::GEOM_POINT)
>> > column->length= MAX_LEN_GEOM_POINT_FIELD;
>> > - if (!column->length)
>> > - {
>> > - my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0),
>> column->field_name.str);
>> > - DBUG_RETURN(TRUE);
>> > - }
>> > +// if (!column->length) //change
>> > +// {
>> > +// my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0),
>> column->field_name.str);
>> > +// DBUG_RETURN(TRUE);
>> > +// }
>> > }
>> > #ifdef HAVE_SPATIAL
>> > if (key->type == Key::SPATIAL)
>> > @@ -3992,9 +3992,9 @@ mysql_prepare_create_table(THD *thd,
>> HA_CREATE_INFO *create_info,
>> > }
>> > else if (key_part_length == 0 && (sql_field->flags &
>> NOT_NULL_FLAG))
>> > {
>> > - my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
>> > - column->field_name.str);
>> > - DBUG_RETURN(TRUE);
>> > +// my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
>> > +// column->field_name.str);
>> > +// DBUG_RETURN(TRUE);
>> > }
>> > if (key_part_length > file->max_key_part_length() &&
>> > key->type != Key::FULLTEXT)
>> > diff --git a/storage/innobase/dict/dict0dict.cc
>> b/storage/innobase/dict/dict0dict.cc
>> > index c51deb0..f6e94af 100644
>> > --- a/storage/innobase/dict/dict0dict.cc
>> > +++ b/storage/innobase/dict/dict0dict.cc
>> > @@ -1315,7 +1315,7 @@ dict_table_add_to_cache(
>> > #define BIG_ROW_SIZE 1024
>> >
>> > ut_ad(mutex_own(&(dict_sys->mutex)));
>> > -
>> > +
>> > dict_table_add_system_columns(table, heap);
>> >
>> > table->cached = TRUE;
>> > diff --git a/storage/xtradb/dict/dict0dict.cc
>> b/storage/xtradb/dict/dict0dict.cc
>> > index 206038d..d8223d7 100644
>> > --- a/storage/xtradb/dict/dict0dict.cc
>> > +++ b/storage/xtradb/dict/dict0dict.cc
>> > @@ -1251,7 +1251,7 @@ dict_table_open_on_name(
>> > Adds system columns to a table object. */
>> > UNIV_INTERN
>> > void
>> > -dict_table_add_system_columns(
>> > +dict_table_add_system_columns( //work
>> > /*==========================*/
>> > dict_table_t* table, /*!< in/out: table */
>> > mem_heap_t* heap) /*!< in: temporary heap */
>> > @@ -1266,16 +1266,25 @@ dict_table_add_system_columns(
>> > etc.) and as the last columns of the table memory object.
>> > The clustered index will not always physically contain all
>> > system columns. */
>> > -
>> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
>> > +// ,DATA_NOT_NULL,DATA_BLOB_HASH_LEN);
>> > dict_mem_table_add_col(table, heap, "DB_ROW_ID", DATA_SYS,
>> > DATA_ROW_ID | DATA_NOT_NULL,
>> > DATA_ROW_ID_LEN);
>> > #if DATA_ROW_ID != 0
>> > #error "DATA_ROW_ID != 0"
>> > #endif
>> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_SYS,
>> > +//
>> DATA_BLOB_HASH|DATA_NOT_NULL,
>> > +//
>> DATA_BLOB_HASH_LEN);
>> > + //just a rough trick to get it working
>> > +
>> > +// if(*table->name=='y'){
>> > dict_mem_table_add_col(table, heap, "DB_TRX_ID", DATA_SYS,
>> > DATA_TRX_ID | DATA_NOT_NULL,
>> > DATA_TRX_ID_LEN);
>> > +// }
>> > +
>> > #if DATA_TRX_ID != 1
>> > #error "DATA_TRX_ID != 1"
>> > #endif
>> > @@ -1310,7 +1319,11 @@ dict_table_add_to_cache(
>> > ulint row_len;
>> >
>> > ut_ad(dict_lru_validate());
>> > -
>> > +// bool x =false;//break
>> > +// if(x){
>> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
>> > +//
>> ,DATA_NOT_NULL,4);
>> > +// }
>> > /* The lower limit for what we consider a "big" row */
>> > #define BIG_ROW_SIZE 1024
>> >
>> > @@ -3075,7 +3088,7 @@ dict_index_build_internal_clust(
>> > /* Copy the fields of index */
>> > dict_index_copy(new_index, index, table, 0, index->n_fields);
>> >
>> > - if (dict_index_is_univ(index)) {
>> > + if (dict_index_is_univ(index)) { //work
>> > /* No fixed number of fields determines an entry uniquely
>> */
>> >
>> > new_index->n_uniq = REC_MAX_N_FIELDS;
>> > @@ -3124,7 +3137,7 @@ dict_index_build_internal_clust(
>> > DATA_ROLL_PTR),
>> > 0);
>> >
>> > - for (i = 0; i < trx_id_pos; i++) {
>> > + for (i = 0; i < trx_id_pos; i++) {//work i think i need
>> to do some stuff
>> >
>> > ulint fixed_size = dict_col_get_fixed_size(
>> > dict_index_get_nth_col(new_index, i),
>> > diff --git a/storage/xtradb/dict/dict0load.cc
>> b/storage/xtradb/dict/dict0load.cc
>> > index d6ed8ac..8cc59f8 100644
>> > --- a/storage/xtradb/dict/dict0load.cc
>> > +++ b/storage/xtradb/dict/dict0load.cc
>> > @@ -2276,7 +2276,7 @@ dictionary cache.
>> > ibd_file_missing flag TRUE in the table object we return */
>> > UNIV_INTERN
>> > dict_table_t*
>> > -dict_load_table(
>> > +dict_load_table( //work for corruped table
>> > /*============*/
>> > const char* name, /*!< in: table name in the
>> > databasename/tablename format */
>> > @@ -2337,7 +2337,7 @@ dict_load_table(
>> > btr_pcur_close(&pcur);
>> > mtr_commit(&mtr);
>> > mem_heap_free(heap);
>> > -
>> > +
>> > return(NULL);
>> > }
>> >
>> > diff --git a/storage/xtradb/dict/dict0mem.cc
>> b/storage/xtradb/dict/dict0mem.cc
>> > index a4f6cd6..6cbe556 100644
>> > --- a/storage/xtradb/dict/dict0mem.cc
>> > +++ b/storage/xtradb/dict/dict0mem.cc
>> > @@ -101,11 +101,11 @@ dict_mem_table_create(
>> > memcpy(table->name, name, strlen(name) + 1);
>> > table->is_system_db = dict_mem_table_is_system(table->name);
>> > table->space = (unsigned int) space;
>> > - table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS);
>> > -
>> > + table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS); //work
>> > +
>> > table->cols = static_cast<dict_col_t*>(
>> > mem_heap_alloc(heap,
>> > - (n_cols + DATA_N_SYS_COLS)
>> > + (n_cols + DATA_N_SYS_COLS)//work
>> > * sizeof(dict_col_t)));
>> >
>> > ut_d(table->magic_n = DICT_TABLE_MAGIC_N);
>> > diff --git a/storage/xtradb/handler/ha_innodb.cc
>> b/storage/xtradb/handler/ha_innodb.cc
>> > index e5edf76..3a8dc91 100644
>> > --- a/storage/xtradb/handler/ha_innodb.cc
>> > +++ b/storage/xtradb/handler/ha_innodb.cc
>> > @@ -5983,7 +5983,7 @@ ha_innobase::open(
>> > ibool par_case_name_set = FALSE;
>> > char par_case_name[FN_REFLEN];
>> > dict_err_ignore_t ignore_err = DICT_ERR_IGNORE_NONE;
>> > -
>> > + ibool is_blob_primary_key=false;
>> > DBUG_ENTER("ha_innobase::open");
>> >
>> > UT_NOT_USED(mode);
>> > @@ -6031,13 +6031,20 @@ ha_innobase::open(
>> >
>> > /* Get pointer to a table object in InnoDB dictionary cache */
>> > ib_table = dict_table_open_on_name(norm_name, FALSE, TRUE,
>> ignore_err);
>> > -
>> > + //int number_of_columns = dict_table_get_n_user_cols(ib_table);
>> > + //if(ib_table->)
>> > + if(ib_table){
>> > + int number_of_columns =
>> dict_table_get_n_user_cols(ib_table);
>> > +
>> if(!innobase_strcasecmp(ib_table->col_names,"DB_BLOB_HASH")){
>> > + is_blob_primary_key=true;
>> > + number_of_columns--;//stodo i think we need to
>> add flag for blob primary key to make checking easier
>> > + }
>> > if (ib_table
>> > - && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID)
>> > - && table->s->stored_fields !=
>> dict_table_get_n_user_cols(ib_table))
>> > + && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID))
>> > + && table->s->stored_fields != number_of_columns) //work
>> > || (DICT_TF2_FLAG_IS_SET(ib_table,
>> DICT_TF2_FTS_HAS_DOC_ID)
>> > && (table->s->fields
>> > - != dict_table_get_n_user_cols(ib_table) - 1)))) {
>> > + != dict_table_get_n_user_cols(ib_table) - 1))) {
>> > ib_logf(IB_LOG_LEVEL_WARN,
>> > "table %s contains %lu user defined columns "
>> > "in InnoDB, but %lu columns in MySQL. Please "
>> > @@ -6054,7 +6061,7 @@ ha_innobase::open(
>> > ib_table = NULL;
>> > is_part = NULL;
>> > }
>> > -
>> > + }
>> > if (UNIV_UNLIKELY(ib_table && ib_table->is_corrupt &&
>> > srv_pass_corrupt_table <= 1)) {
>> > free_share(share);
>> > @@ -6254,12 +6261,12 @@ ha_innobase::open(
>> > /* Looks like MySQL-3.23 sometimes has primary key number != 0 */
>> > primary_key = table->s->primary_key;
>> > key_used_on_scan = primary_key;
>> > -
>> > + if(!is_blob_primary_key){
>> > if (!innobase_build_index_translation(table, ib_table, share)) {
>> > sql_print_error("Build InnoDB index translation table
>> for"
>> > " Table %s failed", name);
>> > }
>> > -
>> > + }
>> > /* Allocate a buffer for a 'row reference'. A row reference is
>> > a string of bytes of length ref_length which uniquely specifies
>> > a row in our table. Note that MySQL may also compare two row
>> > @@ -6314,7 +6321,11 @@ ha_innobase::open(
>> > for (uint i = 0; i < table->s->keys; i++) {
>> > dict_index_t* index;
>> > index = innobase_get_index(i);
>> > - if (dict_index_is_clust(index)) {
>> > + if (dict_index_is_clust(index)) { //work
>> > + if(is_blob_primary_key){
>> > + ref_length=4; //hash length
>> > + continue;
>> > + }
>> > ref_length =
>> >
>> table->key_info[i].key_length;
>> > }
>> > @@ -7795,7 +7806,8 @@ build_template_field(
>> >
>> > //ut_ad(field == table->field[i]);
>> > ut_ad(clust_index->table == index->table);
>> > -
>> > +//work here we go
>> > +//todo it should get the next column defs
>> > col = dict_table_get_nth_col(index->table, i);
>> >
>> > templ = prebuilt->mysql_template + prebuilt->n_template++;
>> > @@ -8138,7 +8150,8 @@ ha_innobase::build_template(
>> > continue;
>> > }
>> > }
>> > -
>> > +
>> if(!innobase_strcasecmp(clust_index->fields->name,"DB_BLOB_HASH"))
>> > + i++;
>> > build_template_field(prebuilt, clust_index, index,
>> > table, field, i);
>> > }
>> > @@ -11065,6 +11078,7 @@ create_table_def(
>> > ulint doc_id_col = 0;
>> > ibool has_doc_id_col = FALSE;
>> > mem_heap_t* heap;
>> > + bool is_blob_primary_key=false;
>> >
>> > DBUG_ENTER("create_table_def");
>> > DBUG_PRINT("enter", ("table_name: %s", table_name));
>> > @@ -11126,6 +11140,12 @@ create_table_def(
>> > table->fts->doc_col = doc_id_col;
>> > }
>> > } else {
>> > + if(form->key_info[0].key_part->length==0 && //change
>> > + form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
>> > + s_cols++;
>> > + is_blob_primary_key=true;
>> > + }
>> > +
>> > table = dict_mem_table_create(table_name, 0, s_cols,
>> > flags, flags2);
>> > }
>> > @@ -11143,6 +11163,13 @@ create_table_def(
>> > table->data_dir_path = NULL;
>> > }
>> > heap = mem_heap_create(1000);
>> > + //work
>> > + //add one more column for hash
>> > +
>> > + if(is_blob_primary_key){
>> > + dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",
>> > + DATA_INT,1283,4);
>> > + }
>> >
>> > for (i = 0; i < n_cols; i++) {
>> > Field* field = form->field[i];
>> > @@ -11222,7 +11249,7 @@ create_table_def(
>> > err = DB_ERROR;
>> > goto error_ret;
>> > }
>> > -
>> > +
>> > dict_mem_table_add_col(table, heap,
>> > field->field_name,
>> > col_type,
>> > @@ -11347,7 +11374,7 @@ create_index(
>> > the length of the key part versus the column. */
>> >
>> > Field* field = NULL;
>> > -
>> > +//work
>> > for (ulint j = 0; j < form->s->fields; j++) {
>> >
>> > field = form->field[j];
>> > @@ -11396,7 +11423,12 @@ create_index(
>> > }
>> >
>> > field_lengths[i] = key_part->length;
>> > -
>> > + if(form->key_info[0].key_part->length==0 && //change
>> > + form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
>> > + dict_mem_index_add_field(
>> > + index,"DB_BLOB_HASH", 0);
>> > + continue;
>> > + }
>> > dict_mem_index_add_field(
>> > index, key_part->field->field_name, prefix_len);
>> > }
>> > diff --git a/storage/xtradb/include/data0type.h
>> b/storage/xtradb/include/data0type.h
>> > index 111664b..0f510e8 100644
>> > --- a/storage/xtradb/include/data0type.h
>> > +++ b/storage/xtradb/include/data0type.h
>> > @@ -147,6 +147,9 @@ be less than 256 */
>> > #define DATA_ROLL_PTR 2 /* rollback data pointer: 7 bytes
>> */
>> > #define DATA_ROLL_PTR_LEN 7
>> >
>> > +//#define DATA_BLOB_HASH 3 // hash coloumn for blob primay key
>> > +//#define DATA_BLOB_HASH_LEN 4 //used as a clustered index
>> > +
>> > #define DATA_N_SYS_COLS 3 /* number of system columns
>> defined above */
>> >
>> > #define DATA_FTS_DOC_ID 3 /* Used as FTS DOC ID column */
>> > diff --git a/storage/xtradb/que/que0que.cc
>> b/storage/xtradb/que/que0que.cc
>> > index e2dc023..076785d 100644
>> > --- a/storage/xtradb/que/que0que.cc
>> > +++ b/storage/xtradb/que/que0que.cc
>> > @@ -1030,7 +1030,7 @@ que_thr_step(
>> > que_node_print_info(node);
>> > }
>> > #endif
>> > - if (type & QUE_NODE_CONTROL_STAT) {
>> > + if (type & QUE_NODE_CONTROL_STAT) { //improve
>> > if ((thr->prev_node != que_node_get_parent(node))
>> > && que_node_get_next(thr->prev_node)) {
>> >
>> > diff --git a/storage/xtradb/row/row0ins.cc
>> b/storage/xtradb/row/row0ins.cc
>> > index d02e179..c58d7fb 100644
>> > --- a/storage/xtradb/row/row0ins.cc
>> > +++ b/storage/xtradb/row/row0ins.cc
>> > @@ -137,7 +137,7 @@ row_ins_alloc_sys_fields(
>> > /*=====================*/
>> > ins_node_t* node) /*!< in: insert node */
>> > {
>> > - dtuple_t* row;
>> > + dtuple_t* row; //work
>> > dict_table_t* table;
>> > mem_heap_t* heap;
>> > const dict_col_t* col;
>> > @@ -3118,7 +3118,7 @@ row_ins_index_entry_set_vals(
>> >
>> > n_fields = dtuple_get_n_fields(entry);
>> >
>> > - for (i = 0; i < n_fields; i++) {
>> > + for (i = 0; i < n_fields; i++) {//see
>> > dict_field_t* ind_field;
>> > dfield_t* field;
>> > const dfield_t* row_field;
>> > @@ -3169,8 +3169,8 @@ row_ins_index_entry_step(
>> >
>> > ut_ad(dtuple_check_typed(node->row));
>> >
>> > - row_ins_index_entry_set_vals(node->index, node->entry, node->row);
>> > -
>> > + row_ins_index_entry_set_vals(node->index, node->entry,
>> node->row);//explore
>> > +//need to see who sets the entry in row
>> > ut_ad(dtuple_check_typed(node->entry));
>> >
>> > err = row_ins_index_entry(node->index, node->entry, thr);
>> > @@ -3208,7 +3208,7 @@ row_ins_alloc_row_id_step(
>> >
>> > /* Fill in row id value to row */
>> >
>> > - row_id = dict_sys_get_new_row_id();
>> > + row_id = dict_sys_get_new_row_id(); //work
>> >
>> > dict_sys_write_row_id(node->row_id_buf, row_id);
>> > }
>> > diff --git a/storage/xtradb/row/row0mysql.cc
>> b/storage/xtradb/row/row0mysql.cc
>> > index 9427b20..b28e778 100644
>> > --- a/storage/xtradb/row/row0mysql.cc
>> > +++ b/storage/xtradb/row/row0mysql.cc
>> > @@ -510,7 +510,15 @@ row_mysql_store_col_in_innobase_format(
>> >
>> > return(buf);
>> > }
>> > -
>> > +//this is just dummy function
>> > +int dummuy_hash(byte * ptr,int length){
>> > + int val=0;
>> > + for(int i=0;i<length;i++){
>> > + val =val*8+(int )*ptr;
>> > + ptr++;
>> > + }
>> > + return val;
>> > +}
>> > /**************************************************************//**
>> > Convert a row in the MySQL format to a row in the Innobase format.
>> Note that
>> > the function to convert a MySQL format key value to an InnoDB dtuple is
>> > @@ -532,15 +540,22 @@ row_mysql_convert_row_to_innobase(
>> > const mysql_row_templ_t*templ;
>> > dfield_t* dfield;
>> > ulint i;
>> > -
>> > + ibool is_blob_primary_key=false;
>> > + ibool is_first_template=true;
>> > ut_ad(prebuilt->template_type == ROW_MYSQL_WHOLE_ROW);
>> > - ut_ad(prebuilt->mysql_template);
>> > -
>> > - for (i = 0; i < prebuilt->n_template; i++) {
>> > + ut_ad(prebuilt->mysql_template);//work
>> > +
>> if(!innobase_strcasecmp(prebuilt->table->col_names,"DB_BLOB_HASH")){
>> > + is_blob_primary_key =true;
>> > + }
>> > + for (i = 0; i < prebuilt->n_template; i++) { //change add one
>> >
>> > templ = prebuilt->mysql_template + i;
>> > + if(is_blob_primary_key&&is_first_template){
>> > + dfield = dtuple_get_nth_field(row, i+1);
>> > + is_first_template=false;
>> > + }else{
>> > dfield = dtuple_get_nth_field(row, i);
>> > -
>> > + }
>> > if (templ->mysql_null_bit_mask != 0) {
>> > /* Column may be SQL NULL */
>> >
>> > @@ -565,6 +580,17 @@ row_mysql_convert_row_to_innobase(
>> > next_column:
>> > ;
>> > }
>> > + if(is_blob_primary_key){
>> > + //get the first field and set the hash
>> > + dfield_t * hash_field=dtuple_get_nth_field(row,0);
>> > + dfield_t * blob_field=dtuple_get_nth_field(row,1);
>> > + byte * hash = static_cast<byte
>> *>(mem_heap_zalloc(prebuilt->heap,4));
>> > + mach_write_to_4(hash,dummuy_hash((byte
>> *)blob_field->data,blob_field->len));
>> > + //4 is hash length
>> > + hash_field->len=4;
>> > + hash_field->data=hash;
>> > + //hash_field->type
>> > + }
>> >
>> > /* If there is a FTS doc id column and it is not user supplied (
>> > generated by server) then assign it a new doc id. */
>> > @@ -831,7 +857,7 @@ row_create_prebuilt(
>> >
>> > prebuilt->search_tuple = dtuple_create(heap,
>> search_tuple_n_fields);
>> >
>> > - ref = dtuple_create(heap, ref_len);
>> > + ref = dtuple_create(heap, ref_len);//work
>> >
>> > dict_index_copy_types(ref, clust_index, ref_len);
>> >
>> > @@ -1057,7 +1083,7 @@ row_get_prebuilt_insert_row(
>> >
>> > dict_table_copy_types(row, table);
>> >
>> > - ins_node_set_new_row(node, row);
>> > + ins_node_set_new_row(node, row);//explore
>> >
>> > prebuilt->ins_graph = static_cast<que_fork_t*>(
>> > que_node_get_parent(
>> > @@ -1353,7 +1379,7 @@ row_insert_for_mysql(
>> > row_get_prebuilt_insert_row(prebuilt);
>> > node = prebuilt->ins_node;
>> >
>> > - row_mysql_convert_row_to_innobase(node->row, prebuilt, mysql_rec);
>> > + row_mysql_convert_row_to_innobase(node->row, prebuilt,
>> mysql_rec);//debug
>> >
>> > savept = trx_savept_take(trx);
>> >
>> > diff --git a/storage/xtradb/row/row0row.cc
>> b/storage/xtradb/row/row0row.cc
>> > index be786f9..a1dd465 100644
>> > --- a/storage/xtradb/row/row0row.cc
>> > +++ b/storage/xtradb/row/row0row.cc
>> > @@ -83,7 +83,7 @@ row_build_index_entry_low(
>> > entry, dict_index_get_n_unique_in_tree(index));
>> > }
>> >
>> > - for (i = 0; i < entry_len; i++) {
>> > + for (i = 0; i < entry_len; i++) { //explore need to see how it
>> works for simple
>> > const dict_field_t* ind_field
>> > = dict_index_get_nth_field(index, i);
>> > const dict_col_t* col
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Wed, 13 Apr 2016 21:17:03 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hello Jan!
>>
>> I went through the problems you raised.
>>
>> 1> Firstly, existing tables will not have that column.
>>
>> Yes, you are right that existing tables would not have the DB_BLOB_HASH
>> column, but this is fine: if they indexed their blob column, they must
>> have provided a blob prefix length, in which case the key_part length is
>> not zero, and the patch can be changed to detect this. That way I can also
>> eliminate another problem: what if an existing table already has a column
>> named 'DB_BLOB_HASH'?
>>
>> 2> Secondly, it will affect e.g. the maximum row size.
>> For this I can increase reclength by 4 in the pack_header function (but I
>> am not sure whether this will work).
>>
>> 3> And external tools like backup.
>> Here I am assuming that by backup you mean mysqldump. I tried this by
>> creating a table with a blob column (with a length given), then switching
>> to the changed MariaDB binary, and the .sql dump file worked fine.
>>
>> But I think there are more problems with this prototype:
>> 1> An existing table with a column named 'DB_BLOB_HASH'. We can solve this
>> as written above, but some code in the patch compares the first column
>> name with 'DB_BLOB_HASH', which will cause problems. Can we add a flag in
>> dict_table_t->flag so that we can reliably check for the blob hash column?
>>
>> 2> This is a very bad problem: what happens if the data is different but
>> the CRC hash of the data is the same? This will occur once we have a
>> medium amount of data, say 100 thousand records. To avoid this we can:
>>
>> 2.1> use CRC-64, which reduces the probability;
>> 2.2> use two different hash algorithms and require both hashes to match
>> for uniqueness, making the collision probability very, very small.
>>
>> But the main problem is what happens when a collision does occur. To
>> handle this I have two ideas:
>> 2.3> like 2.2, but use one algorithm and add a column counting how many
>> times the hash is repeated, so the effective collision frequency becomes
>> zero; I think this will work, but it needs testing;
>> 2.4> the whole prototype is wrong, and I should start again with a
>> different method. Can you please suggest a different way of doing this?
>> Regards,
>> sachin
>>
>> On Tue, Apr 12, 2016 at 4:52 PM, Jan Lindström
>> <jan.lindstrom@xxxxxxxxxxx> wrote:
>> > Hi,
>> >
>> > Sachin, naturally you may continue; this is open source. Please read
>> > https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
>> > From the InnoDB point of view there is an issue if you add a new
>> > system-generated column to the row. Firstly, existing tables will not
>> > have that column, and secondly it will affect e.g. the maximum row size
>> > and external tools like backup.
>> >
>> > R: Jan
>> >
>> > On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <
>> sachinsetia1001@xxxxxxxxx>
>> > wrote:
>> >>
>> >> Hello Developers,
>> >> This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>> >> There are three main problems:
>> >> 1. Unique blob
>> >> 2. Blob foreign key
>> >> 3. Blob primary key
>> >>
>> >> 1. For a unique blob we can simply store the hash in a non-clustered index.
>> >> 2. Blob foreign key: I am currently working on it.
>> >> 3. Blob primary key: for this I thought we could create a 4-byte column
>> >> which stores the hash of the blob primary key. Internally this column
>> >> would act as the primary key and as the key of the clustered index. I
>> >> have already tested this successfully; here is my output:
>> >>
>> >> MariaDB [sachin]> create table t4 (abc blob primary key);
>> >> Query OK, 0 rows affected (0.10 sec)
>> >>
>> >> MariaDB [sachin]> insert into t4 values('sachin');
>> >> Query OK, 1 row affected (0.01 sec)
>> >>
>> >> MariaDB [sachin]> insert into t4 values('sachin');
>> >> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> >> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> >> Query OK, 1 row affected (0.00 sec)
>> >>
>> >> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> >> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>> >>
>> >> MariaDB [sachin]> desc t4;
>> >> +-------+------+------+-----+---------+-------+
>> >> | Field | Type | Null | Key | Default | Extra |
>> >> +-------+------+------+-----+---------+-------+
>> >> | abc | blob | NO | PRI | NULL | |
>> >> +-------+------+------+-----+---------+-------+
>> >> 1 row in set (0.01 sec)
>> >>
>> >> MariaDB [sachin]> select * from t4;
>> >> +---------------+
>> >> | abc |
>> >> +---------------+
>> >> | sachin |
>> >> | sachin setiya |
>> >> +---------------+
>> >> 2 rows in set (0.01 sec)
>> >>
>> >> @Sergei hi! Actually I have invested around 2 months in MariaDB, so
>> >> for me it does not matter now whether I got selected in GSoC; I want
>> >> to work on InnoDB blob uniques starting today. Please, sir, allow me
>> >> to do this.
>> >>
>> >> I am including the patch file and the t4.ibd and t4.frm files.
>> >> Regards
>> >> sachin
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Mailing list: https://launchpad.net/~maria-developers
>> >> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>> >> Unsubscribe : https://launchpad.net/~maria-developers
>> >> More help : https://help.launchpad.net/ListHelp
>> >>
>> >
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Wed, 13 Apr 2016 22:19:44 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hello Sergei,
>> Sorry, I did not see your mail. Actually I was thinking of something
>> like this before implementing the prototype, but the closer I am to
>> InnoDB, the more performance I will get. I will definitely think
>> about it.
>> Regards
>> sachin
>>
>> On Wed, Apr 13, 2016 at 9:17 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> wrote:
>> > Hello Jan!
>> >
>> > Actually I was going through the problems you raised.
>> >
>> > 1> Firstly, existing tables will not have that column
>> >
>> > Yes sir, you are right that existing tables wouldn't have the
>> > DB_BLOB_HASH column, but this is fine: if they indexed their blob
>> > column they must have provided a blob prefix length, so the length of
>> > the key_part will not be zero, and the patch can be changed to detect
>> > this. By doing so I can also eliminate another problem: what if an
>> > existing table already has a column named 'DB_BLOB_HASH'?
>> >
>> > 2> Secondly it will affect e.g. the maximum row size
>> >
>> > For this I can increase reclength by 4 in the pack_header function
>> > (but I am not sure whether this will work or not).
>> >
>> > 3> And external tools like backup
>> >
>> > Here I am assuming that by backup you mean mysqldump. I tried this by
>> > creating a table with a blob column (with a length given) and then
>> > switching the mariadb binary to the changed code; the .sql backup
>> > file was working fine.
>> >
>> > But I think there are more problems with this prototype:
>> > 1> An existing table with a column named 'DB_BLOB_HASH'. We can solve
>> > this as written above, but some code in the patch compares the first
>> > column name with 'DB_BLOB_HASH', which will create a problem. Sir,
>> > can we add a flag in dict_table_t->flag so that we can reliably check
>> > for the blob hash column?
>> >
>> > 2> A worse problem: what happens if the data is different but the CRC
>> > hash of the data is the same? This will occur once we have a medium
>> > amount of data, say 100 thousand records. To reduce the risk we can:
>> >
>> > 2.1> use CRC-64, which reduces the probability
>> > 2.2> use two different hash algorithms and treat the pair of hashes
>> > as the uniqueness key, making the collision probability very small
>> >
>> > But the main problem is what happens if a collision does occur. To
>> > solve this I have two ideas:
>> > 2.3> like 2.2, but instead of two algorithms use one and add a column
>> > counting how many times this hash is repeated; then the collision
>> > frequency will be zero. I think this will work, but it needs testing.
>> > 2.4> the whole prototype is wrong; start again with a different
>> > method. Can you please suggest a different way of doing this?
>> > Regards
>> > sachin
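Idea 2.3 above (one hash algorithm plus a counter of how many times the hash has occurred) can be sketched in a few lines. This is only a toy illustration, not the patch's code; a deliberately weak byte-sum hash stands in for CRC so a collision is easy to force:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Weak stand-in hash (sum of bytes) so collisions are demonstrable;
// the real design would use CRC-32 or CRC-64.
inline uint32_t toy_hash(const std::string& s) {
    uint32_t h = 0;
    for (unsigned char c : s) h += c;
    return h;
}

// Assigns each inserted blob a composite key (hash, counter): the counter
// is how many earlier rows produced the same hash, so two colliding blobs
// still get distinct clustered-index keys.
class HashKeyGenerator {
    std::map<uint32_t, uint16_t> seen_;  // hash -> times already assigned
public:
    std::pair<uint32_t, uint16_t> next_key(const std::string& blob) {
        uint32_t h = toy_hash(blob);
        return {h, seen_[h]++};   // first use gets counter 0, then 1, ...
    }
};
```

For example, "ab" and "ba" collide under the byte-sum hash but receive the distinct keys (195, 0) and (195, 1); that is the uniqueness property a clustered key needs, at the cost of tracking the repeat count per hash.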
>> >
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc:
>> Date: Thu, 14 Apr 2016 20:53:51 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 13, Sachin Setia wrote:
>> > Hello Sergei,
>> > Sorry, I did not see your mail. Actually I was thinking of something
>> > like this before implementing the prototype, but the closer I am to
>> > InnoDB, the more performance I will get. I will definitely think
>> > about it.
>>
>> Great!
>>
>> Could you please tell me (mailing list, that is) what you think before
>> next Monday (before April 18h, that is)?
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Fri, 15 Apr 2016 18:37:15 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi Sergei!
>>
>> Actually I was going through the MySQL source code for long unique
>> constraints. In file sql_tmp_table.cc, in function create_tmp_table,
>> they make a new field and a new key (hash_key) and pass this table
>> object to the storage engine. They refer to this field as a hash field.
>> At insert time they call bool check_unique_constraint(TABLE *table),
>> which first calculates the hash and stores it in the field, then looks
>> for a duplicate hash, fetching rows with ha_index_next_same and
>> comparing the records.
>>
>> We can do the same thing in MariaDB by adding one more field and key in
>> mysql_prepare_create_table. There we check for a blob with unlimited
>> length, or a varchar longer than the storage engine's internal limit.
>> By doing this in mysql_prepare_create_table there will be no issues of
>> frm file inconsistency.
>>
>> In case of insert, we first fill the hash field in the fill_record
>> function of sql_base.cc by calculating the hash. Then we look the hash
>> up in the index using ha_index_read_map; if the returned value is zero
>> we compare the two records, and if they match we throw an error. I am
>> not sure where to place this code, in fill_record or later; or I can
>> simply fill the hash field in fill_record and check for duplicates
>> later on.
>>
>> Currently I am not sure how to hide columns from the user. Sir, can you
>> suggest where to look?
>>
>> But there is one problem: we can make a unique key by this approach but
>> not a primary key, because the primary key is clustered and hashes can
>> collide, so I think we can't use the hash field as the primary key. To
>> overcome this problem I have one idea: instead of storing just the
>> hash, we can make the hash field 10 bytes long and in the last two
>> bytes store a short int which tells how many times the hash is
>> repeated; this can make the hash unique in case of a collision. And we
>> are not doing extra computation, because we already retrieve all
>> records with the same hash. What do you think of this idea?
>> And there is one more problem: how to make it a foreign key.
>>
>> Will send you prototype code tomorrow.
>> Regards
>> sachin
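The insert-time check described above (compute the hash, fetch all rows with the same hash, then compare the actual records so hash collisions do not produce false duplicates) can be sketched with an in-memory stand-in for the hash index. This shows only the control flow; in the server the lookup would go through ha_index_read_map / ha_index_next_same and a field comparison:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in for the index on the hidden hash field: maps hash -> stored
// blob values. A deliberately weak byte-sum hash makes collisions easy
// to demonstrate.
class UniqueBlobChecker {
    std::unordered_multimap<uint32_t, std::string> index_;
    static uint32_t hash(const std::string& s) {
        uint32_t h = 0;
        for (unsigned char c : s) h += c;
        return h;
    }
public:
    // Returns false (duplicate) if an equal blob already exists;
    // otherwise stores the blob and returns true.
    bool insert(const std::string& blob) {
        uint32_t h = hash(blob);
        auto range = index_.equal_range(h);      // all rows with this hash
        for (auto it = range.first; it != range.second; ++it)
            if (it->second == blob)              // full value comparison
                return false;                    // genuine duplicate
        index_.emplace(h, blob);                 // new hash or collision: ok
        return true;
    }
};
```

Note that "ab" and "ba" collide under the byte-sum hash but are both accepted, while inserting "ab" twice is rejected: the hash only narrows the candidates, and the byte comparison decides.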
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Sat, 16 Apr 2016 19:37:45 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi Sergei!
>> As I already told you, I was building a prototype. It is somewhat
>> complete, apart from one thing: comparing two field values. The
>> difficulty is how to get the data length of a field from
>> table->record[1]. I will try to solve it.
>> One more thing: I found out how MySQL hides the field. For example,
>> consider three fields: hash, data, data. The MySQL field pointer points
>> at the second field, not at the hash field, and the hash field's ptr is
>> stored in table->hash_field.
>> Can we do something similar to store hash fields (an array of hashes in
>> case of more than one unique)? But will adding a member variable cause
>> problems? What do you think?
>> Regards
>> sachin
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Mon, 18 Apr 2016 20:12:39 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hello Sergei!
>> Hi, actually I was thinking about how to implement a blob as a foreign
>> key. A foreign key has to be unique, which we can already implement.
>> To make it a foreign key we can either store the unique hash or the
>> whole blob column.
>> But I am not sure many people want to copy such long blob data into
>> the referencing table.
>> The second option would be to use the blob hash as the reference key.
>> But the user cannot directly use the hash as a reference key, because
>> it is hidden.
>> What I was thinking of is a way for the programmer to use the blob
>> hash explicitly. Suppose the user can directly create a blob hash
>> column and use that column as a primary key or foreign key, like:
>> create table t1(abc blob, blob hash(abc)) // just creates a blob hash
>> column
>> create table t1(abc blob, unique(blob hash(abc))) // creates a unique
>> blob hash column
>> and similarly for primary key and foreign key.
>> Users can enter the hash value themselves if they have a good
>> algorithm, or if they do not give any value we will automatically
>> compute and store the hash. What do you think, sir?
>> Regards
>> sachin
>>
>>
>>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Date: Mon, 18 Apr 2016 22:32:25 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 15, Sachin Setia wrote:
>> > Hi Sergei!
>> >
>> > Actually I was going through the MySQL source code for long unique
>> > constraints. In file sql_tmp_table.cc, in function create_tmp_table,
>> > they make a new field and a new key (hash_key) and pass this table
>> > object to the storage engine. They refer to this field as a hash
>> > field. At insert time they call bool check_unique_constraint(TABLE
>> > *table), which first calculates the hash and stores it in the field,
>> > then looks for a duplicate hash, fetching rows with
>> > ha_index_next_same and comparing the records.
>>
>> Right. Very good!
>>
>> > We can do the same thing in MariaDB by adding one more field and key
>> > in mysql_prepare_create_table, where we check for a blob with
>> > unlimited length or a varchar longer than the storage engine's
>> > internal limit. By doing this in mysql_prepare_create_table there
>> > will be no issues of frm file inconsistency.
>> >
>> > In case of insert, we first fill the hash field in the fill_record
>> > function of sql_base.cc by calculating the hash. Then we look the
>> > hash up using ha_index_read_map; if the returned value is zero we
>> > compare the two records, and if they match we throw an error. I am
>> > not sure where to place this code, in fill_record or later; or I can
>> > simply fill the hash field in fill_record and check for duplicates
>> > later on.
>>
>> MariaDB supports "virtual columns", see
>> https://mariadb.com/kb/en/mariadb/virtual-computed-columns/
>>
>> So, it might be enough to mark this hash column as virtual, and it will
>> be automatically calculated when necessary (in fill_record, etc).
>>
>> > Currently I am not sure how to hide columns from the user. Sir, can
>> > you suggest where to look?
>>
>> This would need a new flag per field, like "hidden". And in all commands
>> that use a list of columns (SHOW, INFORMATION_SCHEMA tables, SELECT *,
>> INSERT table VALUE (...), etc) - this column should be skipped.
>>
>> But don't worry, I don't think this will be a problem for you. You can,
>> of course, start from not hiding this column and implement this hidden
>> flag later.
>>
>> > But there is one problem: we can make a unique key by this approach
>> > but not a primary key, because the primary key is clustered and
>> > hashes can collide, so I think we can't use the hash field as the
>> > primary key.
>>
>> Of course.
>>
>> > To overcome this problem I have one idea: instead of storing just
>> > the hash, we can make the hash field 10 bytes long and in the last
>> > two bytes store a short int which tells how many times the hash is
>> > repeated; this can make the hash unique in case of a collision. And
>> > we are not doing extra computation, because we already retrieve all
>> > records with the same hash. What do you think of this idea? And
>> > there is one more problem: how to make it a foreign key.
>>
>> I would say, there is no need to bother. Let's just say that UNIQUE for
>> blobs is only a constraint, can not be used as a primary key, can not be
>> used as a foreign key. This is a reasonable limitation, I think :)
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Date: Mon, 18 Apr 2016 22:46:42 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 16, Sachin Setia wrote:
>> > Hi Sergei!
>> > As I already told you, I was building a prototype. It is somewhat
>> > complete, apart from one thing: comparing two field values. The
>> > difficulty is how to get the data length of a field from
>> > table->record[1]. I will try to solve it.
>>
>> Do you mean, a length of the blob value?
>> Try field->get_length()
>>
>> > One more thing: I found out how MySQL hides the field. For example,
>> > consider three fields: hash, data, data. The MySQL field pointer
>> > points at the second field, not at the hash field, and the hash
>> > field's ptr is stored in table->hash_field. Can we do something
>> > similar to store hash fields (an array of hashes in case of more
>> > than one unique)? But will adding a member variable cause problems?
>> > What do you think?
>>
>> Well, yes, that would work.
>> But I see a few other features where field hiding might be useful,
>> and in those cases this simple trick of moving the table->field pointer
>> will not work, unfortunately.
>>
>> So we might need something more complex. But, again, this is not the
>> most important part of this project, so don't start with it.
>>
>> In fact, I'm rather interested to know about the adaptive hashing that
>> you've mentioned. Can you tell me more about it, please?
>>
>> > diff --git a/sql/sql_base.cc b/sql/sql_base.cc
>> > index ac2162b..291a3e2 100644
>> > --- a/sql/sql_base.cc
>> > +++ b/sql/sql_base.cc
>> > @@ -8956,6 +8964,31 @@ fill_record(THD *thd, TABLE *table, Field **ptr, List<Item> &values,
>> >          goto err;
>> >        field->set_explicit_default(value);
>> >      }
>> > +    table->file->ha_index_init(0, 0);
>> > +    res= table->file->ha_index_read_map(table->record[1], hash_field->ptr,
>> > +                                        HA_WHOLE_KEY, HA_READ_KEY_EXACT);
>> > +    while (!res)
>> > +    {
>> > +      // compare the record; not sure how to compare it, so just
>> > +      // assume it works
>> > +      diff= table->record[1] - table->record[0];
>> > +      src_length= blob_field->data_length();
>> > +      // dest_length= blob_field->data_length(table->record[1]);
>> > +      // I don't know how to get the length from record[1], so I am
>> > +      // unable to do this; then we could compare the records using
>> > +      // field->cmp_max. This is the MySQL code:
>> > +      /* if (!(table->distinct ?
>> > +             table_rec_cmp(table) :
>> > +             group_rec_cmp(table->group, table->record[0],
>> > +                           table->record[1])))
>> > +        return false; // skip it
>> > +      res= table->file->ha_index_next_same(table->record[1],
>> > +                                           table->hash_field->ptr,
>> > +                                           sizeof(hash)); */
>> > +      // fetch the next record
>> > +      res= table->file->ha_index_next_same(table->record[1],
>> > +                                           hash_field->ptr,
>> > +                                           8);
>> > +    }
>>
>> Not quite: you shouldn't check unique constraints in fill_record(). The
>> check should happen when the row is actually inserted or updated. A
>> good place for this check, I'd say, is handler::ha_write_row() and
>> handler::ha_update_row().
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Date: Mon, 18 Apr 2016 22:51:57 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 18, Sachin Setia wrote:
>> > Hi, actually I was thinking about how to implement a blob as a
>> > foreign key. A foreign key has to be unique, which we can already
>> > implement. To make it a foreign key we can either store the unique
>> > hash or the whole blob column. But I am not sure many people want to
>> > copy such long blob data into the referencing table.
>>
>> Agree :)
>>
>> > The second option would be to use the blob hash as the reference
>> > key. But the user cannot directly use the hash as a reference key,
>> > because it is hidden. What I was thinking of is a way for the
>> > programmer to use the blob hash explicitly. Suppose the user can
>> > directly create a blob hash column and use that column as a primary
>> > key or foreign key, like:
>> > create table t1(abc blob, blob hash(abc)) // just creates a blob
>> > hash column
>> > create table t1(abc blob, unique(blob hash(abc))) // creates a
>> > unique blob hash column
>> > and similarly for primary key and foreign key. Users can enter the
>> > hash value themselves if they have a good algorithm, or if they do
>> > not give any value we will automatically compute and store the hash.
>> > What do you think, sir?
>>
>> Mixed feelings. First, I wrote in an earlier email about using virtual
>> columns for that. In this line of thoughts, a user-specified hash
>> function is absolutely possible and logical.
>>
>> On the other hand, I don't see why anyone would need that - a hash is
>> not guaranteed to be unique, no matter what algorithm one uses.
>>
>> And, as I wrote in an earlier email, I don't think that primary/foreign
>> keys on blobs is a feature worth spending time on.
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>
Follow ups
References
-
InnoDB blob for primary key
From: Sachin Setia, 2016-04-11
-
Re: InnoDB blob for primary key
From: Jan Lindström, 2016-04-12
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-13
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-13
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-15
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-16
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-19