maria-developers team mailing list archive
Message #09510
Re: InnoDB blob for primary key
Hello Sergei!

I am sorry, I pressed Ctrl+Enter, so the first message went out incomplete.

Actually, my first thought was to store the hash in a hash table instead of a
B+-tree. But then I realized that we may want range queries on blob fields,
and with a hash table a range query would take O(n) time. While reading the
MySQL manual I came across the adaptive hash index: it monitors queries, and
if it finds many queries using IN or equality comparisons, it builds a hash
index on top of the B+-tree index. I want to implement the same thing in
MyISAM, but I am not sure it is a good idea. What do you think?

Currently I am reading about virtual columns; if I make some progress I will
let you know.

Regards,
Sachin
On Tue, Apr 19, 2016 at 7:10 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
wrote:
> Hello Sergei!
> By adaptive hashing I mean something like the InnoDB adaptive hash index:
> http://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash.html
>
>
> On Tue, Apr 19, 2016 at 6:55 PM, sachin setiya <sachinsetia1001@xxxxxxxxx>
> wrote:
>
>> Hello
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Mon, 11 Apr 2016 15:03:24 +0530
>> Subject: InnoDB blob for primary key
>> Hello Developers,
>> This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>> There are three main problems:
>> 1. Unique blob
>> 2. Blob foreign key
>> 3. Blob primary key
>>
>> 1. For a unique blob we can simply store the hash in a non-clustered index.
>> 2. Blob foreign key: I am currently working on it.
>> 3. Blob primary key: for this I thought we could create a 4-byte column
>> which stores the hash of the blob primary key. Internally this column
>> would act as the primary key and as the key of the clustered index. I have
>> already tested this successfully; here is my output:
>>
>> MariaDB [sachin]> create table t4 (abc blob primary key);
>> Query OK, 0 rows affected (0.10 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> Query OK, 1 row affected (0.01 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> Query OK, 1 row affected (0.00 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>
>> MariaDB [sachin]> desc t4;
>> +-------+------+------+-----+---------+-------+
>> | Field | Type | Null | Key | Default | Extra |
>> +-------+------+------+-----+---------+-------+
>> | abc | blob | NO | PRI | NULL | |
>> +-------+------+------+-----+---------+-------+
>> 1 row in set (0.01 sec)
>>
>> MariaDB [sachin]> select * from t4;
>> +---------------+
>> | abc |
>> +---------------+
>> | sachin |
>> | sachin setiya |
>> +---------------+
>> 2 rows in set (0.01 sec)
>>
>> @Sergei: I have already invested around two months in MariaDB, so it no
>> longer matters to me whether I am selected for GSoC; I want to work on
>> InnoDB blob uniques starting today. Please allow me to do this.
>>
>> I am including the patch file and the t4.ibd and t4.frm files.
>> Regards,
>> sachin
>>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: "Jan Lindström" <jan.lindstrom@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: "maria-developers@xxxxxxxxxxxxxxxxxxx" <
>> maria-developers@xxxxxxxxxxxxxxxxxxx>
>> Date: Tue, 12 Apr 2016 14:22:29 +0300
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi,
>>
>> Sachin, naturally you may continue; this is open source. Please read
>> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
>> From the InnoDB point of view there is an issue if you add a new
>> system-generated column to the row. Firstly, existing tables will not have
>> that column, and secondly it will affect e.g. the maximum row size and
>> external tools like backup.
>>
>> R: Jan
>>
>> On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx
>> > wrote:
>>
>>> Hello Developers,
>>> This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>>> There are three main problems:
>>> 1. Unique blob
>>> 2. Blob foreign key
>>> 3. Blob primary key
>>>
>>> 1. For a unique blob we can simply store the hash in a non-clustered index.
>>> 2. Blob foreign key: I am currently working on it.
>>> 3. Blob primary key: for this I thought we could create a 4-byte column
>>> which stores the hash of the blob primary key. Internally this column
>>> would act as the primary key and as the key of the clustered index. I
>>> have already tested this successfully; here is my output:
>>>
>>> MariaDB [sachin]> create table t4 (abc blob primary key);
>>> Query OK, 0 rows affected (0.10 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin');
>>> Query OK, 1 row affected (0.01 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin');
>>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>>> Query OK, 1 row affected (0.00 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>>
>>> MariaDB [sachin]> desc t4;
>>> +-------+------+------+-----+---------+-------+
>>> | Field | Type | Null | Key | Default | Extra |
>>> +-------+------+------+-----+---------+-------+
>>> | abc | blob | NO | PRI | NULL | |
>>> +-------+------+------+-----+---------+-------+
>>> 1 row in set (0.01 sec)
>>>
>>> MariaDB [sachin]> select * from t4;
>>> +---------------+
>>> | abc |
>>> +---------------+
>>> | sachin |
>>> | sachin setiya |
>>> +---------------+
>>> 2 rows in set (0.01 sec)
>>>
>>> @Sergei: I have already invested around two months in MariaDB, so it no
>>> longer matters to me whether I am selected for GSoC; I want to work on
>>> InnoDB blob uniques starting today. Please allow me to do this.
>>>
>>> I am including the patch file and the t4.ibd and t4.frm files.
>>> Regards,
>>> sachin
>>>
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~maria-developers
>>> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~maria-developers
>>> More help : https://help.launchpad.net/ListHelp
>>>
>>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <vuvova@xxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc:
>> Date: Wed, 13 Apr 2016 00:04:22 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> Just a very quick comment now. We're having a company meeting
>> (and a meetup -
>> http://www.meetup.com/MariaDB-Developers-Berlin-Meetup/events/230026151/)
>> in Berlin, I'll send you a longer reply when this is all over (in a
>> couple of days).
>>
>> It might be an interesting option not to hack InnoDB, but to add
>> this feature (long UNIQUE keys) on the upper level. That is, in sql/
>> directory. Automatically add a new BIGINT column, but hide it from
>> the user (not show it in SHOW COLUMNS and in SELECT *). Create
>> an index for it (also hidden). Put the blob's hash into this column.
>> And on collisions retrieve both rows and compare blobs. Just like MyISAM
>> is doing, but not on the storage engine level, but in the
>> engine-independent fashion, something that works for all engines.
>>
>> Want to look at this approach?
>>
>> MySQL 5.7 has something similar (long unique constraints on the SQL
>> layer), but they've done it only for internal temporary tables (just as
>> MyISAM's MI_UNIQUE is only used for internal temporary tables), and it's
>> not directly available to the user. Still, it might give you some hints
>> about how to implement this feature on the SQL layer.
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>> On Apr 11, Sachin Setia wrote:
>> > Hello Developers,
>> > This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>> > There are three main problems:
>> > 1. Unique blob
>> > 2. Blob foreign key
>> > 3. Blob primary key
>> >
>> > 1. For a unique blob we can simply store the hash in a non-clustered index.
>> > 2. Blob foreign key: I am currently working on it.
>> > 3. Blob primary key: for this I thought we could create a 4-byte column
>> > which stores the hash of the blob primary key. Internally this column
>> > would act as the primary key and as the key of the clustered index. I
>> > have already tested this successfully; here is my output:
>> >
>> > MariaDB [sachin]> create table t4 (abc blob primary key);
>> > Query OK, 0 rows affected (0.10 sec)
>> >
>> > MariaDB [sachin]> insert into t4 values('sachin');
>> > Query OK, 1 row affected (0.01 sec)
>> >
>> > MariaDB [sachin]> insert into t4 values('sachin');
>> > ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> > MariaDB [sachin]> insert into t4 values('sachin setiya');
>> > Query OK, 1 row affected (0.00 sec)
>> >
>> > MariaDB [sachin]> insert into t4 values('sachin setiya');
>> > ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>> >
>> > MariaDB [sachin]> desc t4;
>> > +-------+------+------+-----+---------+-------+
>> > | Field | Type | Null | Key | Default | Extra |
>> > +-------+------+------+-----+---------+-------+
>> > | abc | blob | NO | PRI | NULL | |
>> > +-------+------+------+-----+---------+-------+
>> > 1 row in set (0.01 sec)
>> >
>> > MariaDB [sachin]> select * from t4;
>> > +---------------+
>> > | abc |
>> > +---------------+
>> > | sachin |
>> > | sachin setiya |
>> > +---------------+
>> > 2 rows in set (0.01 sec)
>> >
>> > @Sergei: I have already invested around two months in MariaDB, so it no
>> > longer matters to me whether I am selected for GSoC; I want to work on
>> > InnoDB blob uniques starting today. Please allow me to do this.
>> >
>> > I am including the patch file and the t4.ibd and t4.frm files.
>> > Regards,
>> > sachin
>>
>> > diff --git a/sql/sql_table.cc b/sql/sql_table.cc
>> > index dfce503..efd6f22 100644
>> > --- a/sql/sql_table.cc
>> > +++ b/sql/sql_table.cc
>> > @@ -3244,7 +3244,7 @@ mysql_prepare_create_table(THD *thd,
>> HA_CREATE_INFO *create_info,
>> > !(sql_field->charset= find_bin_collation(sql_field->charset)))
>> > DBUG_RETURN(TRUE);
>> >
>> > - /*
>> > + /*
>> > Convert the default value from client character
>> > set into the column character set if necessary.
>> > */
>> > @@ -3874,11 +3874,11 @@ mysql_prepare_create_table(THD *thd,
>> HA_CREATE_INFO *create_info,
>> > if (f_is_geom(sql_field->pack_flag) && sql_field->geom_type
>> ==
>> > Field::GEOM_POINT)
>> > column->length= MAX_LEN_GEOM_POINT_FIELD;
>> > - if (!column->length)
>> > - {
>> > - my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0),
>> column->field_name.str);
>> > - DBUG_RETURN(TRUE);
>> > - }
>> > +// if (!column->length) //change
>> > +// {
>> > +// my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0),
>> column->field_name.str);
>> > +// DBUG_RETURN(TRUE);
>> > +// }
>> > }
>> > #ifdef HAVE_SPATIAL
>> > if (key->type == Key::SPATIAL)
>> > @@ -3992,9 +3992,9 @@ mysql_prepare_create_table(THD *thd,
>> HA_CREATE_INFO *create_info,
>> > }
>> > else if (key_part_length == 0 && (sql_field->flags &
>> NOT_NULL_FLAG))
>> > {
>> > - my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
>> > - column->field_name.str);
>> > - DBUG_RETURN(TRUE);
>> > +// my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
>> > +// column->field_name.str);
>> > +// DBUG_RETURN(TRUE);
>> > }
>> > if (key_part_length > file->max_key_part_length() &&
>> > key->type != Key::FULLTEXT)
>> > diff --git a/storage/innobase/dict/dict0dict.cc
>> b/storage/innobase/dict/dict0dict.cc
>> > index c51deb0..f6e94af 100644
>> > --- a/storage/innobase/dict/dict0dict.cc
>> > +++ b/storage/innobase/dict/dict0dict.cc
>> > @@ -1315,7 +1315,7 @@ dict_table_add_to_cache(
>> > #define BIG_ROW_SIZE 1024
>> >
>> > ut_ad(mutex_own(&(dict_sys->mutex)));
>> > -
>> > +
>> > dict_table_add_system_columns(table, heap);
>> >
>> > table->cached = TRUE;
>> > diff --git a/storage/xtradb/dict/dict0dict.cc
>> b/storage/xtradb/dict/dict0dict.cc
>> > index 206038d..d8223d7 100644
>> > --- a/storage/xtradb/dict/dict0dict.cc
>> > +++ b/storage/xtradb/dict/dict0dict.cc
>> > @@ -1251,7 +1251,7 @@ dict_table_open_on_name(
>> > Adds system columns to a table object. */
>> > UNIV_INTERN
>> > void
>> > -dict_table_add_system_columns(
>> > +dict_table_add_system_columns( //work
>> > /*==========================*/
>> > dict_table_t* table, /*!< in/out: table */
>> > mem_heap_t* heap) /*!< in: temporary heap */
>> > @@ -1266,16 +1266,25 @@ dict_table_add_system_columns(
>> > etc.) and as the last columns of the table memory object.
>> > The clustered index will not always physically contain all
>> > system columns. */
>> > -
>> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
>> > +// ,DATA_NOT_NULL,DATA_BLOB_HASH_LEN);
>> > dict_mem_table_add_col(table, heap, "DB_ROW_ID", DATA_SYS,
>> > DATA_ROW_ID | DATA_NOT_NULL,
>> > DATA_ROW_ID_LEN);
>> > #if DATA_ROW_ID != 0
>> > #error "DATA_ROW_ID != 0"
>> > #endif
>> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_SYS,
>> > +//
>> DATA_BLOB_HASH|DATA_NOT_NULL,
>> > +//
>> DATA_BLOB_HASH_LEN);
>> > + //just a rough trick to get it working
>> > +
>> > +// if(*table->name=='y'){
>> > dict_mem_table_add_col(table, heap, "DB_TRX_ID", DATA_SYS,
>> > DATA_TRX_ID | DATA_NOT_NULL,
>> > DATA_TRX_ID_LEN);
>> > +// }
>> > +
>> > #if DATA_TRX_ID != 1
>> > #error "DATA_TRX_ID != 1"
>> > #endif
>> > @@ -1310,7 +1319,11 @@ dict_table_add_to_cache(
>> > ulint row_len;
>> >
>> > ut_ad(dict_lru_validate());
>> > -
>> > +// bool x =false;//break
>> > +// if(x){
>> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
>> > +//
>> ,DATA_NOT_NULL,4);
>> > +// }
>> > /* The lower limit for what we consider a "big" row */
>> > #define BIG_ROW_SIZE 1024
>> >
>> > @@ -3075,7 +3088,7 @@ dict_index_build_internal_clust(
>> > /* Copy the fields of index */
>> > dict_index_copy(new_index, index, table, 0, index->n_fields);
>> >
>> > - if (dict_index_is_univ(index)) {
>> > + if (dict_index_is_univ(index)) { //work
>> > /* No fixed number of fields determines an entry uniquely
>> */
>> >
>> > new_index->n_uniq = REC_MAX_N_FIELDS;
>> > @@ -3124,7 +3137,7 @@ dict_index_build_internal_clust(
>> > DATA_ROLL_PTR),
>> > 0);
>> >
>> > - for (i = 0; i < trx_id_pos; i++) {
>> > + for (i = 0; i < trx_id_pos; i++) {//work i think i need
>> to do some stuff
>> >
>> > ulint fixed_size = dict_col_get_fixed_size(
>> > dict_index_get_nth_col(new_index, i),
>> > diff --git a/storage/xtradb/dict/dict0load.cc
>> b/storage/xtradb/dict/dict0load.cc
>> > index d6ed8ac..8cc59f8 100644
>> > --- a/storage/xtradb/dict/dict0load.cc
>> > +++ b/storage/xtradb/dict/dict0load.cc
>> > @@ -2276,7 +2276,7 @@ dictionary cache.
>> > ibd_file_missing flag TRUE in the table object we return */
>> > UNIV_INTERN
>> > dict_table_t*
>> > -dict_load_table(
>> > +dict_load_table( //work for corruped table
>> > /*============*/
>> > const char* name, /*!< in: table name in the
>> > databasename/tablename format */
>> > @@ -2337,7 +2337,7 @@ dict_load_table(
>> > btr_pcur_close(&pcur);
>> > mtr_commit(&mtr);
>> > mem_heap_free(heap);
>> > -
>> > +
>> > return(NULL);
>> > }
>> >
>> > diff --git a/storage/xtradb/dict/dict0mem.cc
>> b/storage/xtradb/dict/dict0mem.cc
>> > index a4f6cd6..6cbe556 100644
>> > --- a/storage/xtradb/dict/dict0mem.cc
>> > +++ b/storage/xtradb/dict/dict0mem.cc
>> > @@ -101,11 +101,11 @@ dict_mem_table_create(
>> > memcpy(table->name, name, strlen(name) + 1);
>> > table->is_system_db = dict_mem_table_is_system(table->name);
>> > table->space = (unsigned int) space;
>> > - table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS);
>> > -
>> > + table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS); //work
>> > +
>> > table->cols = static_cast<dict_col_t*>(
>> > mem_heap_alloc(heap,
>> > - (n_cols + DATA_N_SYS_COLS)
>> > + (n_cols + DATA_N_SYS_COLS)//work
>> > * sizeof(dict_col_t)));
>> >
>> > ut_d(table->magic_n = DICT_TABLE_MAGIC_N);
>> > diff --git a/storage/xtradb/handler/ha_innodb.cc
>> b/storage/xtradb/handler/ha_innodb.cc
>> > index e5edf76..3a8dc91 100644
>> > --- a/storage/xtradb/handler/ha_innodb.cc
>> > +++ b/storage/xtradb/handler/ha_innodb.cc
>> > @@ -5983,7 +5983,7 @@ ha_innobase::open(
>> > ibool par_case_name_set = FALSE;
>> > char par_case_name[FN_REFLEN];
>> > dict_err_ignore_t ignore_err = DICT_ERR_IGNORE_NONE;
>> > -
>> > + ibool is_blob_primary_key=false;
>> > DBUG_ENTER("ha_innobase::open");
>> >
>> > UT_NOT_USED(mode);
>> > @@ -6031,13 +6031,20 @@ ha_innobase::open(
>> >
>> > /* Get pointer to a table object in InnoDB dictionary cache */
>> > ib_table = dict_table_open_on_name(norm_name, FALSE, TRUE,
>> ignore_err);
>> > -
>> > + //int number_of_columns = dict_table_get_n_user_cols(ib_table);
>> > + //if(ib_table->)
>> > + if(ib_table){
>> > + int number_of_columns =
>> dict_table_get_n_user_cols(ib_table);
>> > +
>> if(!innobase_strcasecmp(ib_table->col_names,"DB_BLOB_HASH")){
>> > + is_blob_primary_key=true;
>> > + number_of_columns--;//stodo i think we need to
>> add flag for blob primary key to make checking easier
>> > + }
>> > if (ib_table
>> > - && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID)
>> > - && table->s->stored_fields !=
>> dict_table_get_n_user_cols(ib_table))
>> > + && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID))
>> > + && table->s->stored_fields != number_of_columns) //work
>> > || (DICT_TF2_FLAG_IS_SET(ib_table,
>> DICT_TF2_FTS_HAS_DOC_ID)
>> > && (table->s->fields
>> > - != dict_table_get_n_user_cols(ib_table) - 1)))) {
>> > + != dict_table_get_n_user_cols(ib_table) - 1))) {
>> > ib_logf(IB_LOG_LEVEL_WARN,
>> > "table %s contains %lu user defined columns "
>> > "in InnoDB, but %lu columns in MySQL. Please "
>> > @@ -6054,7 +6061,7 @@ ha_innobase::open(
>> > ib_table = NULL;
>> > is_part = NULL;
>> > }
>> > -
>> > + }
>> > if (UNIV_UNLIKELY(ib_table && ib_table->is_corrupt &&
>> > srv_pass_corrupt_table <= 1)) {
>> > free_share(share);
>> > @@ -6254,12 +6261,12 @@ ha_innobase::open(
>> > /* Looks like MySQL-3.23 sometimes has primary key number != 0 */
>> > primary_key = table->s->primary_key;
>> > key_used_on_scan = primary_key;
>> > -
>> > + if(!is_blob_primary_key){
>> > if (!innobase_build_index_translation(table, ib_table, share)) {
>> > sql_print_error("Build InnoDB index translation table
>> for"
>> > " Table %s failed", name);
>> > }
>> > -
>> > + }
>> > /* Allocate a buffer for a 'row reference'. A row reference is
>> > a string of bytes of length ref_length which uniquely specifies
>> > a row in our table. Note that MySQL may also compare two row
>> > @@ -6314,7 +6321,11 @@ ha_innobase::open(
>> > for (uint i = 0; i < table->s->keys; i++) {
>> > dict_index_t* index;
>> > index = innobase_get_index(i);
>> > - if (dict_index_is_clust(index)) {
>> > + if (dict_index_is_clust(index)) { //work
>> > + if(is_blob_primary_key){
>> > + ref_length=4; //hash length
>> > + continue;
>> > + }
>> > ref_length =
>> >
>> table->key_info[i].key_length;
>> > }
>> > @@ -7795,7 +7806,8 @@ build_template_field(
>> >
>> > //ut_ad(field == table->field[i]);
>> > ut_ad(clust_index->table == index->table);
>> > -
>> > +//work here we go
>> > +//todo it should get the next column defs
>> > col = dict_table_get_nth_col(index->table, i);
>> >
>> > templ = prebuilt->mysql_template + prebuilt->n_template++;
>> > @@ -8138,7 +8150,8 @@ ha_innobase::build_template(
>> > continue;
>> > }
>> > }
>> > -
>> > +
>> if(!innobase_strcasecmp(clust_index->fields->name,"DB_BLOB_HASH"))
>> > + i++;
>> > build_template_field(prebuilt, clust_index, index,
>> > table, field, i);
>> > }
>> > @@ -11065,6 +11078,7 @@ create_table_def(
>> > ulint doc_id_col = 0;
>> > ibool has_doc_id_col = FALSE;
>> > mem_heap_t* heap;
>> > + bool is_blob_primary_key=false;
>> >
>> > DBUG_ENTER("create_table_def");
>> > DBUG_PRINT("enter", ("table_name: %s", table_name));
>> > @@ -11126,6 +11140,12 @@ create_table_def(
>> > table->fts->doc_col = doc_id_col;
>> > }
>> > } else {
>> > + if(form->key_info[0].key_part->length==0 && //change
>> > + form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
>> > + s_cols++;
>> > + is_blob_primary_key=true;
>> > + }
>> > +
>> > table = dict_mem_table_create(table_name, 0, s_cols,
>> > flags, flags2);
>> > }
>> > @@ -11143,6 +11163,13 @@ create_table_def(
>> > table->data_dir_path = NULL;
>> > }
>> > heap = mem_heap_create(1000);
>> > + //work
>> > + //add one more column for hash
>> > +
>> > + if(is_blob_primary_key){
>> > + dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",
>> > + DATA_INT,1283,4);
>> > + }
>> >
>> > for (i = 0; i < n_cols; i++) {
>> > Field* field = form->field[i];
>> > @@ -11222,7 +11249,7 @@ create_table_def(
>> > err = DB_ERROR;
>> > goto error_ret;
>> > }
>> > -
>> > +
>> > dict_mem_table_add_col(table, heap,
>> > field->field_name,
>> > col_type,
>> > @@ -11347,7 +11374,7 @@ create_index(
>> > the length of the key part versus the column. */
>> >
>> > Field* field = NULL;
>> > -
>> > +//work
>> > for (ulint j = 0; j < form->s->fields; j++) {
>> >
>> > field = form->field[j];
>> > @@ -11396,7 +11423,12 @@ create_index(
>> > }
>> >
>> > field_lengths[i] = key_part->length;
>> > -
>> > + if(form->key_info[0].key_part->length==0 && //change
>> > + form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
>> > + dict_mem_index_add_field(
>> > + index,"DB_BLOB_HASH", 0);
>> > + continue;
>> > + }
>> > dict_mem_index_add_field(
>> > index, key_part->field->field_name, prefix_len);
>> > }
>> > diff --git a/storage/xtradb/include/data0type.h
>> b/storage/xtradb/include/data0type.h
>> > index 111664b..0f510e8 100644
>> > --- a/storage/xtradb/include/data0type.h
>> > +++ b/storage/xtradb/include/data0type.h
>> > @@ -147,6 +147,9 @@ be less than 256 */
>> > #define DATA_ROLL_PTR 2 /* rollback data pointer: 7 bytes
>> */
>> > #define DATA_ROLL_PTR_LEN 7
>> >
>> > +//#define DATA_BLOB_HASH 3 // hash coloumn for blob primay key
>> > +//#define DATA_BLOB_HASH_LEN 4 //used as a clustered index
>> > +
>> > #define DATA_N_SYS_COLS 3 /* number of system columns
>> defined above */
>> >
>> > #define DATA_FTS_DOC_ID 3 /* Used as FTS DOC ID column */
>> > diff --git a/storage/xtradb/que/que0que.cc
>> b/storage/xtradb/que/que0que.cc
>> > index e2dc023..076785d 100644
>> > --- a/storage/xtradb/que/que0que.cc
>> > +++ b/storage/xtradb/que/que0que.cc
>> > @@ -1030,7 +1030,7 @@ que_thr_step(
>> > que_node_print_info(node);
>> > }
>> > #endif
>> > - if (type & QUE_NODE_CONTROL_STAT) {
>> > + if (type & QUE_NODE_CONTROL_STAT) { //improve
>> > if ((thr->prev_node != que_node_get_parent(node))
>> > && que_node_get_next(thr->prev_node)) {
>> >
>> > diff --git a/storage/xtradb/row/row0ins.cc
>> b/storage/xtradb/row/row0ins.cc
>> > index d02e179..c58d7fb 100644
>> > --- a/storage/xtradb/row/row0ins.cc
>> > +++ b/storage/xtradb/row/row0ins.cc
>> > @@ -137,7 +137,7 @@ row_ins_alloc_sys_fields(
>> > /*=====================*/
>> > ins_node_t* node) /*!< in: insert node */
>> > {
>> > - dtuple_t* row;
>> > + dtuple_t* row; //work
>> > dict_table_t* table;
>> > mem_heap_t* heap;
>> > const dict_col_t* col;
>> > @@ -3118,7 +3118,7 @@ row_ins_index_entry_set_vals(
>> >
>> > n_fields = dtuple_get_n_fields(entry);
>> >
>> > - for (i = 0; i < n_fields; i++) {
>> > + for (i = 0; i < n_fields; i++) {//see
>> > dict_field_t* ind_field;
>> > dfield_t* field;
>> > const dfield_t* row_field;
>> > @@ -3169,8 +3169,8 @@ row_ins_index_entry_step(
>> >
>> > ut_ad(dtuple_check_typed(node->row));
>> >
>> > - row_ins_index_entry_set_vals(node->index, node->entry, node->row);
>> > -
>> > + row_ins_index_entry_set_vals(node->index, node->entry,
>> node->row);//explore
>> > +//need to see who sets the entry in row
>> > ut_ad(dtuple_check_typed(node->entry));
>> >
>> > err = row_ins_index_entry(node->index, node->entry, thr);
>> > @@ -3208,7 +3208,7 @@ row_ins_alloc_row_id_step(
>> >
>> > /* Fill in row id value to row */
>> >
>> > - row_id = dict_sys_get_new_row_id();
>> > + row_id = dict_sys_get_new_row_id(); //work
>> >
>> > dict_sys_write_row_id(node->row_id_buf, row_id);
>> > }
>> > diff --git a/storage/xtradb/row/row0mysql.cc
>> b/storage/xtradb/row/row0mysql.cc
>> > index 9427b20..b28e778 100644
>> > --- a/storage/xtradb/row/row0mysql.cc
>> > +++ b/storage/xtradb/row/row0mysql.cc
>> > @@ -510,7 +510,15 @@ row_mysql_store_col_in_innobase_format(
>> >
>> > return(buf);
>> > }
>> > -
>> > +//this is just dummy function
>> > +int dummuy_hash(byte * ptr,int length){
>> > + int val=0;
>> > + for(int i=0;i<length;i++){
>> > + val =val*8+(int )*ptr;
>> > + ptr++;
>> > + }
>> > + return val;
>> > +}
>> > /**************************************************************//**
>> > Convert a row in the MySQL format to a row in the Innobase format.
>> Note that
>> > the function to convert a MySQL format key value to an InnoDB dtuple is
>> > @@ -532,15 +540,22 @@ row_mysql_convert_row_to_innobase(
>> > const mysql_row_templ_t*templ;
>> > dfield_t* dfield;
>> > ulint i;
>> > -
>> > + ibool is_blob_primary_key=false;
>> > + ibool is_first_template=true;
>> > ut_ad(prebuilt->template_type == ROW_MYSQL_WHOLE_ROW);
>> > - ut_ad(prebuilt->mysql_template);
>> > -
>> > - for (i = 0; i < prebuilt->n_template; i++) {
>> > + ut_ad(prebuilt->mysql_template);//work
>> > +
>> if(!innobase_strcasecmp(prebuilt->table->col_names,"DB_BLOB_HASH")){
>> > + is_blob_primary_key =true;
>> > + }
>> > + for (i = 0; i < prebuilt->n_template; i++) { //change add one
>> >
>> > templ = prebuilt->mysql_template + i;
>> > + if(is_blob_primary_key&&is_first_template){
>> > + dfield = dtuple_get_nth_field(row, i+1);
>> > + is_first_template=false;
>> > + }else{
>> > dfield = dtuple_get_nth_field(row, i);
>> > -
>> > + }
>> > if (templ->mysql_null_bit_mask != 0) {
>> > /* Column may be SQL NULL */
>> >
>> > @@ -565,6 +580,17 @@ row_mysql_convert_row_to_innobase(
>> > next_column:
>> > ;
>> > }
>> > + if(is_blob_primary_key){
>> > + //get the first field and set the hash
>> > + dfield_t * hash_field=dtuple_get_nth_field(row,0);
>> > + dfield_t * blob_field=dtuple_get_nth_field(row,1);
>> > + byte * hash = static_cast<byte
>> *>(mem_heap_zalloc(prebuilt->heap,4));
>> > + mach_write_to_4(hash,dummuy_hash((byte
>> *)blob_field->data,blob_field->len));
>> > + //4 is hash length
>> > + hash_field->len=4;
>> > + hash_field->data=hash;
>> > + //hash_field->type
>> > + }
>> >
>> > /* If there is a FTS doc id column and it is not user supplied (
>> > generated by server) then assign it a new doc id. */
>> > @@ -831,7 +857,7 @@ row_create_prebuilt(
>> >
>> > prebuilt->search_tuple = dtuple_create(heap,
>> search_tuple_n_fields);
>> >
>> > - ref = dtuple_create(heap, ref_len);
>> > + ref = dtuple_create(heap, ref_len);//work
>> >
>> > dict_index_copy_types(ref, clust_index, ref_len);
>> >
>> > @@ -1057,7 +1083,7 @@ row_get_prebuilt_insert_row(
>> >
>> > dict_table_copy_types(row, table);
>> >
>> > - ins_node_set_new_row(node, row);
>> > + ins_node_set_new_row(node, row);//explore
>> >
>> > prebuilt->ins_graph = static_cast<que_fork_t*>(
>> > que_node_get_parent(
>> > @@ -1353,7 +1379,7 @@ row_insert_for_mysql(
>> > row_get_prebuilt_insert_row(prebuilt);
>> > node = prebuilt->ins_node;
>> >
>> > - row_mysql_convert_row_to_innobase(node->row, prebuilt, mysql_rec);
>> > + row_mysql_convert_row_to_innobase(node->row, prebuilt,
>> mysql_rec);//debug
>> >
>> > savept = trx_savept_take(trx);
>> >
>> > diff --git a/storage/xtradb/row/row0row.cc
>> b/storage/xtradb/row/row0row.cc
>> > index be786f9..a1dd465 100644
>> > --- a/storage/xtradb/row/row0row.cc
>> > +++ b/storage/xtradb/row/row0row.cc
>> > @@ -83,7 +83,7 @@ row_build_index_entry_low(
>> > entry, dict_index_get_n_unique_in_tree(index));
>> > }
>> >
>> > - for (i = 0; i < entry_len; i++) {
>> > + for (i = 0; i < entry_len; i++) { //explore need to see how it
>> works for simple
>> > const dict_field_t* ind_field
>> > = dict_index_get_nth_field(index, i);
>> > const dict_col_t* col
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Wed, 13 Apr 2016 21:17:03 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hello Jan!
>>
>> I went through the problems you raised.
>>
>> 1> Firstly, existing tables will not have that column.
>>
>> Yes, you are right that existing tables would not have the DB_BLOB_HASH
>> column, but this is fine: if they indexed their blob column, they must
>> have provided a blob prefix length, in which case the key_part length is
>> not zero, and the patch can be changed to detect this. That way I can also
>> eliminate another problem: what if an existing table already has a column
>> named 'DB_BLOB_HASH'?
>>
>> 2> Secondly, it will affect e.g. the maximum row size.
>> For this I can increase reclength by 4 in the pack_header function (but I
>> am not sure whether this will work).
>>
>> 3> And external tools like backup.
>> Here I am assuming that by backup you mean mysqldump. I tried this by
>> creating a table with a blob column (with a length given), then switching
>> to the changed MariaDB binary, and the .sql dump file worked fine.
>>
>> But I think there are more problems with this prototype:
>> 1> An existing table with a column named 'DB_BLOB_HASH'. We can solve this
>> as written above, but some code in the patch compares the first column
>> name with 'DB_BLOB_HASH', which will cause problems. Can we add a flag in
>> dict_table_t->flag so that we can reliably check for the blob hash column?
>>
>> 2> This is a very bad problem: what happens if the data is different but
>> the CRC hash of the data is the same? This will occur once we have a
>> medium amount of data, say 100 thousand records. To avoid this we can:
>>
>> 2.1> use CRC-64, which reduces the probability;
>> 2.2> use two different hash algorithms and require both hashes to match
>> for uniqueness, making the collision probability very, very small.
>>
>> But the main problem is what happens when a collision does occur. To
>> handle this I have two ideas:
>> 2.3> like 2.2, but use one algorithm and add a column counting how many
>> times the hash is repeated, so the effective collision frequency becomes
>> zero; I think this will work, but it needs testing;
>> 2.4> the whole prototype is wrong, and I should start again with a
>> different method. Can you please suggest a different way of doing this?
>> Regards,
>> sachin
>>
>> On Tue, Apr 12, 2016 at 4:52 PM, Jan Lindström
>> <jan.lindstrom@xxxxxxxxxxx> wrote:
>> > Hi,
>> >
>> > Sachin, naturally you may continue; this is open source. Please read
>> > https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
>> > From the InnoDB point of view there is an issue if you add a new
>> > system-generated column to the row. Firstly, existing tables will not
>> > have that column, and secondly it will affect e.g. the maximum row size
>> > and external tools like backup.
>> >
>> > R: Jan
>> >
>> > On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <
>> sachinsetia1001@xxxxxxxxx>
>> > wrote:
>> >>
>> >> Hello Developers,
>> >> This is Sachin. I am currently experimenting with blob uniques in InnoDB.
>> >> There are three main problems:
>> >> 1. Unique blob
>> >> 2. Blob foreign key
>> >> 3. Blob primary key
>> >>
>> >> 1. For a unique blob we can simply store the hash in a non-clustered index.
>> >> 2. Blob foreign key: I am currently working on it.
>> >> 3. Blob primary key: for this I thought we could create a 4-byte column
>> >> which stores the hash of the blob primary key. Internally this column
>> >> would act as the primary key and as the key of the clustered index. I
>> >> have already tested this successfully; here is my output:
>> >>
>> >> MariaDB [sachin]> create table t4 (abc blob primary key);
>> >> Query OK, 0 rows affected (0.10 sec)
>> >>
>> >> MariaDB [sachin]> insert into t4 values('sachin');
>> >> Query OK, 1 row affected (0.01 sec)
>> >>
>> >> MariaDB [sachin]> insert into t4 values('sachin');
>> >> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> >> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> >> Query OK, 1 row affected (0.00 sec)
>> >>
>> >> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> >> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>> >>
>> >> MariaDB [sachin]> desc t4;
>> >> +-------+------+------+-----+---------+-------+
>> >> | Field | Type | Null | Key | Default | Extra |
>> >> +-------+------+------+-----+---------+-------+
>> >> | abc | blob | NO | PRI | NULL | |
>> >> +-------+------+------+-----+---------+-------+
>> >> 1 row in set (0.01 sec)
>> >>
>> >> MariaDB [sachin]> select * from t4;
>> >> +---------------+
>> >> | abc |
>> >> +---------------+
>> >> | sachin |
>> >> | sachin setiya |
>> >> +---------------+
>> >> 2 rows in set (0.01 sec)
>> >>
>> >> @Sergei hi! Actually I have invested around 2 months in MariaDB, so
>> >> for me it does not matter now whether I got selected in GSoC; I want
>> >> to work on InnoDB blob uniques starting today. Please, sir, allow me
>> >> to do this.
>> >>
>> >> I am including the patch file and the t4.ibd and t4.frm files.
>> >> Regards
>> >> sachin
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Mailing list: https://launchpad.net/~maria-developers
>> >> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>> >> Unsubscribe : https://launchpad.net/~maria-developers
>> >> More help : https://help.launchpad.net/ListHelp
>> >>
>> >
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Wed, 13 Apr 2016 22:19:44 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hello Sergei,
>> Sorry, I did not see your mail. Actually I was thinking of something
>> like this before implementing the prototype, but the closer I am to
>> InnoDB, the more performance I will get. I will definitely think
>> about it.
>> Regards
>> sachin
>>
>> On Wed, Apr 13, 2016 at 9:17 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> wrote:
>> > Hello Jan!
>> >
>> > Actually I was going through the problems you raised.
>> >
>> > 1> Firstly, existing tables will not have that column
>> >
>> > Yes sir, you are right that existing tables wouldn't have the
>> > DB_BLOB_HASH column, but this is fine: if they indexed their blob
>> > column they must have provided a blob prefix length, so the length of
>> > the key_part will not be zero, and the patch can be changed to detect
>> > this. By doing so I can also eliminate another problem: what if an
>> > existing table already has a column named 'DB_BLOB_HASH'?
>> >
>> > 2> Secondly it will affect e.g. the maximum row size
>> >
>> > For this I can increase reclength by 4 in the pack_header function
>> > (but I am not sure whether this will work or not).
>> >
>> > 3> And external tools like backup
>> >
>> > Here I am assuming that by backup you mean mysqldump. I tried this by
>> > creating a table with a blob column (with a length given) and then
>> > switching the mariadb binary to the changed code; the .sql backup
>> > file was working fine.
>> >
>> > But I think there are more problems with this prototype:
>> > 1> An existing table with a column named 'DB_BLOB_HASH'. We can solve
>> > this as written above, but some code in the patch compares the first
>> > column name with 'DB_BLOB_HASH', which will create a problem. Sir,
>> > can we add a flag in dict_table_t->flag so that we can reliably check
>> > for the blob hash column?
>> >
>> > 2> A worse problem: what happens if the data is different but the CRC
>> > hash of the data is the same? This will occur once we have a medium
>> > amount of data, say 100 thousand records. To reduce the risk we can:
>> >
>> > 2.1> use CRC-64, which reduces the probability
>> > 2.2> use two different hash algorithms and treat the pair of hashes
>> > as the uniqueness key, making the collision probability very small
>> >
>> > But the main problem is what happens if a collision does occur. To
>> > solve this I have two ideas:
>> > 2.3> like 2.2, but instead of two algorithms use one and add a column
>> > counting how many times this hash is repeated; then the collision
>> > frequency will be zero. I think this will work, but it needs testing.
>> > 2.4> the whole prototype is wrong; start again with a different
>> > method. Can you please suggest a different way of doing this?
>> > Regards
>> > sachin
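Idea 2.3 above (one hash algorithm plus a counter of how many times the hash has occurred) can be sketched in a few lines. This is only a toy illustration, not the patch's code; a deliberately weak byte-sum hash stands in for CRC so a collision is easy to force:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Weak stand-in hash (sum of bytes) so collisions are demonstrable;
// the real design would use CRC-32 or CRC-64.
inline uint32_t toy_hash(const std::string& s) {
    uint32_t h = 0;
    for (unsigned char c : s) h += c;
    return h;
}

// Assigns each inserted blob a composite key (hash, counter): the counter
// is how many earlier rows produced the same hash, so two colliding blobs
// still get distinct clustered-index keys.
class HashKeyGenerator {
    std::map<uint32_t, uint16_t> seen_;  // hash -> times already assigned
public:
    std::pair<uint32_t, uint16_t> next_key(const std::string& blob) {
        uint32_t h = toy_hash(blob);
        return {h, seen_[h]++};   // first use gets counter 0, then 1, ...
    }
};
```

For example, "ab" and "ba" collide under the byte-sum hash but receive the distinct keys (195, 0) and (195, 1); that is the uniqueness property a clustered key needs, at the cost of tracking the repeat count per hash.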
>> >
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc:
>> Date: Thu, 14 Apr 2016 20:53:51 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 13, Sachin Setia wrote:
>> > Hello Sergei,
>> > Sorry, I did not see your mail. Actually I was thinking of something
>> > like this before implementing the prototype, but the closer I am to
>> > InnoDB, the more performance I will get. I will definitely think
>> > about it.
>>
>> Great!
>>
>> Could you please tell me (mailing list, that is) what you think before
>> next Monday (before April 18h, that is)?
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Fri, 15 Apr 2016 18:37:15 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi Sergei!
>>
>> Actually I was going through the MySQL source code for long unique
>> constraints. In file sql_tmp_table.cc, in function create_tmp_table,
>> they make a new field and a new key (hash_key) and pass this table
>> object to the storage engine. They refer to this field as a hash field.
>> At insert time they call bool check_unique_constraint(TABLE *table),
>> which first calculates the hash and stores it in the field, then looks
>> for a duplicate hash, fetching rows with ha_index_next_same and
>> comparing the records.
>>
>> We can do the same thing in MariaDB by adding one more field and key in
>> mysql_prepare_create_table. There we check for a blob with unlimited
>> length, or a varchar longer than the storage engine's internal limit.
>> By doing this in mysql_prepare_create_table there will be no issues of
>> frm file inconsistency.
>>
>> In case of insert, we first fill the hash field in the fill_record
>> function of sql_base.cc by calculating the hash. Then we look the hash
>> up in the index using ha_index_read_map; if the returned value is zero
>> we compare the two records, and if they match we throw an error. I am
>> not sure where to place this code, in fill_record or later; or I can
>> simply fill the hash field in fill_record and check for duplicates
>> later on.
>>
>> Currently I am not sure how to hide columns from the user. Sir, can you
>> suggest where to look?
>>
>> But there is one problem: we can make a unique key by this approach but
>> not a primary key, because the primary key is clustered and hashes can
>> collide, so I think we can't use the hash field as the primary key. To
>> overcome this problem I have one idea: instead of storing just the
>> hash, we can make the hash field 10 bytes long and in the last two
>> bytes store a short int which tells how many times the hash is
>> repeated; this can make the hash unique in case of a collision. And we
>> are not doing extra computation, because we already retrieve all
>> records with the same hash. What do you think of this idea?
>> And there is one more problem: how to make it a foreign key.
>>
>> Will send you prototype code tomorrow.
>> Regards
>> sachin
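The insert-time check described above (compute the hash, fetch all rows with the same hash, then compare the actual records so hash collisions do not produce false duplicates) can be sketched with an in-memory stand-in for the hash index. This shows only the control flow; in the server the lookup would go through ha_index_read_map / ha_index_next_same and a field comparison:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in for the index on the hidden hash field: maps hash -> stored
// blob values. A deliberately weak byte-sum hash makes collisions easy
// to demonstrate.
class UniqueBlobChecker {
    std::unordered_multimap<uint32_t, std::string> index_;
    static uint32_t hash(const std::string& s) {
        uint32_t h = 0;
        for (unsigned char c : s) h += c;
        return h;
    }
public:
    // Returns false (duplicate) if an equal blob already exists;
    // otherwise stores the blob and returns true.
    bool insert(const std::string& blob) {
        uint32_t h = hash(blob);
        auto range = index_.equal_range(h);      // all rows with this hash
        for (auto it = range.first; it != range.second; ++it)
            if (it->second == blob)              // full value comparison
                return false;                    // genuine duplicate
        index_.emplace(h, blob);                 // new hash or collision: ok
        return true;
    }
};
```

Note that "ab" and "ba" collide under the byte-sum hash but are both accepted, while inserting "ab" twice is rejected: the hash only narrows the candidates, and the byte comparison decides.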
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Sat, 16 Apr 2016 19:37:45 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi Sergei!
>> As I already told you, I was building a prototype. It is somewhat
>> complete, apart from one thing: comparing two field values. The
>> difficulty is how to get the data length of a field from
>> table->record[1]. I will try to solve it.
>> One more thing: I found out how MySQL hides the field. For example,
>> consider three fields: hash, data, data. The MySQL field pointer points
>> at the second field, not at the hash field, and the hash field's ptr is
>> stored in table->hash_field.
>> Can we do something similar to store hash fields (an array of hashes in
>> case of more than one unique)? But will adding a member variable cause
>> problems? What do you think?
>> Regards
>> sachin
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Cc:
>> Date: Mon, 18 Apr 2016 20:12:39 +0530
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hello Sergei!
>> Hi, actually I was thinking about how to implement a blob as a foreign
>> key. A foreign key has to be unique, which we can already implement.
>> To make it a foreign key we can either store the unique hash or the
>> whole blob column.
>> But I am not sure many people want to copy such long blob data into
>> the referencing table.
>> The second option would be to use the blob hash as the reference key.
>> But the user cannot directly use the hash as a reference key, because
>> it is hidden.
>> What I was thinking of is a way for the programmer to use the blob
>> hash explicitly. Suppose the user can directly create a blob hash
>> column and use that column as a primary key or foreign key, like:
>> create table t1(abc blob, blob hash(abc)) // just creates a blob hash
>> column
>> create table t1(abc blob, unique(blob hash(abc))) // creates a unique
>> blob hash column
>> and similarly for primary key and foreign key.
>> Users can enter the hash value themselves if they have a good
>> algorithm, or if they do not give any value we will automatically
>> compute and store the hash. What do you think, sir?
>> Regards
>> sachin
>>
>>
>>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Date: Mon, 18 Apr 2016 22:32:25 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 15, Sachin Setia wrote:
>> > Hi Sergei!
>> >
>> > Actually I was going through the MySQL source code for long unique
>> > constraints. In file sql_tmp_table.cc, in function create_tmp_table,
>> > they make a new field and a new key (hash_key) and pass this table
>> > object to the storage engine. They refer to this field as a hash
>> > field. At insert time they call bool check_unique_constraint(TABLE
>> > *table), which first calculates the hash and stores it in the field,
>> > then looks for a duplicate hash, fetching rows with
>> > ha_index_next_same and comparing the records.
>>
>> Right. Very good!
>>
>> > We can do the same thing in MariaDB by adding one more field and key
>> > in mysql_prepare_create_table, where we check for a blob with
>> > unlimited length or a varchar longer than the storage engine's
>> > internal limit. By doing this in mysql_prepare_create_table there
>> > will be no issues of frm file inconsistency.
>> >
>> > In case of insert, we first fill the hash field in the fill_record
>> > function of sql_base.cc by calculating the hash. Then we look the
>> > hash up using ha_index_read_map; if the returned value is zero we
>> > compare the two records, and if they match we throw an error. I am
>> > not sure where to place this code, in fill_record or later; or I can
>> > simply fill the hash field in fill_record and check for duplicates
>> > later on.
>>
>> MariaDB supports "virtual columns", see
>> https://mariadb.com/kb/en/mariadb/virtual-computed-columns/
>>
>> So, it might be enough to mark this hash column as virtual, and it will
>> be automatically calculated when necessary (in fill_record, etc).
>>
>> > Currently I am not sure how to hide columns from the user. Sir, can
>> > you suggest where to look?
>>
>> This would need a new flag per field, like "hidden". And in all commands
>> that use a list of columns (SHOW, INFORMATION_SCHEMA tables, SELECT *,
>> INSERT table VALUE (...), etc) - this column should be skipped.
>>
>> But don't worry, I don't think this will be a problem for you. You can,
>> of course, start from not hiding this column and implement this hidden
>> flag later.
>>
>> > But there is one problem: we can make a unique key by this approach
>> > but not a primary key, because the primary key is clustered and
>> > hashes can collide, so I think we can't use the hash field as the
>> > primary key.
>>
>> Of course.
>>
>> > To overcome this problem I have one idea: instead of storing just
>> > the hash, we can make the hash field 10 bytes long and in the last
>> > two bytes store a short int which tells how many times the hash is
>> > repeated; this can make the hash unique in case of a collision. And
>> > we are not doing extra computation, because we already retrieve all
>> > records with the same hash. What do you think of this idea? And
>> > there is one more problem: how to make it a foreign key.
>>
>> I would say, there is no need to bother. Let's just say that UNIQUE for
>> blobs is only a constraint, can not be used as a primary key, can not be
>> used as a foreign key. This is a reasonable limitation, I think :)
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Date: Mon, 18 Apr 2016 22:46:42 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 16, Sachin Setia wrote:
>> > Hi Sergei!
>> > As I already told you, I was building a prototype. It is somewhat
>> > complete, apart from one thing: comparing two field values. The
>> > difficulty is how to get the data length of a field from
>> > table->record[1]. I will try to solve it.
>>
>> Do you mean, a length of the blob value?
>> Try field->get_length()
>>
>> > One more thing: I found out how MySQL hides the field. For example,
>> > consider three fields: hash, data, data. The MySQL field pointer
>> > points at the second field, not at the hash field, and the hash
>> > field's ptr is stored in table->hash_field. Can we do something
>> > similar to store hash fields (an array of hashes in case of more
>> > than one unique)? But will adding a member variable cause problems?
>> > What do you think?
>>
>> Well, yes, that would work.
>> But I see a few other features where field hiding might be useful,
>> and in those cases this simple trick of moving the table->field pointer
>> will not work, unfortunately.
>>
>> So we might need something more complex. But, again, this is not the
>> most important part of this project, so don't start with it.
>>
>> In fact, I'm rather interested to know about the adaptive hashing that
>> you've mentioned. Can you tell me more about it, please?
>>
>> > diff --git a/sql/sql_base.cc b/sql/sql_base.cc
>> > index ac2162b..291a3e2 100644
>> > --- a/sql/sql_base.cc
>> > +++ b/sql/sql_base.cc
>> > @@ -8956,6 +8964,31 @@ fill_record(THD *thd, TABLE *table, Field **ptr, List<Item> &values,
>> >          goto err;
>> >        field->set_explicit_default(value);
>> >      }
>> > +    table->file->ha_index_init(0, 0);
>> > +    res= table->file->ha_index_read_map(table->record[1], hash_field->ptr,
>> > +                                        HA_WHOLE_KEY, HA_READ_KEY_EXACT);
>> > +    while (!res)
>> > +    {
>> > +      // compare the record; not sure how to compare it, so just
>> > +      // assume it works
>> > +      diff= table->record[1] - table->record[0];
>> > +      src_length= blob_field->data_length();
>> > +      // dest_length= blob_field->data_length(table->record[1]);
>> > +      // I don't know how to get the length from record[1], so I am
>> > +      // unable to do this; then we could compare the records using
>> > +      // field->cmp_max. This is the MySQL code:
>> > +      /* if (!(table->distinct ?
>> > +             table_rec_cmp(table) :
>> > +             group_rec_cmp(table->group, table->record[0],
>> > +                           table->record[1])))
>> > +        return false; // skip it
>> > +      res= table->file->ha_index_next_same(table->record[1],
>> > +                                           table->hash_field->ptr,
>> > +                                           sizeof(hash)); */
>> > +      // fetch the next record
>> > +      res= table->file->ha_index_next_same(table->record[1],
>> > +                                           hash_field->ptr,
>> > +                                           8);
>> > +    }
>>
>> Not quite: you shouldn't check unique constraints in fill_record(). The
>> check should happen when the row is actually inserted or updated. A
>> good place for this check, I'd say, is handler::ha_write_row() and
>> handler::ha_update_row().
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>> ---------- Forwarded message ----------
>> From: Sergei Golubchik <serg@xxxxxxxxxxx>
>> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
>> Date: Mon, 18 Apr 2016 22:51:57 +0200
>> Subject: Re: [Maria-developers] InnoDB blob for primary key
>> Hi, Sachin!
>>
>> On Apr 18, Sachin Setia wrote:
>> > Hi, actually I was thinking about how to implement a blob as a
>> > foreign key. A foreign key has to be unique, which we can already
>> > implement. To make it a foreign key we can either store the unique
>> > hash or the whole blob column. But I am not sure many people want to
>> > copy such long blob data into the referencing table.
>>
>> Agree :)
>>
>> > The second option would be to use the blob hash as the reference
>> > key. But the user cannot directly use the hash as a reference key,
>> > because it is hidden. What I was thinking of is a way for the
>> > programmer to use the blob hash explicitly. Suppose the user can
>> > directly create a blob hash column and use that column as a primary
>> > key or foreign key, like:
>> > create table t1(abc blob, blob hash(abc)) // just creates a blob
>> > hash column
>> > create table t1(abc blob, unique(blob hash(abc))) // creates a
>> > unique blob hash column
>> > and similarly for primary key and foreign key. Users can enter the
>> > hash value themselves if they have a good algorithm, or if they do
>> > not give any value we will automatically compute and store the hash.
>> > What do you think, sir?
>>
>> Mixed feelings. First, I wrote in an earlier email about using virtual
>> columns for that. In this line of thoughts, a user-specified hash
>> function is absolutely possible and logical.
>>
>> On the other hand, I don't see why anyone would need that - a hash is
>> not guaranteed to be unique, no matter what algorithm one uses.
>>
>> And, as I wrote in an earlier email, I don't think that primary/foreign
>> keys on blobs is a feature worth spending time on.
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@xxxxxxxxxxx
>>
>>
>
Follow ups
References
-
InnoDB blob for primary key
From: Sachin Setia, 2016-04-11
-
Re: InnoDB blob for primary key
From: Jan Lindström, 2016-04-12
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-13
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-13
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-15
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-16
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-19