maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #09509
Re: InnoDB blob for primary key
Hello Sergei!
By adaptive hashing I refer to something like in innodb adaptive hashing
http://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash.html
On Tue, Apr 19, 2016 at 6:55 PM, sachin setiya <sachinsetia1001@xxxxxxxxx>
wrote:
> Hello
>
>
> ---------- Forwarded message ----------
> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> To: maria-developers@xxxxxxxxxxxxxxxxxxx
> Cc:
> Date: Mon, 11 Apr 2016 15:03:24 +0530
> Subject: InnoDB blob for primary key
> Hello Developers,
> Hi this is sachin.Actually i was currently experimenting with with
> blob uniques in innodb
> there is three main problems
> 1.Unique blob
> 2.Blob Forigen key
> 3.Blob primary key
>
> 1. For blob unique we can simply store hash in unclustered index
> 2. Blob Forigen key i am currently working on it
> 3. Blob primary key :- for this i thought we create a 4 byte column
> which stores the hash of blob primary key.Internally this column will work
> as
> primary key and key for clustered index. I already successufully tested
> this
> here is my output
>
> MariaDB [sachin]> create table t4 (abc blob primary key);
> Query OK, 0 rows affected (0.10 sec)
>
> MariaDB [sachin]> insert into t4 values('sachin');
> Query OK, 1 row affected (0.01 sec)
>
> MariaDB [sachin]> insert into t4 values('sachin');
> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
> MariaDB [sachin]> insert into t4 values('sachin setiya');
> Query OK, 1 row affected (0.00 sec)
>
> MariaDB [sachin]> insert into t4 values('sachin setiya');
> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>
> MariaDB [sachin]> desc t4;
> +-------+------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-------+------+------+-----+---------+-------+
> | abc | blob | NO | PRI | NULL | |
> +-------+------+------+-----+---------+-------+
> 1 row in set (0.01 sec)
>
> MariaDB [sachin]> select * from t4;
> +---------------+
> | abc |
> +---------------+
> | sachin |
> | sachin setiya |
> +---------------+
> 2 rows in set (0.01 sec)
>
> @Sergei hi! Actually i invested arround 2 months in mariadb So for me now
> it does not matter either i got selected in gsoc i want to do work in
> innodb blob unique from today.Please sir allow me to do this
>
> I am including the patch file and t4.ibd and t4.frm file
> Regards
> sachin
>
>
>
>
> ---------- Forwarded message ----------
> From: "Jan Lindström" <jan.lindstrom@xxxxxxxxxxx>
> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> Cc: "maria-developers@xxxxxxxxxxxxxxxxxxx" <
> maria-developers@xxxxxxxxxxxxxxxxxxx>
> Date: Tue, 12 Apr 2016 14:22:29 +0300
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi,
>
> Sachin, naturally you may continue, this is open source, please read
> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
> From InnoDB point of view there is issue if you add a new system generated
> column to the row. Firstly, existing tables will not have that column and
> secondly it will effect e.g maximum row size and external tools like backup.
>
> R: Jan
>
> On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
> wrote:
>
>> Hello Developers,
>> Hi this is sachin.Actually i was currently experimenting with with
>> blob uniques in innodb
>> there is three main problems
>> 1.Unique blob
>> 2.Blob Forigen key
>> 3.Blob primary key
>>
>> 1. For blob unique we can simply store hash in unclustered index
>> 2. Blob Forigen key i am currently working on it
>> 3. Blob primary key :- for this i thought we create a 4 byte column
>> which stores the hash of blob primary key.Internally this column will
>> work as
>> primary key and key for clustered index. I already successufully tested
>> this
>> here is my output
>>
>> MariaDB [sachin]> create table t4 (abc blob primary key);
>> Query OK, 0 rows affected (0.10 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> Query OK, 1 row affected (0.01 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> Query OK, 1 row affected (0.00 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>
>> MariaDB [sachin]> desc t4;
>> +-------+------+------+-----+---------+-------+
>> | Field | Type | Null | Key | Default | Extra |
>> +-------+------+------+-----+---------+-------+
>> | abc | blob | NO | PRI | NULL | |
>> +-------+------+------+-----+---------+-------+
>> 1 row in set (0.01 sec)
>>
>> MariaDB [sachin]> select * from t4;
>> +---------------+
>> | abc |
>> +---------------+
>> | sachin |
>> | sachin setiya |
>> +---------------+
>> 2 rows in set (0.01 sec)
>>
>> @Sergei hi! Actually i invested arround 2 months in mariadb So for me now
>> it does not matter either i got selected in gsoc i want to do work in
>> innodb blob unique from today.Please sir allow me to do this
>>
>> I am including the patch file and t4.ibd and t4.frm file
>> Regards
>> sachin
>>
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~maria-developers
>> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~maria-developers
>> More help : https://help.launchpad.net/ListHelp
>>
>>
>
>
> ---------- Forwarded message ----------
> From: Sergei Golubchik <vuvova@xxxxxxxxx>
> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> Cc:
> Date: Wed, 13 Apr 2016 00:04:22 +0200
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi, Sachin!
>
> Just a very quick comment now. We're having a company meeting
> (and a meetup -
> http://www.meetup.com/MariaDB-Developers-Berlin-Meetup/events/230026151/)
> in Berlin, I'll send you a longer reply when this is all over (in a
> couple of days).
>
> It might be an interesting option not to hack InnoDB, but to add
> this feature (long UNIQUE keys) on the upper level. That is, in sql/
> directory. Automatically add a new BIGINT column, but hide it from
> the user (not show it in SHOW COLUMNS and in SELECT *). Create
> an index for it (also hidden). Put the blob's hash into this column.
> And on collisions retrieve both rows and compare blobs. Just like MyISAM
> is doing, but not on the storage engine level, but in the
> engine-independent fashion, something that works for all engines.
>
> Want to look at this approach?
>
> MySQL 5.7 has something similar (long unique constraints on the sql
> layer), but they've done it only for internal temporary tables (like
> MyISAM's MI_UNIQUE is only used for internal temporary tables),
> and it's not directly available for the user. Still, it might give you
> some hinst about how to implement this feature on the sql layer.
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
>
> On Apr 11, Sachin Setia wrote:
> > Hello Developers,
> > Hi this is sachin.Actually i was currently experimenting with with
> > blob uniques in innodb
> > there is three main problems
> > 1.Unique blob
> > 2.Blob Forigen key
> > 3.Blob primary key
> >
> > 1. For blob unique we can simply store hash in unclustered index
> > 2. Blob Forigen key i am currently working on it
> > 3. Blob primary key :- for this i thought we create a 4 byte column
> > which stores the hash of blob primary key.Internally this column will
> work
> > as
> > primary key and key for clustered index. I already successufully tested
> > this
> > here is my output
> >
> > MariaDB [sachin]> create table t4 (abc blob primary key);
> > Query OK, 0 rows affected (0.10 sec)
> >
> > MariaDB [sachin]> insert into t4 values('sachin');
> > Query OK, 1 row affected (0.01 sec)
> >
> > MariaDB [sachin]> insert into t4 values('sachin');
> > ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
> > MariaDB [sachin]> insert into t4 values('sachin setiya');
> > Query OK, 1 row affected (0.00 sec)
> >
> > MariaDB [sachin]> insert into t4 values('sachin setiya');
> > ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
> >
> > MariaDB [sachin]> desc t4;
> > +-------+------+------+-----+---------+-------+
> > | Field | Type | Null | Key | Default | Extra |
> > +-------+------+------+-----+---------+-------+
> > | abc | blob | NO | PRI | NULL | |
> > +-------+------+------+-----+---------+-------+
> > 1 row in set (0.01 sec)
> >
> > MariaDB [sachin]> select * from t4;
> > +---------------+
> > | abc |
> > +---------------+
> > | sachin |
> > | sachin setiya |
> > +---------------+
> > 2 rows in set (0.01 sec)
> >
> > @Sergei hi! Actually i invested arround 2 months in mariadb So for me now
> > it does not matter either i got selected in gsoc i want to do work in
> > innodb blob unique from today.Please sir allow me to do this
> >
> > I am including the patch file and t4.ibd and t4.frm file
> > Regards
> > sachin
>
> > diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> > index dfce503..efd6f22 100644
> > --- a/sql/sql_table.cc
> > +++ b/sql/sql_table.cc
> > @@ -3244,7 +3244,7 @@ mysql_prepare_create_table(THD *thd,
> HA_CREATE_INFO *create_info,
> > !(sql_field->charset= find_bin_collation(sql_field->charset)))
> > DBUG_RETURN(TRUE);
> >
> > - /*
> > + /*
> > Convert the default value from client character
> > set into the column character set if necessary.
> > */
> > @@ -3874,11 +3874,11 @@ mysql_prepare_create_table(THD *thd,
> HA_CREATE_INFO *create_info,
> > if (f_is_geom(sql_field->pack_flag) && sql_field->geom_type ==
> > Field::GEOM_POINT)
> > column->length= MAX_LEN_GEOM_POINT_FIELD;
> > - if (!column->length)
> > - {
> > - my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0),
> column->field_name.str);
> > - DBUG_RETURN(TRUE);
> > - }
> > +// if (!column->length) //change
> > +// {
> > +// my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0),
> column->field_name.str);
> > +// DBUG_RETURN(TRUE);
> > +// }
> > }
> > #ifdef HAVE_SPATIAL
> > if (key->type == Key::SPATIAL)
> > @@ -3992,9 +3992,9 @@ mysql_prepare_create_table(THD *thd,
> HA_CREATE_INFO *create_info,
> > }
> > else if (key_part_length == 0 && (sql_field->flags &
> NOT_NULL_FLAG))
> > {
> > - my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
> > - column->field_name.str);
> > - DBUG_RETURN(TRUE);
> > +// my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
> > +// column->field_name.str);
> > +// DBUG_RETURN(TRUE);
> > }
> > if (key_part_length > file->max_key_part_length() &&
> > key->type != Key::FULLTEXT)
> > diff --git a/storage/innobase/dict/dict0dict.cc
> b/storage/innobase/dict/dict0dict.cc
> > index c51deb0..f6e94af 100644
> > --- a/storage/innobase/dict/dict0dict.cc
> > +++ b/storage/innobase/dict/dict0dict.cc
> > @@ -1315,7 +1315,7 @@ dict_table_add_to_cache(
> > #define BIG_ROW_SIZE 1024
> >
> > ut_ad(mutex_own(&(dict_sys->mutex)));
> > -
> > +
> > dict_table_add_system_columns(table, heap);
> >
> > table->cached = TRUE;
> > diff --git a/storage/xtradb/dict/dict0dict.cc
> b/storage/xtradb/dict/dict0dict.cc
> > index 206038d..d8223d7 100644
> > --- a/storage/xtradb/dict/dict0dict.cc
> > +++ b/storage/xtradb/dict/dict0dict.cc
> > @@ -1251,7 +1251,7 @@ dict_table_open_on_name(
> > Adds system columns to a table object. */
> > UNIV_INTERN
> > void
> > -dict_table_add_system_columns(
> > +dict_table_add_system_columns( //work
> > /*==========================*/
> > dict_table_t* table, /*!< in/out: table */
> > mem_heap_t* heap) /*!< in: temporary heap */
> > @@ -1266,16 +1266,25 @@ dict_table_add_system_columns(
> > etc.) and as the last columns of the table memory object.
> > The clustered index will not always physically contain all
> > system columns. */
> > -
> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
> > +// ,DATA_NOT_NULL,DATA_BLOB_HASH_LEN);
> > dict_mem_table_add_col(table, heap, "DB_ROW_ID", DATA_SYS,
> > DATA_ROW_ID | DATA_NOT_NULL,
> > DATA_ROW_ID_LEN);
> > #if DATA_ROW_ID != 0
> > #error "DATA_ROW_ID != 0"
> > #endif
> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_SYS,
> > +//
> DATA_BLOB_HASH|DATA_NOT_NULL,
> > +//
> DATA_BLOB_HASH_LEN);
> > + //just a rough trick to get it working
> > +
> > +// if(*table->name=='y'){
> > dict_mem_table_add_col(table, heap, "DB_TRX_ID", DATA_SYS,
> > DATA_TRX_ID | DATA_NOT_NULL,
> > DATA_TRX_ID_LEN);
> > +// }
> > +
> > #if DATA_TRX_ID != 1
> > #error "DATA_TRX_ID != 1"
> > #endif
> > @@ -1310,7 +1319,11 @@ dict_table_add_to_cache(
> > ulint row_len;
> >
> > ut_ad(dict_lru_validate());
> > -
> > +// bool x =false;//break
> > +// if(x){
> > +// dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
> > +//
> ,DATA_NOT_NULL,4);
> > +// }
> > /* The lower limit for what we consider a "big" row */
> > #define BIG_ROW_SIZE 1024
> >
> > @@ -3075,7 +3088,7 @@ dict_index_build_internal_clust(
> > /* Copy the fields of index */
> > dict_index_copy(new_index, index, table, 0, index->n_fields);
> >
> > - if (dict_index_is_univ(index)) {
> > + if (dict_index_is_univ(index)) { //work
> > /* No fixed number of fields determines an entry uniquely
> */
> >
> > new_index->n_uniq = REC_MAX_N_FIELDS;
> > @@ -3124,7 +3137,7 @@ dict_index_build_internal_clust(
> > DATA_ROLL_PTR),
> > 0);
> >
> > - for (i = 0; i < trx_id_pos; i++) {
> > + for (i = 0; i < trx_id_pos; i++) {//work i think i need to
> do some stuff
> >
> > ulint fixed_size = dict_col_get_fixed_size(
> > dict_index_get_nth_col(new_index, i),
> > diff --git a/storage/xtradb/dict/dict0load.cc
> b/storage/xtradb/dict/dict0load.cc
> > index d6ed8ac..8cc59f8 100644
> > --- a/storage/xtradb/dict/dict0load.cc
> > +++ b/storage/xtradb/dict/dict0load.cc
> > @@ -2276,7 +2276,7 @@ dictionary cache.
> > ibd_file_missing flag TRUE in the table object we return */
> > UNIV_INTERN
> > dict_table_t*
> > -dict_load_table(
> > +dict_load_table( //work for corruped table
> > /*============*/
> > const char* name, /*!< in: table name in the
> > databasename/tablename format */
> > @@ -2337,7 +2337,7 @@ dict_load_table(
> > btr_pcur_close(&pcur);
> > mtr_commit(&mtr);
> > mem_heap_free(heap);
> > -
> > +
> > return(NULL);
> > }
> >
> > diff --git a/storage/xtradb/dict/dict0mem.cc
> b/storage/xtradb/dict/dict0mem.cc
> > index a4f6cd6..6cbe556 100644
> > --- a/storage/xtradb/dict/dict0mem.cc
> > +++ b/storage/xtradb/dict/dict0mem.cc
> > @@ -101,11 +101,11 @@ dict_mem_table_create(
> > memcpy(table->name, name, strlen(name) + 1);
> > table->is_system_db = dict_mem_table_is_system(table->name);
> > table->space = (unsigned int) space;
> > - table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS);
> > -
> > + table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS); //work
> > +
> > table->cols = static_cast<dict_col_t*>(
> > mem_heap_alloc(heap,
> > - (n_cols + DATA_N_SYS_COLS)
> > + (n_cols + DATA_N_SYS_COLS)//work
> > * sizeof(dict_col_t)));
> >
> > ut_d(table->magic_n = DICT_TABLE_MAGIC_N);
> > diff --git a/storage/xtradb/handler/ha_innodb.cc
> b/storage/xtradb/handler/ha_innodb.cc
> > index e5edf76..3a8dc91 100644
> > --- a/storage/xtradb/handler/ha_innodb.cc
> > +++ b/storage/xtradb/handler/ha_innodb.cc
> > @@ -5983,7 +5983,7 @@ ha_innobase::open(
> > ibool par_case_name_set = FALSE;
> > char par_case_name[FN_REFLEN];
> > dict_err_ignore_t ignore_err = DICT_ERR_IGNORE_NONE;
> > -
> > + ibool is_blob_primary_key=false;
> > DBUG_ENTER("ha_innobase::open");
> >
> > UT_NOT_USED(mode);
> > @@ -6031,13 +6031,20 @@ ha_innobase::open(
> >
> > /* Get pointer to a table object in InnoDB dictionary cache */
> > ib_table = dict_table_open_on_name(norm_name, FALSE, TRUE,
> ignore_err);
> > -
> > + //int number_of_columns = dict_table_get_n_user_cols(ib_table);
> > + //if(ib_table->)
> > + if(ib_table){
> > + int number_of_columns =
> dict_table_get_n_user_cols(ib_table);
> > +
> if(!innobase_strcasecmp(ib_table->col_names,"DB_BLOB_HASH")){
> > + is_blob_primary_key=true;
> > + number_of_columns--;//stodo i think we need to add
> flag for blob primary key to make checking easier
> > + }
> > if (ib_table
> > - && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID)
> > - && table->s->stored_fields !=
> dict_table_get_n_user_cols(ib_table))
> > + && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID))
> > + && table->s->stored_fields != number_of_columns) //work
> > || (DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID)
> > && (table->s->fields
> > - != dict_table_get_n_user_cols(ib_table) - 1)))) {
> > + != dict_table_get_n_user_cols(ib_table) - 1))) {
> > ib_logf(IB_LOG_LEVEL_WARN,
> > "table %s contains %lu user defined columns "
> > "in InnoDB, but %lu columns in MySQL. Please "
> > @@ -6054,7 +6061,7 @@ ha_innobase::open(
> > ib_table = NULL;
> > is_part = NULL;
> > }
> > -
> > + }
> > if (UNIV_UNLIKELY(ib_table && ib_table->is_corrupt &&
> > srv_pass_corrupt_table <= 1)) {
> > free_share(share);
> > @@ -6254,12 +6261,12 @@ ha_innobase::open(
> > /* Looks like MySQL-3.23 sometimes has primary key number != 0 */
> > primary_key = table->s->primary_key;
> > key_used_on_scan = primary_key;
> > -
> > + if(!is_blob_primary_key){
> > if (!innobase_build_index_translation(table, ib_table, share)) {
> > sql_print_error("Build InnoDB index translation table
> for"
> > " Table %s failed", name);
> > }
> > -
> > + }
> > /* Allocate a buffer for a 'row reference'. A row reference is
> > a string of bytes of length ref_length which uniquely specifies
> > a row in our table. Note that MySQL may also compare two row
> > @@ -6314,7 +6321,11 @@ ha_innobase::open(
> > for (uint i = 0; i < table->s->keys; i++) {
> > dict_index_t* index;
> > index = innobase_get_index(i);
> > - if (dict_index_is_clust(index)) {
> > + if (dict_index_is_clust(index)) { //work
> > + if(is_blob_primary_key){
> > + ref_length=4; //hash length
> > + continue;
> > + }
> > ref_length =
> >
> table->key_info[i].key_length;
> > }
> > @@ -7795,7 +7806,8 @@ build_template_field(
> >
> > //ut_ad(field == table->field[i]);
> > ut_ad(clust_index->table == index->table);
> > -
> > +//work here we go
> > +//todo it should get the next column defs
> > col = dict_table_get_nth_col(index->table, i);
> >
> > templ = prebuilt->mysql_template + prebuilt->n_template++;
> > @@ -8138,7 +8150,8 @@ ha_innobase::build_template(
> > continue;
> > }
> > }
> > -
> > +
> if(!innobase_strcasecmp(clust_index->fields->name,"DB_BLOB_HASH"))
> > + i++;
> > build_template_field(prebuilt, clust_index, index,
> > table, field, i);
> > }
> > @@ -11065,6 +11078,7 @@ create_table_def(
> > ulint doc_id_col = 0;
> > ibool has_doc_id_col = FALSE;
> > mem_heap_t* heap;
> > + bool is_blob_primary_key=false;
> >
> > DBUG_ENTER("create_table_def");
> > DBUG_PRINT("enter", ("table_name: %s", table_name));
> > @@ -11126,6 +11140,12 @@ create_table_def(
> > table->fts->doc_col = doc_id_col;
> > }
> > } else {
> > + if(form->key_info[0].key_part->length==0 && //change
> > + form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
> > + s_cols++;
> > + is_blob_primary_key=true;
> > + }
> > +
> > table = dict_mem_table_create(table_name, 0, s_cols,
> > flags, flags2);
> > }
> > @@ -11143,6 +11163,13 @@ create_table_def(
> > table->data_dir_path = NULL;
> > }
> > heap = mem_heap_create(1000);
> > + //work
> > + //add one more column for hash
> > +
> > + if(is_blob_primary_key){
> > + dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",
> > + DATA_INT,1283,4);
> > + }
> >
> > for (i = 0; i < n_cols; i++) {
> > Field* field = form->field[i];
> > @@ -11222,7 +11249,7 @@ create_table_def(
> > err = DB_ERROR;
> > goto error_ret;
> > }
> > -
> > +
> > dict_mem_table_add_col(table, heap,
> > field->field_name,
> > col_type,
> > @@ -11347,7 +11374,7 @@ create_index(
> > the length of the key part versus the column. */
> >
> > Field* field = NULL;
> > -
> > +//work
> > for (ulint j = 0; j < form->s->fields; j++) {
> >
> > field = form->field[j];
> > @@ -11396,7 +11423,12 @@ create_index(
> > }
> >
> > field_lengths[i] = key_part->length;
> > -
> > + if(form->key_info[0].key_part->length==0 && //change
> > + form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
> > + dict_mem_index_add_field(
> > + index,"DB_BLOB_HASH", 0);
> > + continue;
> > + }
> > dict_mem_index_add_field(
> > index, key_part->field->field_name, prefix_len);
> > }
> > diff --git a/storage/xtradb/include/data0type.h
> b/storage/xtradb/include/data0type.h
> > index 111664b..0f510e8 100644
> > --- a/storage/xtradb/include/data0type.h
> > +++ b/storage/xtradb/include/data0type.h
> > @@ -147,6 +147,9 @@ be less than 256 */
> > #define DATA_ROLL_PTR 2 /* rollback data pointer: 7 bytes
> */
> > #define DATA_ROLL_PTR_LEN 7
> >
> > +//#define DATA_BLOB_HASH 3 // hash coloumn for blob primay key
> > +//#define DATA_BLOB_HASH_LEN 4 //used as a clustered index
> > +
> > #define DATA_N_SYS_COLS 3 /* number of system columns
> defined above */
> >
> > #define DATA_FTS_DOC_ID 3 /* Used as FTS DOC ID column */
> > diff --git a/storage/xtradb/que/que0que.cc
> b/storage/xtradb/que/que0que.cc
> > index e2dc023..076785d 100644
> > --- a/storage/xtradb/que/que0que.cc
> > +++ b/storage/xtradb/que/que0que.cc
> > @@ -1030,7 +1030,7 @@ que_thr_step(
> > que_node_print_info(node);
> > }
> > #endif
> > - if (type & QUE_NODE_CONTROL_STAT) {
> > + if (type & QUE_NODE_CONTROL_STAT) { //improve
> > if ((thr->prev_node != que_node_get_parent(node))
> > && que_node_get_next(thr->prev_node)) {
> >
> > diff --git a/storage/xtradb/row/row0ins.cc
> b/storage/xtradb/row/row0ins.cc
> > index d02e179..c58d7fb 100644
> > --- a/storage/xtradb/row/row0ins.cc
> > +++ b/storage/xtradb/row/row0ins.cc
> > @@ -137,7 +137,7 @@ row_ins_alloc_sys_fields(
> > /*=====================*/
> > ins_node_t* node) /*!< in: insert node */
> > {
> > - dtuple_t* row;
> > + dtuple_t* row; //work
> > dict_table_t* table;
> > mem_heap_t* heap;
> > const dict_col_t* col;
> > @@ -3118,7 +3118,7 @@ row_ins_index_entry_set_vals(
> >
> > n_fields = dtuple_get_n_fields(entry);
> >
> > - for (i = 0; i < n_fields; i++) {
> > + for (i = 0; i < n_fields; i++) {//see
> > dict_field_t* ind_field;
> > dfield_t* field;
> > const dfield_t* row_field;
> > @@ -3169,8 +3169,8 @@ row_ins_index_entry_step(
> >
> > ut_ad(dtuple_check_typed(node->row));
> >
> > - row_ins_index_entry_set_vals(node->index, node->entry, node->row);
> > -
> > + row_ins_index_entry_set_vals(node->index, node->entry,
> node->row);//explore
> > +//need to see who sets the entry in row
> > ut_ad(dtuple_check_typed(node->entry));
> >
> > err = row_ins_index_entry(node->index, node->entry, thr);
> > @@ -3208,7 +3208,7 @@ row_ins_alloc_row_id_step(
> >
> > /* Fill in row id value to row */
> >
> > - row_id = dict_sys_get_new_row_id();
> > + row_id = dict_sys_get_new_row_id(); //work
> >
> > dict_sys_write_row_id(node->row_id_buf, row_id);
> > }
> > diff --git a/storage/xtradb/row/row0mysql.cc
> b/storage/xtradb/row/row0mysql.cc
> > index 9427b20..b28e778 100644
> > --- a/storage/xtradb/row/row0mysql.cc
> > +++ b/storage/xtradb/row/row0mysql.cc
> > @@ -510,7 +510,15 @@ row_mysql_store_col_in_innobase_format(
> >
> > return(buf);
> > }
> > -
> > +//this is just dummy function
> > +int dummuy_hash(byte * ptr,int length){
> > + int val=0;
> > + for(int i=0;i<length;i++){
> > + val =val*8+(int )*ptr;
> > + ptr++;
> > + }
> > + return val;
> > +}
> > /**************************************************************//**
> > Convert a row in the MySQL format to a row in the Innobase format. Note
> that
> > the function to convert a MySQL format key value to an InnoDB dtuple is
> > @@ -532,15 +540,22 @@ row_mysql_convert_row_to_innobase(
> > const mysql_row_templ_t*templ;
> > dfield_t* dfield;
> > ulint i;
> > -
> > + ibool is_blob_primary_key=false;
> > + ibool is_first_template=true;
> > ut_ad(prebuilt->template_type == ROW_MYSQL_WHOLE_ROW);
> > - ut_ad(prebuilt->mysql_template);
> > -
> > - for (i = 0; i < prebuilt->n_template; i++) {
> > + ut_ad(prebuilt->mysql_template);//work
> > +
> if(!innobase_strcasecmp(prebuilt->table->col_names,"DB_BLOB_HASH")){
> > + is_blob_primary_key =true;
> > + }
> > + for (i = 0; i < prebuilt->n_template; i++) { //change add one
> >
> > templ = prebuilt->mysql_template + i;
> > + if(is_blob_primary_key&&is_first_template){
> > + dfield = dtuple_get_nth_field(row, i+1);
> > + is_first_template=false;
> > + }else{
> > dfield = dtuple_get_nth_field(row, i);
> > -
> > + }
> > if (templ->mysql_null_bit_mask != 0) {
> > /* Column may be SQL NULL */
> >
> > @@ -565,6 +580,17 @@ row_mysql_convert_row_to_innobase(
> > next_column:
> > ;
> > }
> > + if(is_blob_primary_key){
> > + //get the first field and set the hash
> > + dfield_t * hash_field=dtuple_get_nth_field(row,0);
> > + dfield_t * blob_field=dtuple_get_nth_field(row,1);
> > + byte * hash = static_cast<byte
> *>(mem_heap_zalloc(prebuilt->heap,4));
> > + mach_write_to_4(hash,dummuy_hash((byte
> *)blob_field->data,blob_field->len));
> > + //4 is hash length
> > + hash_field->len=4;
> > + hash_field->data=hash;
> > + //hash_field->type
> > + }
> >
> > /* If there is a FTS doc id column and it is not user supplied (
> > generated by server) then assign it a new doc id. */
> > @@ -831,7 +857,7 @@ row_create_prebuilt(
> >
> > prebuilt->search_tuple = dtuple_create(heap,
> search_tuple_n_fields);
> >
> > - ref = dtuple_create(heap, ref_len);
> > + ref = dtuple_create(heap, ref_len);//work
> >
> > dict_index_copy_types(ref, clust_index, ref_len);
> >
> > @@ -1057,7 +1083,7 @@ row_get_prebuilt_insert_row(
> >
> > dict_table_copy_types(row, table);
> >
> > - ins_node_set_new_row(node, row);
> > + ins_node_set_new_row(node, row);//explore
> >
> > prebuilt->ins_graph = static_cast<que_fork_t*>(
> > que_node_get_parent(
> > @@ -1353,7 +1379,7 @@ row_insert_for_mysql(
> > row_get_prebuilt_insert_row(prebuilt);
> > node = prebuilt->ins_node;
> >
> > - row_mysql_convert_row_to_innobase(node->row, prebuilt, mysql_rec);
> > + row_mysql_convert_row_to_innobase(node->row, prebuilt,
> mysql_rec);//debug
> >
> > savept = trx_savept_take(trx);
> >
> > diff --git a/storage/xtradb/row/row0row.cc
> b/storage/xtradb/row/row0row.cc
> > index be786f9..a1dd465 100644
> > --- a/storage/xtradb/row/row0row.cc
> > +++ b/storage/xtradb/row/row0row.cc
> > @@ -83,7 +83,7 @@ row_build_index_entry_low(
> > entry, dict_index_get_n_unique_in_tree(index));
> > }
> >
> > - for (i = 0; i < entry_len; i++) {
> > + for (i = 0; i < entry_len; i++) { //explore need to see how it
> works for simple
> > const dict_field_t* ind_field
> > = dict_index_get_nth_field(index, i);
> > const dict_col_t* col
>
>
>
> ---------- Forwarded message ----------
> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> To: maria-developers@xxxxxxxxxxxxxxxxxxx
> Cc:
> Date: Wed, 13 Apr 2016 21:17:03 +0530
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hello Jan!
>
> Actually I was going through the problems raised by you
>
> 1>Firstly, existing tables will not have that column
>
> Yes Sir, you are right that existing table woudn't have DB_BLOB_HASH
> column, but this is fine because they indexed there blob column then
> they must have provided blob length, and if so then length of key_part
> would not be zero and so need to make changes in patch to detect this.
> And also by doing I can also eliminate another problem what if existing
> table have a coloumn named 'DB_BLOB_HASH'
>
> 2>secondly it will effect e.g maximum row size
> For this i can increase reclength by 4 in pack_header function(but i
> am not sure whether this will work or not)
>
> 3>and external tools like backup
> Here, I am assuming that by backup you mean mysqldump actually i tried
> this by create a table with blob coloum (length was given ) and than
> changing mariadb binary(to changed code) and .sql file of backup was
> working fine
>
> But I think there are more problems with this this prototype
> 1>Existing table with coloumn named 'DB_BLOB_HASH' .We can solve this
> as written above but some code in patch compares first coloumn name
> with 'DB_BLOB_HASH' so this will create problem Sir, can we add a flag
> in dict_table_t->flag so that we can reliable check for blob hash coloumn
>
> 2>This is very bad problem , what happen if if data is different and crc
> hash of data is same.And this will occur when we have medium amount of
> data like 100 thousand records.To avoid this we can do like
>
> 2.1> use crc 64 reduces probability
> 2.2> use two different algorithm for hash and treat these two algoriths
> hash for uniquesness.Make collusion probability very very less
>
> But the main problem is what would happen if collusion occurs to solve
> this i have two ideas
> 2.3> like 2.2 but instead of using two algorithm use one and add column
> for number of times this hash is repeated collusion frequency will be zero
> i think this will work fine but need to test it
> 2.4> whole prototype is crap start again with some different method. Can
> you
> please suggest some differnt way of doing this
> Regards
> sachin
>
> On Tue, Apr 12, 2016 at 4:52 PM, Jan Lindström
> <jan.lindstrom@xxxxxxxxxxx> wrote:
> > Hi,
> >
> > Sachin, naturally you may continue, this is open source, please read
> >
> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
> > From InnoDB point of view there is issue if you add a new system
> generated
> > column to the row. Firstly, existing tables will not have that column and
> > secondly it will effect e.g maximum row size and external tools like
> backup.
> >
> > R: Jan
> >
> > On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <
> sachinsetia1001@xxxxxxxxx>
> > wrote:
> >>
> >> Hello Developers,
> >> Hi this is sachin.Actually i was currently experimenting with with
> >> blob uniques in innodb
> >> there is three main problems
> >> 1.Unique blob
> >> 2.Blob Forigen key
> >> 3.Blob primary key
> >>
> >> 1. For blob unique we can simply store hash in unclustered index
> >> 2. Blob Forigen key i am currently working on it
> >> 3. Blob primary key :- for this i thought we create a 4 byte column
> >> which stores the hash of blob primary key.Internally this column will
> work
> >> as
> >> primary key and key for clustered index. I already successufully tested
> >> this
> >> here is my output
> >>
> >> MariaDB [sachin]> create table t4 (abc blob primary key);
> >> Query OK, 0 rows affected (0.10 sec)
> >>
> >> MariaDB [sachin]> insert into t4 values('sachin');
> >> Query OK, 1 row affected (0.01 sec)
> >>
> >> MariaDB [sachin]> insert into t4 values('sachin');
> >> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
> >> MariaDB [sachin]> insert into t4 values('sachin setiya');
> >> Query OK, 1 row affected (0.00 sec)
> >>
> >> MariaDB [sachin]> insert into t4 values('sachin setiya');
> >> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
> >>
> >> MariaDB [sachin]> desc t4;
> >> +-------+------+------+-----+---------+-------+
> >> | Field | Type | Null | Key | Default | Extra |
> >> +-------+------+------+-----+---------+-------+
> >> | abc | blob | NO | PRI | NULL | |
> >> +-------+------+------+-----+---------+-------+
> >> 1 row in set (0.01 sec)
> >>
> >> MariaDB [sachin]> select * from t4;
> >> +---------------+
> >> | abc |
> >> +---------------+
> >> | sachin |
> >> | sachin setiya |
> >> +---------------+
> >> 2 rows in set (0.01 sec)
> >>
> >> @Sergei hi! Actually i invested arround 2 months in mariadb So for me
> now
> >> it does not matter either i got selected in gsoc i want to do work in
> >> innodb blob unique from today.Please sir allow me to do this
> >>
> >> I am including the patch file and t4.ibd and t4.frm file
> >> Regards
> >> sachin
> >>
> >>
> >>
> >> _______________________________________________
> >> Mailing list: https://launchpad.net/~maria-developers
> >> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
> >> Unsubscribe : https://launchpad.net/~maria-developers
> >> More help : https://help.launchpad.net/ListHelp
> >>
> >
>
>
> ---------- Forwarded message ----------
> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> To: maria-developers@xxxxxxxxxxxxxxxxxxx
> Cc:
> Date: Wed, 13 Apr 2016 22:19:44 +0530
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hello Sergei
> Sorry I did not see your mail. Actually i was thinking something like
> this before implementing the prototype but if i am more closer to
> innodb the more performance i will i get. I will definitively think
> about it.
> Regards
> sachin
>
> On Wed, Apr 13, 2016 at 9:17 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
> wrote:
> > Hello Jan!
> >
> > Actually I was going through the problems raised by you
> >
> > 1>Firstly, existing tables will not have that column
> >
> > Yes Sir, you are right that existing table woudn't have DB_BLOB_HASH
> > column, but this is fine because they indexed there blob column then
> > they must have provided blob length, and if so then length of key_part
> > would not be zero and so need to make changes in patch to detect this.
> > And also by doing I can also eliminate another problem what if existing
> > table have a coloumn named 'DB_BLOB_HASH'
> >
> > 2>secondly it will effect e.g maximum row size
> > For this i can increase reclength by 4 in pack_header function(but i
> > am not sure whether this will work or not)
> >
> > 3>and external tools like backup
> > Here, I am assuming that by backup you mean mysqldump actually i tried
> > this by create a table with blob coloum (length was given ) and than
> > changing mariadb binary(to changed code) and .sql file of backup was
> > working fine
> >
> > But I think there are more problems with this this prototype
> > 1>Existing table with coloumn named 'DB_BLOB_HASH' .We can solve this
> > as written above but some code in patch compares first coloumn name
> > with 'DB_BLOB_HASH' so this will create problem Sir, can we add a flag
> > in dict_table_t->flag so that we can reliable check for blob hash coloumn
> >
> > 2>This is very bad problem , what happen if if data is different and crc
> > hash of data is same.And this will occur when we have medium amount of
> > data like 100 thousand records.To avoid this we can do like
> >
> > 2.1> use crc 64 reduces probability
> > 2.2> use two different algorithm for hash and treat these two algoriths
> > hash for uniquesness.Make collusion probability very very less
> >
> > But the main problem is what would happen if collusion occurs to solve
> > this i have two ideas
> > 2.3> like 2.2 but instead of using two algorithm use one and add column
> > for number of times this hash is repeated collusion frequency will be
> zero
> > i think this will work fine but need to test it
> > 2.4> whole prototype is crap start again with some different method. Can
> you
> > please suggest some differnt way of doing this
> > Regards
> > sachin
> >
> > On Tue, Apr 12, 2016 at 4:52 PM, Jan Lindström
> > <jan.lindstrom@xxxxxxxxxxx> wrote:
> >> Hi,
> >>
> >> Sachin, naturally you may continue, this is open source, please read
> >>
> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
> >> From InnoDB point of view there is issue if you add a new system
> generated
> >> column to the row. Firstly, existing tables will not have that column
> and
> >> secondly it will effect e.g maximum row size and external tools like
> backup.
> >>
> >> R: Jan
> >>
> >> On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <
> sachinsetia1001@xxxxxxxxx>
> >> wrote:
> >>>
> >>> Hello Developers,
> >>> Hi this is sachin.Actually i was currently experimenting with with
> >>> blob uniques in innodb
> >>> there is three main problems
> >>> 1.Unique blob
> >>> 2.Blob Forigen key
> >>> 3.Blob primary key
> >>>
> >>> 1. For blob unique we can simply store hash in unclustered index
> >>> 2. Blob Forigen key i am currently working on it
> >>> 3. Blob primary key :- for this i thought we create a 4 byte column
> >>> which stores the hash of blob primary key.Internally this column will
> work
> >>> as
> >>> primary key and key for clustered index. I already successufully tested
> >>> this
> >>> here is my output
> >>>
> >>> MariaDB [sachin]> create table t4 (abc blob primary key);
> >>> Query OK, 0 rows affected (0.10 sec)
> >>>
> >>> MariaDB [sachin]> insert into t4 values('sachin');
> >>> Query OK, 1 row affected (0.01 sec)
> >>>
> >>> MariaDB [sachin]> insert into t4 values('sachin');
> >>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
> >>> MariaDB [sachin]> insert into t4 values('sachin setiya');
> >>> Query OK, 1 row affected (0.00 sec)
> >>>
> >>> MariaDB [sachin]> insert into t4 values('sachin setiya');
> >>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
> >>>
> >>> MariaDB [sachin]> desc t4;
> >>> +-------+------+------+-----+---------+-------+
> >>> | Field | Type | Null | Key | Default | Extra |
> >>> +-------+------+------+-----+---------+-------+
> >>> | abc | blob | NO | PRI | NULL | |
> >>> +-------+------+------+-----+---------+-------+
> >>> 1 row in set (0.01 sec)
> >>>
> >>> MariaDB [sachin]> select * from t4;
> >>> +---------------+
> >>> | abc |
> >>> +---------------+
> >>> | sachin |
> >>> | sachin setiya |
> >>> +---------------+
> >>> 2 rows in set (0.01 sec)
> >>>
> >>> @Sergei hi! Actually i invested arround 2 months in mariadb So for me
> now
> >>> it does not matter either i got selected in gsoc i want to do work in
> >>> innodb blob unique from today.Please sir allow me to do this
> >>>
> >>> I am including the patch file and t4.ibd and t4.frm file
> >>> Regards
> >>> sachin
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Mailing list: https://launchpad.net/~maria-developers
> >>> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
> >>> Unsubscribe : https://launchpad.net/~maria-developers
> >>> More help : https://help.launchpad.net/ListHelp
> >>>
> >>
>
>
> ---------- Forwarded message ----------
> From: Sergei Golubchik <serg@xxxxxxxxxxx>
> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> Cc:
> Date: Thu, 14 Apr 2016 20:53:51 +0200
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi, Sachin!
>
> On Apr 13, Sachin Setia wrote:
> > Hello Sergei
> > Sorry I did not see your mail. Actually i was thinking something like
> > this before implementing the prototype but if i am more closer to
> > innodb the more performance i will i get. I will definitively think
> > about it.
>
> Great!
>
> Could you please tell me (mailing list, that is) what you think before
> next Monday (before April 18h, that is)?
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
>
>
> ---------- Forwarded message ----------
> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> To: maria-developers@xxxxxxxxxxxxxxxxxxx
> Cc:
> Date: Fri, 15 Apr 2016 18:37:15 +0530
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi Sergei!
>
> Actually I was going through the mysql source code for unique long
> constraints
> in file sql_tmp_table.cc in function create_tmp_table they make a new field
> and a new key(hash_key) and pass this table obejct to storage
> engine.They actually
> refer this field as a hash field
> On the time of insert they call bool check_unique_constraint(TABLE
> *table) function
> which first calculate the hash and store it in field then they see for
> duplicate
> hash and retrive ha_index_next_same if records are not same then record
>
> We can do the same thing in mariadb by adding one more field and key in
> mysql_prepare_create_table in this we check for blob with unlimited
> length or varchar for length
> greater then internal storage engine by doing this in
> mysql_prepare_create_table there
> will be no issues of frm file inconsistance.
>
> In case of insert first we will fill the hash field in fill_record
> function of sql_base.cc
> by first calculating the hash. Then we will retrive the index map
> using ha_index_read_map
> if returened value is zero then we will comapare two records and if
> they match then we will through error
> I am not sure where to place this code either in fill_record or later
> Or i can simple just
> fill hash in field in fill_record and then check for duplicates later on.
>
> Current I am not sure how to hide columns from user.Sir, can you
> suggest me where to look
>
> But there is one problem we can make unique key by this approch but
> not primary key because primary key
> is clustered and hashes can collide so i think we can't use hash field
> as primary key. To overcome this problem
> I have one idea instead of storing just hash we can make hash field
> length 10 bytes and in last two bytes
> we can store short int which tells how much time hash is repeated this
> can make hash unique
> in case of collusion. And also we are not doing more computation
> because we already retrive
> all records with same hashes.
> What do you think of this idea?.
> And there is one more problem how to make it foreign key.
>
> Will send you a prototype code tomorrow.
> Regards
> sachin
>
> On Fri, Apr 15, 2016 at 12:23 AM, Sergei Golubchik <serg@xxxxxxxxxxx>
> wrote:
> > Hi, Sachin!
> >
> > On Apr 13, Sachin Setia wrote:
> >> Hello Sergei
> >> Sorry I did not see your mail. Actually i was thinking something like
> >> this before implementing the prototype but if i am more closer to
> >> innodb the more performance i will i get. I will definitively think
> >> about it.
> >
> > Great!
> >
> > Could you please tell me (mailing list, that is) what you think before
> > next Monday (before April 18h, that is)?
> >
> > Regards,
> > Sergei
> > Chief Architect MariaDB
> > and security@xxxxxxxxxxx
>
>
> ---------- Forwarded message ----------
> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> To: maria-developers@xxxxxxxxxxxxxxxxxxx
> Cc:
> Date: Sat, 16 Apr 2016 19:37:45 +0530
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi Sergei!
> As I already told you i was building prototype.It is some what completed
> apart from one thing comparing of two field values. the difficulty is how
> to get data length of field from table->record[1]. I will try to solve it.
> One more thing actually i got how mysql hide field. For example condsider
> three
> fields hash,data,data. mysql field pointer point at second field not at
> hash
> field and hash field ptr is stored in table->hash_field
> can we do something similar to store hash fields(if we make array of
> hashes in
> case of more than one unique).But will adding member variable cause
> problem?
> what do you think?
> Regards
> sachin
>
> On Fri, Apr 15, 2016 at 6:37 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
> wrote:
> > Hi Sergei!
> >
> > Actually I was going through the mysql source code for unique long
> constraints
> > in file sql_tmp_table.cc in function create_tmp_table they make a new
> field
> > and a new key(hash_key) and pass this table obejct to storage
> > engine.They actually
> > refer this field as a hash field
> > On the time of insert they call bool check_unique_constraint(TABLE
> > *table) function
> > which first calculate the hash and store it in field then they see for
> duplicate
> > hash and retrive ha_index_next_same if records are not same then record
> >
> > We can do the same thing in mariadb by adding one more field and key in
> > mysql_prepare_create_table in this we check for blob with unlimited
> > length or varchar for length
> > greater then internal storage engine by doing this in
> > mysql_prepare_create_table there
> > will be no issues of frm file inconsistance.
> >
> > In case of insert first we will fill the hash field in fill_record
> > function of sql_base.cc
> > by first calculating the hash. Then we will retrive the index map
> > using ha_index_read_map
> > if returened value is zero then we will comapare two records and if
> > they match then we will through error
> > I am not sure where to place this code either in fill_record or later
> > Or i can simple just
> > fill hash in field in fill_record and then check for duplicates later on.
> >
> > Current I am not sure how to hide columns from user.Sir, can you
> > suggest me where to look
> >
> > But there is one problem we can make unique key by this approch but
> > not primary key because primary key
> > is clustered and hashes can collide so i think we can't use hash field
> > as primary key. To overcome this problem
> > I have one idea instead of storing just hash we can make hash field
> > length 10 bytes and in last two bytes
> > we can store short int which tells how much time hash is repeated this
> > can make hash unique
> > in case of collusion. And also we are not doing more computation
> > because we already retrive
> > all records with same hashes.
> > What do you think of this idea?.
> > And there is one more problem how to make it foreign key.
> >
> > Will send you a prototype code tomorrow.
> > Regards
> > sachin
> >
> > On Fri, Apr 15, 2016 at 12:23 AM, Sergei Golubchik <serg@xxxxxxxxxxx>
> wrote:
> >> Hi, Sachin!
> >>
> >> On Apr 13, Sachin Setia wrote:
> >>> Hello Sergei
> >>> Sorry I did not see your mail. Actually i was thinking something like
> >>> this before implementing the prototype but if i am more closer to
> >>> innodb the more performance i will i get. I will definitively think
> >>> about it.
> >>
> >> Great!
> >>
> >> Could you please tell me (mailing list, that is) what you think before
> >> next Monday (before April 18h, that is)?
> >>
> >> Regards,
> >> Sergei
> >> Chief Architect MariaDB
> >> and security@xxxxxxxxxxx
>
>
> ---------- Forwarded message ----------
> From: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> To: maria-developers@xxxxxxxxxxxxxxxxxxx
> Cc:
> Date: Mon, 18 Apr 2016 20:12:39 +0530
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hello Sergei!
> Hi Actually I was thinking about how to implement
> blob as a foreign key.Foreign has to be unique which
> we can already implement. To make it foreign key
> we can either store unique hash or the whole blob
> column.
> But I am not sure much people
> want to copy so long blob data in reference table.
> Second option would be use blob hash as a reference key.
> But user can not directly us hash as a reference key
> because that is hidden.
> What I was thinking of a clear to programmer way of
> using blob hash. Suppose user can directly create
> blob hash column ,use that column as a primary key or
> foreign key.Like
> create table t1(abc blob , blob hash(abc))//this will just create blob
> hash column
> create table t1(abc blob,unique(blob hash(abc))) // this will create
> unique blob hash column
> and similar for primary key and foreign key
> user can enter hash value if they have some good algorithm
> or if they do not give any value we will automatically
> create and store hash. What do you think? sir.
> Regards
> sachin
>
>
>
> On Sat, Apr 16, 2016 at 7:37 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
> wrote:
>
>> Hi Sergei!
>> As I already told you i was building prototype.It is some what completed
>> apart from one thing comparing of two field values. the difficulty is how
>> to get data length of field from table->record[1]. I will try to solve
>> it.
>> One more thing actually i got how mysql hide field. For example
>> condsider three
>> fields hash,data,data. mysql field pointer point at second field not at
>> hash
>> field and hash field ptr is stored in table->hash_field
>> can we do something similar to store hash fields(if we make array of
>> hashes in
>> case of more than one unique).But will adding member variable cause
>> problem?
>> what do you think?
>> Regards
>> sachin
>>
>> On Fri, Apr 15, 2016 at 6:37 PM, Sachin Setia <sachinsetia1001@xxxxxxxxx>
>> wrote:
>> > Hi Sergei!
>> >
>> > Actually I was going through the mysql source code for unique long
>> constraints
>> > in file sql_tmp_table.cc in function create_tmp_table they make a new
>> field
>> > and a new key(hash_key) and pass this table obejct to storage
>> > engine.They actually
>> > refer this field as a hash field
>> > On the time of insert they call bool check_unique_constraint(TABLE
>> > *table) function
>> > which first calculate the hash and store it in field then they see for
>> duplicate
>> > hash and retrive ha_index_next_same if records are not same then record
>> >
>> > We can do the same thing in mariadb by adding one more field and key in
>> > mysql_prepare_create_table in this we check for blob with unlimited
>> > length or varchar for length
>> > greater then internal storage engine by doing this in
>> > mysql_prepare_create_table there
>> > will be no issues of frm file inconsistance.
>> >
>> > In case of insert first we will fill the hash field in fill_record
>> > function of sql_base.cc
>> > by first calculating the hash. Then we will retrive the index map
>> > using ha_index_read_map
>> > if returened value is zero then we will comapare two records and if
>> > they match then we will through error
>> > I am not sure where to place this code either in fill_record or later
>> > Or i can simple just
>> > fill hash in field in fill_record and then check for duplicates later
>> on.
>> >
>> > Current I am not sure how to hide columns from user.Sir, can you
>> > suggest me where to look
>> >
>> > But there is one problem we can make unique key by this approch but
>> > not primary key because primary key
>> > is clustered and hashes can collide so i think we can't use hash field
>> > as primary key. To overcome this problem
>> > I have one idea instead of storing just hash we can make hash field
>> > length 10 bytes and in last two bytes
>> > we can store short int which tells how much time hash is repeated this
>> > can make hash unique
>> > in case of collusion. And also we are not doing more computation
>> > because we already retrive
>> > all records with same hashes.
>> > What do you think of this idea?.
>> > And there is one more problem how to make it foreign key.
>> >
>> > Will send you a prototype code tomorrow.
>> > Regards
>> > sachin
>> >
>> > On Fri, Apr 15, 2016 at 12:23 AM, Sergei Golubchik <serg@xxxxxxxxxxx>
>> wrote:
>> >> Hi, Sachin!
>> >>
>> >> On Apr 13, Sachin Setia wrote:
>> >>> Hello Sergei
>> >>> Sorry I did not see your mail. Actually i was thinking something like
>> >>> this before implementing the prototype but if i am more closer to
>> >>> innodb the more performance i will i get. I will definitively think
>> >>> about it.
>> >>
>> >> Great!
>> >>
>> >> Could you please tell me (mailing list, that is) what you think before
>> >> next Monday (before April 18h, that is)?
>> >>
>> >> Regards,
>> >> Sergei
>> >> Chief Architect MariaDB
>> >> and security@xxxxxxxxxxx
>>
>
>
>
> ---------- Forwarded message ----------
> From: Sergei Golubchik <serg@xxxxxxxxxxx>
> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
> Date: Mon, 18 Apr 2016 22:32:25 +0200
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi, Sachin!
>
> On Apr 15, Sachin Setia wrote:
> > Hi Sergei!
> >
> > Actually I was going through the mysql source code for unique long
> > constraints in file sql_tmp_table.cc in function create_tmp_table they
> > make a new field and a new key(hash_key) and pass this table obejct to
> > storage engine.They actually refer this field as a hash field On the
> > time of insert they call bool check_unique_constraint(TABLE *table)
> > function which first calculate the hash and store it in field then
> > they see for duplicate hash and retrive ha_index_next_same if records
> > are not same then record
>
> Right. Very good!
>
> > We can do the same thing in mariadb by adding one more field and key in
> > mysql_prepare_create_table in this we check for blob with unlimited
> > length or varchar for length greater then internal storage engine by
> > doing this in mysql_prepare_create_table there will be no issues of frm
> > file inconsistance.
> >
> > In case of insert first we will fill the hash field in fill_record
> > function of sql_base.cc by first calculating the hash. Then we will
> > retrive the index map using ha_index_read_map if returened value is zero
> > then we will comapare two records and if they match then we will through
> > error I am not sure where to place this code either in fill_record or
> > later Or i can simple just fill hash in field in fill_record and then
> > check for duplicates later on.
>
> MariaDB supports "virtual columns", see
> https://mariadb.com/kb/en/mariadb/virtual-computed-columns/
>
> So, it might be enough to mark this hash column as virtual, and will be
> automatically calculated when necessary (in fill_record, etc).
>
> > Current I am not sure how to hide columns from user.Sir, can you suggest
> > me where to look
>
> This would need a new flag per field, like "hidden". And in all commands
> that use a list of columns (SHOW, INFORMATION_SCHEMA tables, SELECT *,
> INSERT table VALUE (...), etc) - this column should be skipped.
>
> But don't worry, I don't think this will be a problem for you. You can,
> of course, start from not hiding this column and implement this hiddden
> flag later.
>
> > But there is one problem we can make unique key by this approch but not
> > primary key because primary key is clustered and hashes can collide so i
> > think we can't use hash field as primary key.
>
> Of course.
>
> > To overcome this problem I have one idea instead of storing just hash
> > we can make hash field length 10 bytes and in last two bytes we can
> > store short int which tells how much time hash is repeated this can
> > make hash unique in case of collusion. And also we are not doing more
> > computation because we already retrive all records with same hashes.
> > What do you think of this idea?. And there is one more problem how to
> > make it foreign key.
>
> I would say, there is no need to bother. Let's just say that UNIQUE for
> blobs is only a constraint, can not be used as a primary key, can not be
> used as a foreign key. This is a reasonable limitation, I think :)
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
>
>
> ---------- Forwarded message ----------
> From: Sergei Golubchik <serg@xxxxxxxxxxx>
> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
> Date: Mon, 18 Apr 2016 22:46:42 +0200
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi, Sachin!
>
> On Apr 16, Sachin Setia wrote:
> > Hi Sergei!
> > As I already told you i was building prototype.It is some what
> > completed apart from one thing comparing of two field values. the
> > difficulty is how to get data length of field from table->record[1].
> > I will try to solve it.
>
> Do you mean, a length of the blob value?
> Try field->get_length()
>
> > One more thing actually i got how mysql hide field. For example
> > condsider three fields hash,data,data. mysql field pointer point at
> > second field not at hash field and hash field ptr is stored in
> > table->hash_field can we do something similar to store hash fields(if
> > we make array of hashes in case of more than one unique).But will
> > adding member variable cause problem? what do you think?
>
> Well, yes, that would work.
> But I see few other features where field hiding might be useful,
> and in these cases this simple trick with moving table->field pointer
> will not work, unfortunately.
>
> So we might need something more complex. But, again, this is not a most
> important part of this project, so don't start from it.
>
> In fact, I'm rather interested to know about the adaptive hashing that
> you've mentioned. Can you tell me more about it, please?
>
> > diff --git a/sql/sql_base.cc b/sql/sql_base.cc
> > index ac2162b..291a3e2 100644
> > --- a/sql/sql_base.cc
> > +++ b/sql/sql_base.cc
> > @@ -8956,6 +8964,31 @@ fill_record(THD *thd, TABLE *table, Field **ptr,
> List<Item> &values,
> > goto err;
> > field->set_explicit_default(value);
> > }
> > + table->file->ha_index_init(0,0);
> > + res =
> table->file->ha_index_read_map(table->record[1],hash_field->ptr,HA_WHOLE_KEY,HA_READ_KEY_EXACT);
> > + while(!res){
> > + //compare the record if not sure how to compare it so just
> assume it works
> > + diff = table->record[1]-table->record[0];
> > + src_length = blob_field->data_length();
> > + //dest_length =
> blob_field->data_length(table->record[1]); // i dont know how to get the
> length from record 1
> > + // so i am enable to do this
> > + // then we can comapare records using
> > + //field->cmp_max
> > + //this is mysql code
> > + /* if (!(table->distinct ?
> > + table_rec_cmp(table) :
> > + group_rec_cmp(table->group, table->record[0],
> table->record[1])))
> > + return false; // skip it
> > + res= table->file->ha_index_next_same(table->record[1],
> > + table->hash_field->ptr,
> > + sizeof(hash)); */
> > + //fetch the next record
> > + res= table->file->ha_index_next_same(table->record[1],
> > + hash_field->ptr,
> > + 8);
> > +
> > + }
>
> not quite, you should check unique constraints in fill_record(). This
> should happen when the row is actually inserted or updated. A good place
> for this check, I'd say, is handler::ha_write_row() and
> handler::ha_update_row().
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
>
>
> ---------- Forwarded message ----------
> From: Sergei Golubchik <serg@xxxxxxxxxxx>
> To: Sachin Setia <sachinsetia1001@xxxxxxxxx>
> Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
> Date: Mon, 18 Apr 2016 22:51:57 +0200
> Subject: Re: [Maria-developers] InnoDB blob for primary key
> Hi, Sachin!
>
> On Apr 18, Sachin Setia wrote:
> > Hi Actually I was thinking about how to implement blob as a foreign
> > key.Foreign has to be unique which we can already implement. To make
> > it foreign key we can either store unique hash or the whole blob
> > column. But I am not sure much people want to copy so long blob data
> > in reference table.
>
> Agree :)
>
> > Second option would be use blob hash as a reference key.
> > But user can not directly us hash as a reference key
> > because that is hidden.
> > What I was thinking of a clear to programmer way of
> > using blob hash. Suppose user can directly create
> > blob hash column ,use that column as a primary key or
> > foreign key.Like
> > create table t1(abc blob , blob hash(abc))//this will just create blob
> hash
> > column
> > create table t1(abc blob,unique(blob hash(abc))) // this will create
> unique
> > blob hash column
> > and similar for primary key and foreign key
> > user can enter hash value if they have some good algorithm
> > or if they do not give any value we will automatically
> > create and store hash. What do you think? sir.
>
> Mixed feelings. First, I wrote in an earlier email about using virtual
> columns for that. In this line of thoughts, a user-specified hash
> function is absolutely possible and logical.
>
> On the other hand, I don't see why anyone would need that - a hash is
> not guaranteed to be unique, no matter what algorithm one uses.
>
> And, as I wrote in an earlier email, I don't think that primary/foreign
> keys on blobs is a feature worth spending time on.
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
>
>
Follow ups
References
-
InnoDB blob for primary key
From: Sachin Setia, 2016-04-11
-
Re: InnoDB blob for primary key
From: Jan Lindström, 2016-04-12
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-13
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-13
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-15
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-16
-
Re: InnoDB blob for primary key
From: Sachin Setia, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18
-
Re: InnoDB blob for primary key
From: Sergei Golubchik, 2016-04-18