← Back to team overview

maria-developers team mailing list archive

Re: InnoDB blob for primary key

 

Hi, Sachin!

On Apr 15, Sachin Setia wrote:
> Hi Sergei!
> 
> Actually I was going through the mysql source code for unique long
> constraints in file sql_tmp_table.cc in function create_tmp_table they
> make a new field and a new key(hash_key) and pass this table obejct to
> storage engine.They actually refer this field as a hash field On the
> time of insert they call bool check_unique_constraint(TABLE *table)
> function which first calculate the hash and store it in field then
> they see for duplicate hash and retrive ha_index_next_same if records
> are not same then record

Right. Very good!

> We can do the same thing in mariadb by adding one more field and key in
> mysql_prepare_create_table in this we check for blob with unlimited
> length or varchar for length greater then internal storage engine by
> doing this in mysql_prepare_create_table there will be no issues of frm
> file inconsistance.
> 
> In case of insert first we will fill the hash field in fill_record
> function of sql_base.cc by first calculating the hash. Then we will
> retrive the index map using ha_index_read_map if returened value is zero
> then we will comapare two records and if they match then we will through
> error I am not sure where to place this code either in fill_record or
> later Or i can simple just fill hash in field in fill_record and then
> check for duplicates later on.

MariaDB supports "virtual columns", see
https://mariadb.com/kb/en/mariadb/virtual-computed-columns/

So, it might be enough to mark this hash column as virtual, and will be
automatically calculated when necessary (in fill_record, etc).

> Current I am not sure how to hide columns from user.Sir, can you suggest
> me where to look

This would need a new flag per field, like "hidden". And in all commands
that use a list of columns (SHOW, INFORMATION_SCHEMA tables, SELECT *,
INSERT table VALUE (...), etc) - this column should be skipped.

But don't worry, I don't think this will be a problem for you. You can,
of course, start from not hiding this column and implement this hiddden
flag later.

> But there is one problem we can make unique key by this approch but not
> primary key because primary key is clustered and hashes can collide so i
> think we can't use hash field as primary key.

Of course.

> To overcome this problem I have one idea instead of storing just hash
> we can make hash field length 10 bytes and in last two bytes we can
> store short int which tells how much time hash is repeated this can
> make hash unique in case of collusion. And also we are not doing more
> computation because we already retrive all records with same hashes.
> What do you think of this idea?.  And there is one more problem how to
> make it foreign key.

I would say, there is no need to bother. Let's just say that UNIQUE for
blobs is only a constraint, can not be used as a primary key, can not be
used as a foreign key. This is a reasonable limitation, I think :)

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx


References