← Back to team overview

maria-developers team mailing list archive

Re: Implementing implicit primary key in mysql server

 

it makes hardly sense to add a primary key not used
in select statements this way and leads only in a
lot of problems and wasted space / performance

it is common practice to remove keys before large
bulk inserts and add the key *after* inserting the
data which would not work with the expected benefit
with your patch

Am 05.07.2013 06:08, schrieb Lixun Peng:
> Hi all,
> 
> I implement a demo patch, based on 5.5.18.
> 
> 
> *1. CREATE TABLE*
> root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32));
> Query OK, 0 rows affected (0.01 sec)
> 
> root@localhost : plx 11:55:05> desc test_no_pk;
> +----------+-------------+------+-----+---------+----------------+
> | Field    | Type        | Null | Key | Default | Extra          |
> +----------+-------------+------+-----+---------+----------------+
> | col1     | varchar(32) | YES  |     | NULL    |                |
> | __row_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
> +----------+-------------+------+-----+---------+----------------+
> 2 rows in set (0.01 sec)
> 
> if users has not defined a PK, I will add it automatically.
> 
> *2. ALTER TABLE*
> root@localhost : plx 11:55:10> alter table test_no_pk add id int, add primary key(id);
> Query OK, 0 rows affected (0.00 sec)
> Records: 0  Duplicates: 0  Warnings: 0
> 
> root@localhost : plx 11:57:02> desc test_no_pk;
> +-------+-------------+------+-----+---------+-------+
> | Field | Type        | Null | Key | Default | Extra |
> +-------+-------------+------+-----+---------+-------+
> | col1  | varchar(32) | YES  |     | NULL    |       |
> | id    | int(11)     | NO   | PRI | 0       |       |
> +-------+-------------+------+-----+---------+-------+
> 2 rows in set (0.01 sec)
> 
> When users add a PK, I will remove implicit PK automatically.
> 
> root@localhost : plx 11:57:07> alter table test_no_pk drop primary key;
> Query OK, 0 rows affected (0.00 sec)
> Records: 0  Duplicates: 0  Warnings: 0
> 
> root@localhost : plx 11:57:42> desc test_no_pk;
> +----------+-------------+------+-----+---------+----------------+
> | Field    | Type        | Null | Key | Default | Extra          |
> +----------+-------------+------+-----+---------+----------------+
> | col1     | varchar(32) | YES  |     | NULL    |                |
> | id       | int(11)     | NO   |     | 0       |                |
> | __row_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
> +----------+-------------+------+-----+---------+----------------+
> 3 rows in set (0.00 sec)
> 
> When users dropped PK, I will add it automatically.
> 
> *3. INSERT VALUES*
> root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2);
> ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
> root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4);
> ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
> 
> it will report duplicate, *Sergei, can you help me to find why?*
> 
> *4. SELECT **
> root@localhost : plx 12:07:23> select * from test_no_pk;
> +------+----+
> | col1 | id |
> +------+----+
> | abc  |  6 |
> +------+----+
> 1 row in set (0.00 sec)
> 
> root@localhost : plx 12:07:30> select __row_id from test_no_pk;
> +----------+
> | __row_id |
> +----------+
> |        1 |
> +----------+
> 1 row in set (0.00 sec)
> 
> When users run "SELECT *", row_id will be filter.
> 
> *5. SHOW CREATE*
> 
> root@localhost : plx 12:07:35> show create table test_no_pk\G
> *************************** 1. row ***************************
>        Table: test_no_pk
> Create Table: CREATE TABLE `test_no_pk` (
>   `col1` varchar(32) DEFAULT NULL,
>   `id` int(11) NOT NULL DEFAULT '0'
> ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8
> 1 row in set (0.00 sec)
> 
> row_id will be hidden.
> 
> 
> Thanks,
> Lixun
> 
> 
> 
> 
> On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@xxxxxxxxx <mailto:penglixun@xxxxxxxxx>> wrote:
> 
>     Hi Jeremy,
> 
>     Thank you for your suggestion.
>     I also want to just add the PK field for custom automatically, but some of our customs can't accept it.
>     Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK
>     for them, "SELECT *" will show this value, then their applications will report errors.
>     So I have to set this field as an implicit filed.
> 
>     Thanks,
>     Lixun
> 
> 
> 
>     On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@xxxxxxxxxx <mailto:jeremycole@xxxxxxxxxx>> wrote:
> 
>         Lixun,
> 
>         I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more
>         complicated than it might seem). While I think it is not that hard to expose the __id field to replication
>         internals, I think in order for this to really work it would need to be exposed to other tools, such as
>         mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it
>         is in sync between master and slave).
> 
>         As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically
>         adding a field using auto_increment when the user doesn't provide a PK:
> 
>         __id BIGINT UNSIGNED NOT NULL auto_increment,
>         PRIMARY KEY(__id)
> 
>         Regards,
> 
>         Jeremy
> 
> 
>         On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@xxxxxxxxx <mailto:penglixun@xxxxxxxxx>> wrote:
> 
>             Hi Sergei,
> 
>             You are right, let users add primary key is best.
>             But I can't let users who don't want to create primary key can't use our MySQL service.
>             Amazon RDS also allow users to create the tables without primary key, just change binlog_format to
>             MIXED to solve replication problem.
> 
>             I think this feature is very useful for any MySQL cloud service providers, all of them will face this
>             problem in one day.
>             I will try to do some analysis/research in this feature implement first, I will need your help :-)
> 
>             I will notice any new updates in this email.
> 
> 
>             Thanks,
>             Lixun
>              
> 
> 
> 
>             On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@xxxxxxxxxxx <mailto:serg@xxxxxxxxxxx>> wrote:
> 
>                 Hi, Lixun!
> 
> 
>                 On Jun 18, Lixun Peng wrote:
>                 > Hi,
>                 >
>                 > As we know, InnoDB has implicit primary key if a table hasn't defined
>                 > a primary key.  However MySQL server doesn't know this primary key, so
>                 > this primary key will not apear in binlog.
>                 >
>                 > When we are using ROW format binlog for replication, if a table has no any
>                 > indexes, that's a disaster. If a table without indexes do a DML
>                 > (UPDATE/DELETE), of course it will run a long time in master, but in slave,
>                 > it still need a long time. It will cause serious slave replication delay.
>                 ...
> 
>                 > I think mysql server can do the same thing as InnoDB do, if user doesn't
>                 > define the primary key, mysql can add the primary key automatically.
>                 >
>                 > How do you think?
> 
>                 Well, that's doable. A much easier solution would be to require a user
>                 to create a primary key. It's a one-line change:
> 
>                 -  Table_flags ha_table_flags() const { return cached_table_flags; }
>                 +  Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
> 
>                 But what you suggest is possible too, I believe.
> 
>                 Regards,
>                 Sergei

Attachment: signature.asc
Description: OpenPGP digital signature


Follow ups

References