← Back to team overview

maria-developers team mailing list archive

Re: Implementing implicit primary key in mysql server

 

Yes, for normal user is un-meaningful.

The case is our MySQL Cloud Service, so many users are using our MySQL db
for CMS or other programs.
And it usually has no primary key or any unique keys, so it makes me
headache.
I think you know, if binlog_format=ROW, and tables have no any unique keys,
what will happen.
Now we just change binlog_forma=MIXED to avoid this problem. But our
middleware need ROW format, so it still makes me headache.
Because our users don't want to add PK by themselves, they don't know how
to modify their application, they just download it and install in their web
server.

So I want to add a implicit Primary Key for each tables that have no unique
keys. Then we can use ROW binlog format, so many problems will be  solved.



On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald <h.reindl@xxxxxxxxxxxxx>wrote:

> it makes hardly sense to add a primary key not used
> in select statements this way and leads only in a
> lot of problems and wasted space / performance
>
> it is common practice to remove keys before large
> bulk inserts and add the key *after* inserting the
> data which would not work with the expected benefit
> with your patch
>
> Am 05.07.2013 06:08, schrieb Lixun Peng:
> > Hi all,
> >
> > I implement a demo patch, based on 5.5.18.
> >
> >
> > *1. CREATE TABLE*
> > root@localhost : plx 11:54:46> create table test_no_pk (col1
> varchar(32));
> > Query OK, 0 rows affected (0.01 sec)
> >
> > root@localhost : plx 11:55:05> desc test_no_pk;
> > +----------+-------------+------+-----+---------+----------------+
> > | Field    | Type        | Null | Key | Default | Extra          |
> > +----------+-------------+------+-----+---------+----------------+
> > | col1     | varchar(32) | YES  |     | NULL    |                |
> > | __row_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
> > +----------+-------------+------+-----+---------+----------------+
> > 2 rows in set (0.01 sec)
> >
> > if users has not defined a PK, I will add it automatically.
> >
> > *2. ALTER TABLE*
> > root@localhost : plx 11:55:10> alter table test_no_pk add id int, add
> primary key(id);
> > Query OK, 0 rows affected (0.00 sec)
> > Records: 0  Duplicates: 0  Warnings: 0
> >
> > root@localhost : plx 11:57:02> desc test_no_pk;
> > +-------+-------------+------+-----+---------+-------+
> > | Field | Type        | Null | Key | Default | Extra |
> > +-------+-------------+------+-----+---------+-------+
> > | col1  | varchar(32) | YES  |     | NULL    |       |
> > | id    | int(11)     | NO   | PRI | 0       |       |
> > +-------+-------------+------+-----+---------+-------+
> > 2 rows in set (0.01 sec)
> >
> > When users add a PK, I will remove implicit PK automatically.
> >
> > root@localhost : plx 11:57:07> alter table test_no_pk drop primary key;
> > Query OK, 0 rows affected (0.00 sec)
> > Records: 0  Duplicates: 0  Warnings: 0
> >
> > root@localhost : plx 11:57:42> desc test_no_pk;
> > +----------+-------------+------+-----+---------+----------------+
> > | Field    | Type        | Null | Key | Default | Extra          |
> > +----------+-------------+------+-----+---------+----------------+
> > | col1     | varchar(32) | YES  |     | NULL    |                |
> > | id       | int(11)     | NO   |     | 0       |                |
> > | __row_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
> > +----------+-------------+------+-----+---------+----------------+
> > 3 rows in set (0.00 sec)
> >
> > When users dropped PK, I will add it automatically.
> >
> > *3. INSERT VALUES*
> > root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2);
> > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
> > root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4);
> > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
> >
> > it will report duplicate, *Sergei, can you help me to find why?*
> >
> > *4. SELECT **
> > root@localhost : plx 12:07:23> select * from test_no_pk;
> > +------+----+
> > | col1 | id |
> > +------+----+
> > | abc  |  6 |
> > +------+----+
> > 1 row in set (0.00 sec)
> >
> > root@localhost : plx 12:07:30> select __row_id from test_no_pk;
> > +----------+
> > | __row_id |
> > +----------+
> > |        1 |
> > +----------+
> > 1 row in set (0.00 sec)
> >
> > When users run "SELECT *", row_id will be filter.
> >
> > *5. SHOW CREATE*
> >
> > root@localhost : plx 12:07:35> show create table test_no_pk\G
> > *************************** 1. row ***************************
> >        Table: test_no_pk
> > Create Table: CREATE TABLE `test_no_pk` (
> >   `col1` varchar(32) DEFAULT NULL,
> >   `id` int(11) NOT NULL DEFAULT '0'
> > ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8
> > 1 row in set (0.00 sec)
> >
> > row_id will be hidden.
> >
> >
> > Thanks,
> > Lixun
> >
> >
> >
> >
> > On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@xxxxxxxxx<mailto:
> penglixun@xxxxxxxxx>> wrote:
> >
> >     Hi Jeremy,
> >
> >     Thank you for your suggestion.
> >     I also want to just add the PK field for custom automatically, but
> some of our customs can't accept it.
> >     Because they are using "SELECT * FROM table .... " or " INSERT INTO
> table VALUES(...) ", if I add a visible PK
> >     for them, "SELECT *" will show this value, then their applications
> will report errors.
> >     So I have to set this field as an implicit filed.
> >
> >     Thanks,
> >     Lixun
> >
> >
> >
> >     On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@xxxxxxxxxx<mailto:
> jeremycole@xxxxxxxxxx>> wrote:
> >
> >         Lixun,
> >
> >         I've thought about this a bit and I'm not sure this will be very
> simple to do (or rather it's more
> >         complicated than it might seem). While I think it is not that
> hard to expose the __id field to replication
> >         internals, I think in order for this to really work it would
> need to be exposed to other tools, such as
> >         mysqldump. It is also unclear how to decide when it is safe to
> use this __id field (how to determine if it
> >         is in sync between master and slave).
> >
> >         As an alternate suggestion, what about ignoring the current
> implicit PK behavior, and instead automatically
> >         adding a field using auto_increment when the user doesn't
> provide a PK:
> >
> >         __id BIGINT UNSIGNED NOT NULL auto_increment,
> >         PRIMARY KEY(__id)
> >
> >         Regards,
> >
> >         Jeremy
> >
> >
> >         On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@xxxxxxxxx<mailto:
> penglixun@xxxxxxxxx>> wrote:
> >
> >             Hi Sergei,
> >
> >             You are right, let users add primary key is best.
> >             But I can't let users who don't want to create primary key
> can't use our MySQL service.
> >             Amazon RDS also allow users to create the tables without
> primary key, just change binlog_format to
> >             MIXED to solve replication problem.
> >
> >             I think this feature is very useful for any MySQL cloud
> service providers, all of them will face this
> >             problem in one day.
> >             I will try to do some analysis/research in this feature
> implement first, I will need your help :-)
> >
> >             I will notice any new updates in this email.
> >
> >
> >             Thanks,
> >             Lixun
> >
> >
> >
> >
> >             On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <
> serg@xxxxxxxxxxx <mailto:serg@xxxxxxxxxxx>> wrote:
> >
> >                 Hi, Lixun!
> >
> >
> >                 On Jun 18, Lixun Peng wrote:
> >                 > Hi,
> >                 >
> >                 > As we know, InnoDB has implicit primary key if a table
> hasn't defined
> >                 > a primary key.  However MySQL server doesn't know this
> primary key, so
> >                 > this primary key will not apear in binlog.
> >                 >
> >                 > When we are using ROW format binlog for replication,
> if a table has no any
> >                 > indexes, that's a disaster. If a table without indexes
> do a DML
> >                 > (UPDATE/DELETE), of course it will run a long time in
> master, but in slave,
> >                 > it still need a long time. It will cause serious slave
> replication delay.
> >                 ...
> >
> >                 > I think mysql server can do the same thing as InnoDB
> do, if user doesn't
> >                 > define the primary key, mysql can add the primary key
> automatically.
> >                 >
> >                 > How do you think?
> >
> >                 Well, that's doable. A much easier solution would be to
> require a user
> >                 to create a primary key. It's a one-line change:
> >
> >                 -  Table_flags ha_table_flags() const { return
> cached_table_flags; }
> >                 +  Table_flags ha_table_flags() const { return
> cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
> >
> >                 But what you suggest is possible too, I believe.
> >
> >                 Regards,
> >                 Sergei
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-developers
> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-developers
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 
Senior Database Engineer @ Taobao.com

Mobile Phone: +86 18658156856 (Hangzhou)
Gtalk: penglixun(at)gmail.com
Twitter: http://www.twitter.com/plinux
Blog: http://www.penglixun.com

Follow ups

References