← Back to team overview

maria-developers team mailing list archive

Re: Implementing implicit primary key in mysql server

 

Hi All,

New patch about this feature, fixed "*INSERT VALUES*" bug.
Adding a new option "implicit_primary_key", when this feature is un-useful,
can set implicit_primary_key=0.



On Fri, Jul 5, 2013 at 12:42 PM, Reindl Harald <h.reindl@xxxxxxxxxxxxx>wrote:

>
>
> Am 05.07.2013 06:32, schrieb Lixun Peng:
> > Yes, for normal user is un-meaningful
>
> it is not only un-meaningful it is harmful for people knwoing what they
> are doing by massive overhead with no benefit - hence after a bulk insert
> your implicit key has to be removed while as example the intented unique
> key on a varchar added
>
> > The case is our MySQL Cloud Service, so many users are using our MySQL
> db for CMS or other programs.
> > And it usually has no primary key or any unique keys, so it makes me
> headache.
>
> i doubt you can solve social problems with tech
>
> > I think you know, if binlog_format=ROW, and tables have no any unique
> keys, what will happen.
> > Now we just change binlog_forma=MIXED to avoid this problem. But our
> middleware need ROW format, so it still makes
> > me headache.
> > Because our users don't want to add PK by themselves, they don't know
> how to modify their application, they just
> > download it and install in their web server.
> >
> > So I want to add a implicit Primary Key for each tables that have no
> unique keys. Then we can use ROW binlog
> > format, so many problems will be solved.
>
> corner cases - this at least needs to be enabled via "my.cnf" and must not
> affect users with well database designs and shoot them in the leg due
> bulk inserts
>
> > On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald <h.reindl@xxxxxxxxxxxxx<mailto:
> h.reindl@xxxxxxxxxxxxx>> wrote:
> >
> >     it makes hardly sense to add a primary key not used
> >     in select statements this way and leads only in a
> >     lot of problems and wasted space / performance
> >
> >     it is common practice to remove keys before large
> >     bulk inserts and add the key *after* inserting the
> >     data which would not work with the expected benefit
> >     with your patch
> >
> >     Am 05.07.2013 06:08, schrieb Lixun Peng:
> >     > Hi all,
> >     >
> >     > I implement a demo patch, based on 5.5.18.
> >     >
> >     >
> >     > *1. CREATE TABLE*
> >     > root@localhost : plx 11:54:46> create table test_no_pk (col1
> varchar(32));
> >     > Query OK, 0 rows affected (0.01 sec)
> >     >
> >     > root@localhost : plx 11:55:05> desc test_no_pk;
> >     > +----------+-------------+------+-----+---------+----------------+
> >     > | Field    | Type        | Null | Key | Default | Extra          |
> >     > +----------+-------------+------+-----+---------+----------------+
> >     > | col1     | varchar(32) | YES  |     | NULL    |                |
> >     > | __row_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
> >     > +----------+-------------+------+-----+---------+----------------+
> >     > 2 rows in set (0.01 sec)
> >     >
> >     > if users has not defined a PK, I will add it automatically.
> >     >
> >     > *2. ALTER TABLE*
> >     > root@localhost : plx 11:55:10> alter table test_no_pk add id int,
> add primary key(id);
> >     > Query OK, 0 rows affected (0.00 sec)
> >     > Records: 0  Duplicates: 0  Warnings: 0
> >     >
> >     > root@localhost : plx 11:57:02> desc test_no_pk;
> >     > +-------+-------------+------+-----+---------+-------+
> >     > | Field | Type        | Null | Key | Default | Extra |
> >     > +-------+-------------+------+-----+---------+-------+
> >     > | col1  | varchar(32) | YES  |     | NULL    |       |
> >     > | id    | int(11)     | NO   | PRI | 0       |       |
> >     > +-------+-------------+------+-----+---------+-------+
> >     > 2 rows in set (0.01 sec)
> >     >
> >     > When users add a PK, I will remove implicit PK automatically.
> >     >
> >     > root@localhost : plx 11:57:07> alter table test_no_pk drop
> primary key;
> >     > Query OK, 0 rows affected (0.00 sec)
> >     > Records: 0  Duplicates: 0  Warnings: 0
> >     >
> >     > root@localhost : plx 11:57:42> desc test_no_pk;
> >     > +----------+-------------+------+-----+---------+----------------+
> >     > | Field    | Type        | Null | Key | Default | Extra          |
> >     > +----------+-------------+------+-----+---------+----------------+
> >     > | col1     | varchar(32) | YES  |     | NULL    |                |
> >     > | id       | int(11)     | NO   |     | 0       |                |
> >     > | __row_id | bigint(20)  | NO   | PRI | NULL    | auto_increment |
> >     > +----------+-------------+------+-----+---------+----------------+
> >     > 3 rows in set (0.00 sec)
> >     >
> >     > When users dropped PK, I will add it automatically.
> >     >
> >     > *3. INSERT VALUES*
> >     > root@localhost : plx 11:59:22> insert into test_no_pk
> values('abc',2);
> >     > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
> >     > root@localhost : plx 11:59:23> insert into test_no_pk
> values('abc',4);
> >     > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
> >     >
> >     > it will report duplicate, *Sergei, can you help me to find why?*
> >     >
> >     > *4. SELECT **
> >     > root@localhost : plx 12:07:23> select * from test_no_pk;
> >     > +------+----+
> >     > | col1 | id |
> >     > +------+----+
> >     > | abc  |  6 |
> >     > +------+----+
> >     > 1 row in set (0.00 sec)
> >     >
> >     > root@localhost : plx 12:07:30> select __row_id from test_no_pk;
> >     > +----------+
> >     > | __row_id |
> >     > +----------+
> >     > |        1 |
> >     > +----------+
> >     > 1 row in set (0.00 sec)
> >     >
> >     > When users run "SELECT *", row_id will be filter.
> >     >
> >     > *5. SHOW CREATE*
> >     >
> >     > root@localhost : plx 12:07:35> show create table test_no_pk\G
> >     > *************************** 1. row ***************************
> >     >        Table: test_no_pk
> >     > Create Table: CREATE TABLE `test_no_pk` (
> >     >   `col1` varchar(32) DEFAULT NULL,
> >     >   `id` int(11) NOT NULL DEFAULT '0'
> >     > ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8
> >     > 1 row in set (0.00 sec)
> >     >
> >     > row_id will be hidden.
> >     >
> >     >
> >     > Thanks,
> >     > Lixun
> >     >
> >     >
> >     >
> >     >
> >     > On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@xxxxxxxxx<mailto:
> penglixun@xxxxxxxxx>
> >     <mailto:penglixun@xxxxxxxxx <mailto:penglixun@xxxxxxxxx>>> wrote:
> >     >
> >     >     Hi Jeremy,
> >     >
> >     >     Thank you for your suggestion.
> >     >     I also want to just add the PK field for custom automatically,
> but some of our customs can't accept it.
> >     >     Because they are using "SELECT * FROM table .... " or " INSERT
> INTO table VALUES(...) ", if I add a
> >     visible PK
> >     >     for them, "SELECT *" will show this value, then their
> applications will report errors.
> >     >     So I have to set this field as an implicit filed.
> >     >
> >     >     Thanks,
> >     >     Lixun
> >     >
> >     >
> >     >
> >     >     On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <
> jeremycole@xxxxxxxxxx <mailto:jeremycole@xxxxxxxxxx>
> >     <mailto:jeremycole@xxxxxxxxxx <mailto:jeremycole@xxxxxxxxxx>>>
> wrote:
> >     >
> >     >         Lixun,
> >     >
> >     >         I've thought about this a bit and I'm not sure this will
> be very simple to do (or rather it's more
> >     >         complicated than it might seem). While I think it is not
> that hard to expose the __id field to
> >     replication
> >     >         internals, I think in order for this to really work it
> would need to be exposed to other tools, such as
> >     >         mysqldump. It is also unclear how to decide when it is
> safe to use this __id field (how to determine
> >     if it
> >     >         is in sync between master and slave).
> >     >
> >     >         As an alternate suggestion, what about ignoring the
> current implicit PK behavior, and instead
> >     automatically
> >     >         adding a field using auto_increment when the user doesn't
> provide a PK:
> >     >
> >     >         __id BIGINT UNSIGNED NOT NULL auto_increment,
> >     >         PRIMARY KEY(__id)
> >     >
> >     >         Regards,
> >     >
> >     >         Jeremy
> >     >
> >     >
> >     >         On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <
> penglixun@xxxxxxxxx <mailto:penglixun@xxxxxxxxx>
> >     <mailto:penglixun@xxxxxxxxx <mailto:penglixun@xxxxxxxxx>>> wrote:
> >     >
> >     >             Hi Sergei,
> >     >
> >     >             You are right, let users add primary key is best.
> >     >             But I can't let users who don't want to create primary
> key can't use our MySQL service.
> >     >             Amazon RDS also allow users to create the tables
> without primary key, just change binlog_format to
> >     >             MIXED to solve replication problem.
> >     >
> >     >             I think this feature is very useful for any MySQL
> cloud service providers, all of them will face this
> >     >             problem in one day.
> >     >             I will try to do some analysis/research in this
> feature implement first, I will need your help :-)
> >     >
> >     >             I will notice any new updates in this email.
> >     >
> >     >
> >     >             Thanks,
> >     >             Lixun
> >     >
> >     >
> >     >
> >     >
> >     >             On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <
> serg@xxxxxxxxxxx <mailto:serg@xxxxxxxxxxx>
> >     <mailto:serg@xxxxxxxxxxx <mailto:serg@xxxxxxxxxxx>>> wrote:
> >     >
> >     >                 Hi, Lixun!
> >     >
> >     >
> >     >                 On Jun 18, Lixun Peng wrote:
> >     >                 > Hi,
> >     >                 >
> >     >                 > As we know, InnoDB has implicit primary key if a
> table hasn't defined
> >     >                 > a primary key.  However MySQL server doesn't
> know this primary key, so
> >     >                 > this primary key will not apear in binlog.
> >     >                 >
> >     >                 > When we are using ROW format binlog for
> replication, if a table has no any
> >     >                 > indexes, that's a disaster. If a table without
> indexes do a DML
> >     >                 > (UPDATE/DELETE), of course it will run a long
> time in master, but in slave,
> >     >                 > it still need a long time. It will cause serious
> slave replication delay.
> >     >                 ...
> >     >
> >     >                 > I think mysql server can do the same thing as
> InnoDB do, if user doesn't
> >     >                 > define the primary key, mysql can add the
> primary key automatically.
> >     >                 >
> >     >                 > How do you think?
> >     >
> >     >                 Well, that's doable. A much easier solution would
> be to require a user
> >     >                 to create a primary key. It's a one-line change:
> >     >
> >     >                 -  Table_flags ha_table_flags() const { return
> cached_table_flags; }
> >     >                 +  Table_flags ha_table_flags() const { return
> cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
> >     >
> >     >                 But what you suggest is possible too, I believe.
> >     >
> >     >                 Regards,
> >     >                 Sergei
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-developers
> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-developers
> More help   : https://help.launchpad.net/ListHelp
>
>


-- 
Senior Database Engineer @ Taobao.com

Mobile Phone: +86 18658156856 (Hangzhou)
Gtalk: penglixun(at)gmail.com
Twitter: http://www.twitter.com/plinux
Blog: http://www.penglixun.com

Attachment: implicit_primary_key.diff.4176
Description: Binary data


References