← Back to team overview

maria-developers team mailing list archive

Re: 3f38da9145c: MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN

 

Hi Sergei,

Thank you for your review!


I've fixed some of your suggestions.
For other suggestions I have questions and comments. See below.

A new patch is here:

https://github.com/MariaDB/server/commit/62d9ca85706bc0c78ca174703ca46e2e82002491



On 9/16/19 10:33 PM, Sergei Golubchik wrote:
Hi, Alexander!

On Sep 16, Alexander Barkov wrote:
revision-id: 3f38da9145c (mariadb-10.4.4-251-g3f38da9145c)
parent(s): e6ff3f9d1c8
author: Alexander Barkov <bar@xxxxxxxxxxx>
committer: Alexander Barkov <bar@xxxxxxxxxxx>
timestamp: 2019-07-12 07:53:55 +0400
message:

MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN

diff --git a/include/mysql/plugin.h b/include/mysql/plugin.h
index 85e52a247af..92703b626ac 100644
--- a/include/mysql/plugin.h
+++ b/include/mysql/plugin.h
@@ -611,6 +612,22 @@ struct handlerton;
     int interface_version;
   };
+/*
+  API for data type plugin. (MariaDB_DATA_TYPE_PLUGIN)
+*/
+#define MariaDB_DATA_TYPE_INTERFACE_VERSION (MYSQL_VERSION_ID << 8)
+
+/**
+   Data type plugin descriptor
+*/
+#ifdef __cplusplus
+struct st_mariadb_data_type
+{
+  int interface_version;
+  const class Type_handler *type_handler;
+};
+#endif

Plugin-wise it'd be better to have a separate plugin_data_type.h
and put Type_handler definition there.

So that as much of the API as possible gets into the .pp
file and we could detect when it changes.

For now I defined interface version as:

#define MariaDB_DATA_TYPE_INTERFACE_VERSION (MYSQL_VERSION_ID << 8)

So, if I understand correctly, every distinct MySQL server version
will require data type plugins compiled exactly for this version.


Note, sql_type.h includes the following headers:

#include "mysqld.h"
#include "sql_array.h"
#include "sql_const.h"
#include "sql_time.h"
#include "sql_type_real.h"
#include "compat56.h"
C_MODE_START
#include <ma_dyncol.h>
C_MODE_END

and forward-defines around 60 classes and structures, e.g:

class Field;
class Column_definition;
class Column_definition_attributes;
class Key_part_spec;
class Item;
...


Does it also mean that all these 60 classes/structures should be a part of ABI?

Can these changes in sql_type.h and plugin_data_type.h be done under a
separate MDEV later?



+
  /*************************************************************************
    st_mysql_value struct for reading values from mysqld.
    Used by server variables framework to parse user-provided values.
diff --git a/plugin/type_test/CMakeLists.txt b/plugin/type_test/CMakeLists.txt
new file mode 100644
index 00000000000..b85168d1bd2
--- /dev/null
+++ b/plugin/type_test/CMakeLists.txt
@@ -0,0 +1,17 @@
+# Copyright (c) 2016, MariaDB corporation. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1335  USA
+
+MYSQL_ADD_PLUGIN(type_test plugin.cc RECOMPILE_FOR_EMBEDDED
+                 MODULE_ONLY COMPONENT Test)

Add at least two new plugins, please.
May be in separate commits, as you like.
But at least two.
And practical, not just to test the API


I'll add INET6 as a full-featured plugin when MDEV-20016 is in.
It will test all aspects of a data type.

The purpose of the current task is rather simple:
- to introduce new plugin type
- to make sure that data type plugins
  can write/read themselves to FRM file.


diff --git a/plugin/type_test/mysql-test/type_test/type_test_int8-debug.test b/plugin/type_test/mysql-test/type_test/type_test_int8-debug.test
new file mode 100644
index 00000000000..30e313481d6
--- /dev/null
+++ b/plugin/type_test/mysql-test/type_test/type_test_int8-debug.test
@@ -0,0 +1,11 @@
+--echo #
+--echo # MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN
+--echo #
+
+SET SESSION debug_dbug="+d,frm_data_type_info";
+
+CREATE TABLE t1 (a TEST_INT8);
+SHOW CREATE TABLE t1;
+DROP TABLE t1;
+
+SET SESSION debug_dbug="-d,frm_data_type_info";

don't do that, always save old debug_dbug value instead and restore it later.

  SET @old_debug_dbug=@@debug_dbug;
  SET @@debug_dbug="+d,frm_data_type_info";
  ...
  SET @@debug_dbug=@old_debug_dbug;

because after "+d,xxx" you have one keyword enabled, like in
@@debug_dbug="d,xxx".  and after "-d,xxx" you have no keywords enabled,
like in @debug_dbug="d" which means "all keywords enabled" for dbug.

Fixed.


diff --git a/plugin/type_test/mysql-test/type_test/type_test_int8.result b/plugin/type_test/mysql-test/type_test/type_test_int8.result
new file mode 100644
index 00000000000..758f94904c1
--- /dev/null
+++ b/plugin/type_test/mysql-test/type_test/type_test_int8.result
@@ -0,0 +1,144 @@
+#
+# MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN
+#
+SELECT
+PLUGIN_NAME,
+PLUGIN_VERSION,
+PLUGIN_STATUS,
+PLUGIN_TYPE,
+PLUGIN_AUTHOR,
+PLUGIN_DESCRIPTION,
+PLUGIN_LICENSE,
+PLUGIN_MATURITY,
+PLUGIN_AUTH_VERSION
+FROM INFORMATION_SCHEMA.PLUGINS
+WHERE PLUGIN_TYPE='DATA TYPE'
+    AND PLUGIN_NAME='TEST_INT8';
+PLUGIN_NAME	TEST_INT8
+PLUGIN_VERSION	1.0
+PLUGIN_STATUS	ACTIVE
+PLUGIN_TYPE	DATA TYPE
+PLUGIN_AUTHOR	MariaDB
+PLUGIN_DESCRIPTION	Data type TEST_INT8
+PLUGIN_LICENSE	GPL
+PLUGIN_MATURITY	Alpha
+PLUGIN_AUTH_VERSION	1.0
+CREATE TABLE t1 (a TEST_INT8);
+SHOW CREATE TABLE t1;
+Table	Create Table
+t1	CREATE TABLE `t1` (
+  `a` test_int8 DEFAULT NULL

is the type name a reserved word? or an arbitrary identifier?
should we support (and print as)

     `a` `test_int8` DEFAULT NULL

?

For now the grammar understands identifiers,
but only those that are not keyword
(neither reserved, nor non-reserved).

Perhaps most of non-reserved keywords could be allowed as well
without grammar conflicts. We can do it later.


I'm not sure if we should add backticks.
I'd prefer to limit the data type name not to have
spaces or other special characters, so quoting is not required.



+) ENGINE=MyISAM DEFAULT CHARSET=latin1
+DROP TABLE t1;
+SELECT CAST('100' AS TEST_INT8) AS cast;
+cast
+100
+BEGIN NOT ATOMIC
+DECLARE a TEST_INT8 DEFAULT 256;
+SELECT HEX(a), a;
+END;
+$$
+HEX(a)	a
+100	256
+CREATE FUNCTION f1(p TEST_INT8) RETURNS TEST_INT8 RETURN 1;
+SHOW CREATE FUNCTION f1;
+Function	sql_mode	Create Function	character_set_client	collation_connection	Database Collation
+f1	STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION	CREATE DEFINER=`root`@`localhost` FUNCTION `f1`(p TEST_INT8) RETURNS test_int8
+RETURN 1	latin1	latin1_swedish_ci	latin1_swedish_ci

use enable_metadata for a part (or for a whole) of this test

Done.


and show how mysql --column-type-info handles it.

This will be added when the protocol changes are in.


<cut>
>
CREATE LIKE, ALTER TABLE, SHOW COLUMNS, I_S.COLUMNS

Does it support indexing?

I've added brief tests for all these features, including indexing.

The patch adding INET6 will test them more thoroughly.



diff --git a/plugin/type_test/plugin.cc b/plugin/type_test/plugin.cc
new file mode 100644
index 00000000000..ea70c70f786
--- /dev/null
+++ b/plugin/type_test/plugin.cc
@@ -0,0 +1,120 @@
+/*
+   Copyright (c) 2000, 2015, Oracle and/or its affiliates.
+   Copyright (c) 2009, 2019, MariaDB
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; version 2 of the License.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA */
+
+#include <my_global.h>
+#include <sql_class.h>          // THD
+#include <mysql/plugin.h>
+#include "sql_type.h"
+
+
+class Field_test_int8 :public Field_longlong
+{
+public:
+  Field_test_int8(const LEX_CSTRING &name, const Record_addr &addr,
+                  enum utype unireg_check_arg,
+                  uint32 len_arg, bool zero_arg, bool unsigned_arg)
+    :Field_longlong(addr.ptr(), len_arg, addr.null_ptr(), addr.null_bit(),
+                    Field::NONE, &name, zero_arg, unsigned_arg)
+  {}
+  void sql_type(String &res) const
+  {
+    CHARSET_INFO *cs= res.charset();
+    res.length(cs->cset->snprintf(cs,(char*) res.ptr(),res.alloced_length(),
+               "test_int8"));

isn't there an easier way of setting a String to a specific value?
Like, res.copy() or something?

I've changed this to:

  res.set_ascii(STRING_WITH_LEN("test_int8"));

like other Field_xxx::sql_type do in case
when there are no any attributes.



By the way, why do you need to do it at all, if the parent's Field::sql_type
method could set the value from type_handler()->name() ?

Field_test_int8 derives from Field_longlong,
so it has to override the method anyway.

Using type_handler()->name() would involve and extra virtual
call which can be avoided in this particular case.


I can do something like:

- add this method to String:
 bool set_ascii(const LEX_CSTRING &str)
 {
   return set_ascii(str.str, str.length);
 }


- Define sql_type() as:

void Field_test_int8::sql_type(String &res) const
{
  return res.set_ascii(type_handler_test_int8.name().lex_cstring());
}


This will reuse the name from Type_handler_int8,
and will avoid an unnecessary virtual call.



+    // UNSIGNED and ZEROFILL flags are not supported by the parser yet.
+    // add_zerofill_and_unsigned(res);
+  }
+  const Type_handler *type_handler() const;
+};
+
+
+class Type_handler_test_int8: public Type_handler_longlong
+{
+public:
+  const Name name() const override
+  {
+    static Name name(STRING_WITH_LEN("test_int8"));

I'd prefer this being picked up automatically from the plugin name.
Like it's done for engines, I_S tables, auth plugins, ft parsers, etc.

Plugin names use upper case:

locale_info/locale_info.cc: "LOCALES", type_geom/plugin.cc: "SPATIAL_REF_SYS",
type_geom/plugin.cc:  "GEOMETRY_COLUMNS",
type_test/plugin.cc:  "TEST_INT8",
wsrep_info/plugin.cc:  "WSREP_MEMBERSHIP",
wsrep_info/plugin.cc:  "WSREP_STATUS",

All data types use lower case.

What do you propose?


+    return name;
+  }
+  bool Column_definition_data_type_info_image(Binary_string *to,
+                                              const Column_definition &def)
+                                              const override
+  {
+    return to->append(Type_handler_test_int8::name().lex_cstring());
+  }

Not sure, why this has to be overridden by the derived class.

The parent class Type_handler_longlong does not need to write
its name into FRM, as it's identified by the type code, like
all other built-in data types.



+  Field *make_table_field(MEM_ROOT *root,
+                          const LEX_CSTRING *name,
+                          const Record_addr &addr,
+                          const Type_all_attributes &attr,
+                          TABLE *table) const override
+  {
+    return new (root)
+           Field_test_int8(*name, addr, Field::NONE,
+                           attr.max_char_length(),
+                           0/*zerofill*/,
+                           attr.unsigned_flag);
+  }
+
+  Field *make_table_field_from_def(TABLE_SHARE *share, MEM_ROOT *root,
+                                   const LEX_CSTRING *name,
+                                   const Record_addr &rec, const Bit_addr &bit,
+                                   const Column_definition_attributes *attr,
+                                   uint32 flags) const override
+  {
+    return new (root)
+      Field_test_int8(*name, rec, attr->unireg_check,
+                      (uint32) attr->length,
+                      f_is_zerofill(attr->pack_flag) != 0,
+                      f_is_dec(attr->pack_flag) == 0);
+  }
+};

Why do you need two different methods? I'd expect one
that does new Field_test_int8() to be enough.

The second can be in the parent class, calling make_table_field().
Or both could be in the parent class, calling the only
virtual method that actually does new (root) Field_test_int8(...)

Historically we had two kinds of functions
(have a look into 10.0):

- To create from FRM data:

Field *make_field(TABLE_SHARE *share, uchar *ptr, uint32 field_length,
                  uchar *null_pos, uchar null_bit,
                  uint pack_flag,
                  enum_field_types field_type,
                  CHARSET_INFO *field_charset,
                  Field::geometry_type geom_type,
                  Field::utype unireg_check,
                  TYPELIB *interval,
                  const char *field_name)

- To create from Item:
create_tmp_field_from_item()
Field_new_decimal::create_from_item(item)

They use different input data formats for attributes.

These two interfaces have migrated to:

- make_table_field_from_def(), to create from FRM.
- make_table_field(), to create from Item.

I plan to add a new method in Item so it can return attributes
in the format suitable for make_table_field_from_def().

I planned to do it at some point later, in a separate patch.



+
+static Type_handler_test_int8 type_handler_test_int8;
+
+
+const Type_handler *Field_test_int8::type_handler() const
+{
+  return &type_handler_test_int8;
+}
+
+
+/*************************************************************************/
+
+static struct st_mariadb_data_type data_type_test_plugin=
+{
+  MariaDB_DATA_TYPE_INTERFACE_VERSION,
+  &type_handler_test_int8
+};

It's be more interesting to have a distinct type, not just an alias
for BIGINT. E.g. 7-byte integer.

As an example, it's a pretty empty plugin, it doesn't show
why this API was ever created. Add something non-trivial to it please.

And some real, non-test, plugin in a separate commit.

INET6 will fully demonstrate the whole API.


This patch demonstrates that the data type goes through this chain:

Parser -> write FRM -> open FRM



+
+
+maria_declare_plugin(type_geom)
+{
+  MariaDB_DATA_TYPE_PLUGIN,     // the plugin type (see include/mysql/plugin.h)
+  &data_type_test_plugin,       // pointer to type-specific plugin descriptor
+  "TEST_INT8",                  // plugin name
+  "MariaDB",                    // plugin author

MariaDB Corporation ?

+  "Data type TEST_INT8",        // the plugin description
+  PLUGIN_LICENSE_GPL,           // the plugin license (see include/mysql/plugin.h)
+  0,                            // Pointer to plugin initialization function
+  0,                            // Pointer to plugin deinitialization function
+  0x0100,                       // Numeric version 0xAABB means AA.BB veriosn
+  NULL,                         // Status variables
+  NULL,                         // System variables
+  "1.0",                        // String version representation
+  MariaDB_PLUGIN_MATURITY_ALPHA // Maturity (see include/mysql/plugin.h)*/

EXPERIMENTAL (for test plugins. ALPHA kind of implies it'll be BETA, GAMMA
and STABLE eventually)

Fixed.


+}
+maria_declare_plugin_end;
diff --git a/sql/sql_type.cc b/sql/sql_type.cc
index 5cecd9f50f7..988619cb5f9 100644
--- a/sql/sql_type.cc
+++ b/sql/sql_type.cc
@@ -121,8 +121,23 @@ bool Type_handler_data::init()
const Type_handler *
-Type_handler::handler_by_name(const LEX_CSTRING &name)
+Type_handler::handler_by_name(THD *thd, const LEX_CSTRING &name)
  {
+  plugin_ref plugin;
+  if ((plugin= my_plugin_lock_by_name(thd, &name, MariaDB_DATA_TYPE_PLUGIN)))
+  {
+    /*
+      INSTALL PLUGIN is not fully supported for data type plugins yet.

Why? What's not supported?

UNINSTALL PLUGIN does not check that the data type handler
is currently being used by parallel threads.
So for now some limitation is assumed, as in this comment:


+      Fow now we have only mandatory built-in plugins
+      and dynamic plugins for test purposes,
+      Should be safe to unlock the plugin immediately.
+    */
+    const Type_handler *ph= reinterpret_cast<st_mariadb_data_type*>
+                              (plugin_decl(plugin)->info)->type_handler;
+    plugin_unlock(thd, plugin);
+    return ph;
+  }
+
  #ifdef HAVE_SPATIAL
    const Type_handler *ha= type_collection_geometry.handler_by_name(name);
    if (ha)
diff --git a/sql/sql_type.h b/sql/sql_type.h
index 8f2d4d0c49d..b01330f30e4 100644
--- a/sql/sql_type.h
+++ b/sql/sql_type.h
@@ -3221,9 +3221,9 @@ class Information_schema_character_attributes
  class Type_handler
  {
  protected:
-  static const Name m_version_default;
-  static const Name m_version_mysql56;
-  static const Name m_version_mariadb53;
+  static const MYSQL_PLUGIN_IMPORT Name m_version_default;
+  static const MYSQL_PLUGIN_IMPORT Name m_version_mysql56;
+  static const MYSQL_PLUGIN_IMPORT Name m_version_mariadb53;


I fixed this in 10.5, so MYSQL_PLUGIN_IMPORT is not needed any more:

commit 74551b2b6fc6a3db367d012eb4a31453f8215d6f
commit 9b1866fd847390a45ad977391f8b2e2c8dab9f16


Why do you need that? Parent's behavior should be always fine for plugins.
I don't think plugins should know about this at all.

    String *print_item_value_csstr(THD *thd, Item *item, String *str) const;
    String *print_item_value_temporal(THD *thd, Item *item, String *str,
                                       const Name &type_name, String *buf) const;
@@ -5129,9 +5130,9 @@ class Type_handler_bool: public Type_handler_long
class Type_handler_longlong: public Type_handler_general_purpose_int
  {
-  static const Name m_name_longlong;
-  static const Type_limits_int m_limits_sint64;
-  static const Type_limits_int m_limits_uint64;
+  static const MYSQL_PLUGIN_IMPORT Name m_name_longlong;
+  static const MYSQL_PLUGIN_IMPORT Type_limits_int m_limits_sint64;
+  static const MYSQL_PLUGIN_IMPORT Type_limits_int m_limits_uint64;

same here

  public:
    virtual ~Type_handler_longlong() {}
    const Name name() const { return m_name_longlong; }
diff --git a/sql/table.cc b/sql/table.cc
index ea333cb2ecd..70b3814df6c 100644
--- a/sql/table.cc
+++ b/sql/table.cc
@@ -2291,12 +2291,20 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
if (field_data_type_info_array.count())
          {
+          const LEX_CSTRING &info= field_data_type_info_array.
+                                     element(i).type_info();
            DBUG_EXECUTE_IF("frm_data_type_info",
              push_warning_printf(thd, Sql_condition::WARN_LEVEL_NOTE,
              ER_UNKNOWN_ERROR, "DBUG: [%u] name='%s' type_info='%.*s'",
              i, share->fieldnames.type_names[i],
-            (uint) field_data_type_info_array.element(i).type_info().length,
-            field_data_type_info_array.element(i).type_info().str););
+            (uint) info.length, info.str););
+
+          if (info.length)
+          {
+            const Type_handler *h= Type_handler::handler_by_name(thd, info);
+            if (h)
+              handler= h;
+          }

I don't see where you handle the error that "unknown type"


Fixed to return an error. Added a test.



          }
        }


Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx



Follow ups

References