← Back to team overview

maria-developers team mailing list archive

Re: Review: [MDEV-7062] parser refactoring: don't store field properties in LEX


Hi Sergei,

On 11/16/2014 02:30 AM, Sergei Golubchik wrote:
Hi, Alexander!

On Nov 13, Alexander Barkov wrote:

[Commits] 2c904b9: parser cleanup: don't store field properties in LEX, use Create_field directly
serg at mariadb.org serg at mariadb.org
Mon Nov 10 11:26:48 EET 2014

     Previous message: [Commits] Rev 4344: MDEV-6179: dynamic columns functions/cast()/convert() doesn't play nice with CREATE/ALTER TABLE in lp:~maria-captains/maria/5.5
     Next message: [Commits] Rev 4345: MDEV-6862 "#error <my_config.h>" and third-party libraries in lp:~maria-captains/maria/5.5
     Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

revision-id: 2c904b937b5ef7fbec60ebb9cb514a501c5bd8e4
parent(s): f59588b9112965849986c45fcd150d9abed74063
committer: Sergei Golubchik
branch nick: maria
timestamp: 2014-11-09 22:33:17 +0100

parser cleanup: don't store field properties in LEX, use Create_field directly

length/dec/charset are still in LEX, because they're also used
for CAST and dynamic columns.

1. fix "MDEV-7041 COLLATION(CAST('a' AS CHAR BINARY)) returns a wrong result"
2. allow BINARY modifier in stored function RETURN clause
3. allow "COLLATION without CHARSET" in SP/SF (parameters, RETURN, DECLARE)
4. print correct variable name in error messages for stored routine parameters

A very nice clean up.

I only found it a little bit not easy to track that
all Lex and Create_field attributes get initialized and copied to each
other properly.

I suggest to go a little bit further and get rid from Lex->dec,
Lex->length and Lex->charset at all, and write the attributes directly
to the destination as soon as they get parsed. See a proposed addon
patch attached.

I don't like very much all that LEX::last_field thing, but it existed
already and I reused it. But I'd rather not introduce more
LEX::last_attr or LEX::last_whatever members unless absolutely
necessary. I generally prefer to pass data via the parser stack.

If we can move this to the %union in sql_yacc.yy, then it will be even better.

That's what I did in fixing this dyncol bug. As I've found in this
cleanup, and as you've seen in review, dynamic columns, CAST,
and field declarations all use LEX::length/etc - so using CAST or
dyncols in a virtual column will overwrite field's declaration.
Recently Kolbe found this as well and reported a bug for 5.5, MDEV-6179.
I've fixed it by creating LEX_TYPE structure - I didn't remove
length/etc from LEX in 5.5, but if this patch will be merged in 10.1, it
could easily allow to remove them from LEX.

So, I would rather postpone any further LEX changes until 10.0 merge.
Then I can show you the next cleanup patch. In fact, if I'd have checked
Jira bugs before doing this cleanup, I would've started from 5.5 bugfix
and then would've done the rest on top of it. But now I'm facing an
"interesting" merge ahead :)

Will wait for the next patch.

Can we take in account the following thing while working on this:

I think that to implement the data type plugin task properly,
Create_field should end up in a class hierarchy, similar to
what we have in Field.

The hierarchy will have one class per group of similar types,
grouped by having a particular data type attribute,
approximately like this:

|-Create_field_num             (for tinyint,smallint,int,bigint)
|-Create_field_num_with scale  (for float, double, decimal)
|-Create_field_str             (for char and varchar)
|-Create_field_with_scale      (for temporal data types)

Create_field will have a set of virtual methods:

bool Create_field::set_length(uint32 length);
bool Create_field::set_precision_and_scale(uint32 precision, uint32 scale);

and so on.

Create_field will also have:

Field *Create_field::make_field();

So all data specific attributes (like intervals for enum, packlen for blob, temporal precision, etc) will be hidden inside a particular
Field_xxx and its corresponding Create_field_xxx.

Field_xxx and it corresponding Create_field_xxx should share the same
structures to store the same attributes. And it would be nice to
encapsulate attributes into methods.

uint32 field->field_length() const;

instead of

uint32 field->field_length;

(LEX_TYPE in 5.5 is basically your Create_type_attributes_st without
m_length_was_set_explicitly - as it's not needed in 5.5 - and with
enum_field_types type).

diff --git a/mysql-test/r/cast.result b/mysql-test/r/cast.result
index c81af13..8ad4568 100644
--- a/mysql-test/r/cast.result
+++ b/mysql-test/r/cast.result
@@ -796,3 +796,32 @@ DATE("foo")
  Warning	1292	Incorrect datetime value: 'foo'
+create table t1 (a int, b char(5) as (cast("a" as char(10) binary) + a) );
+show create table t1;
+Table	Create Table
+t1	CREATE TABLE `t1` (
+  `a` int(11) DEFAULT NULL,
+  `b` char(5) AS (cast("a" as char(10) binary) + a) VIRTUAL
+drop table t1;
+select collation(cast("a" as char(10) binary));
+collation(cast("a" as char(10) binary))
+select collation(cast("a" as char(10) charset utf8 binary));
+collation(cast("a" as char(10) charset utf8 binary))
+select collation(cast("a" as char(10) ascii binary));
+collation(cast("a" as char(10) ascii binary))
+select collation(cast("a" as char(10) unicode binary));
+collation(cast("a" as char(10) unicode binary))
  > +ucs2_bin

UNICODE is an alias for UCS2, which is not always compiled.
Please move ucs2 related tests to ctype_ucs.test.


You removed this code in field.cc:

-  if (fld_length != NULL)
-  {
-    errno= 0;
-    length= strtoul(fld_length, NULL, 10);
-    if ((errno != 0) || (length > MAX_FIELD_BLOBLENGTH))
-    {
-    }
-    if (length == 0)
-      fld_length= NULL; /* purecov: inspected */
-  }

and added this in sql_yacc.yy:

+  if (length)
+  {
+    int err;
+    last_field->length= (ulong)my_strtoll10(length, NULL, &err);
+    if (err)
+      last_field->length= ~0UL; // safety
+  }
+  else
+    last_field->length= 0;

I have a feeling that on a 32bit platforms it will stop sending errors
on attempt to create a blob larger that 4Gb.
last_field->length is ulong, which is 32bit is a 32 bit platform,
and it gets assigned to a 64 bit value.

Indeed, good catch.
I'll fix it.

Okey to push, with the tests and with strtoul/my_strtoll10 problem fixed.