← Back to team overview

maria-developers team mailing list archive

Re: A proposal to deprecate syntax: SELECT a'test'


Hi Monty,

On 02/25/2016 12:26 PM, Michael Widenius wrote:

   Hello Monty, all,

We discussed this with Sergei and both think that it will be a good idea
to deprecate this syntax:

SELECT a'test';

where 'a' is an identifier and 'test' its alias.

Is this dependent on if there is a space or not between a and 'test' ?

According to the standard, the space does not matter.
So I don't think we should make it depend on the space.

Note that in MySQL and MariaDB, ' and " are both used for quoting
strings and I don't think we can change that without affecting a lot
of applications!

I think it's ok to require one to have a space between a and 'test',
if that would help at all.

This is a non-standard way, and it conflicts with some other important
standard SQL grammar. See below.

Note that in MySQL/MariaDB, when not using ANSI mode, ' and " are
equal and ` is used for identifiers, so this is expected to be
non-standard. For quoting identifiers ` should be used.

Of course, if you are using ANSI mode, then things should be as close
to ANSI as possible.

Typified literals and a syntax conflict

An identifier followed by a single-quoted text string is needed
for typified literals:

SELECT INET6'ffff::ffff';

We already support this for temporal literals, which is a part of
the SQL standard:

SELECT TIME'10:10:10';
SELECT TIMESTAMP'2001-01-01 10:20:30';
SELECT DATE'2001-01-01';

Note, the conflict already exists in MySQL and MariaDB.

This script:

CREATE TABLE t1 (date date);
INSERT INTO t1 VALUES ('2016-02-19');
SELECT date'2001-01-01' FROM t1;


| date'2001-01-01' |
| 2001-01-01       |

Notice, it returns a DATE literal, it does not return the value of
the column t1.date using '2001-01-01' as an alias.

As ansi sql doesn't allow you to use date as a column name, this isn't
really as big problem as it may seam ;)

It does allow. This is a perfectly good standard SQL query:

SELECT a "Any text is Ok here" FROM t1;

So what's inside the quotes is not really important.

Anyway, I agree that this can be confusing, but so is having date as a
column namn.

Notice the difference:

SELECT date`2001-01-01` FROM t1;
| 2001-01-01 |
| 2016-02-19 |

This is confusing.

Not really, as you are using different quotes and you get the expected answer.
This should be same as if you used " in ANSI SQL, isn't it?

Yes, in the standard SQL, this query:

SELECT date'2001-01-01' FROM t1;

should return an equivalent for:

SELECT CAST('2001-01-01' AS DATE) FROM t1;

while this query:

SELECT date"2001-01-01" FROM t1;

should return:

SELECT t1.date AS "2001-01-01" FROM t1;

Shift-reduce conflicts
By the way, the fact that we support single-quoted string as an alias
is the reason for shift-reduce conflicts in sql_yacc.yy, because
TIME/DATE/TIMESTAMP followed by 'string' can be interpreted in two ways:
- a column name followed by an alias
- an SQL-standard time literal

What is the SQL-standard way to use an alias?
(Sorry, can't access the standard just now):

SELECT date "2001-01-01" FROM t1;
SELECT date '2001-01-01' FROM t1;

And if you execute both of these queries, will you not get the same
confusing result for one of them?

As ' and " are identical in MySQL mode, I don't see how you can fix
this without causing even more confusion.

I only propose to change the behavior for the SELECT expression aliases,
not to all strings generally.

What can be used instead
We have a number of other ways to specify aliases:

The standard ways:

SELECT a AS test;
SELECT a "test";
SELECT a AS "test";

Ok, this answers my question above. The standard says that an alias is
a text string.

Sorry, no. The standard says that an alias is an identifier.
Text strings are not valid aliases.

Here is the standard SQL grammar:

<derived column>    ::=   <value expression> [ <as clause> ]
<as clause>    ::=   [ AS ] <column name>
<column name>    ::=   <identifier>
<identifier>    ::=   <actual identifier>
<actual identifier> ::=  <regular identifier> | <delimited identifier>

where <regular identifier> is a "normal", not quoted identifier, and
<delimited identifier> is a double-quoted identifier.

In MySQL / MariaDB a text string can be specified with
both ' or ".
I don't think it's good idea to make alias a special case where you
have to specify
the text string only with ".

I only propose to deprecate string literals (i.e. single quoted strings)
as SELECT expression aliases, because this is a non-standard and
harmful  extension, which conflicts with another standard syntax assumed
for typified literals.

I don't propose to generally change the meaning of double-quoted
strings, so these examples will not change in any ways:

SELECT "xxx" FROM t1;

MySQL/MariaDB extensions:

SELECT a `test`;
SELECT a AS `test`;

That should be enough.

Note that this has nothing exclusively to do with alias, but with the
MySQL/MariaDB extension that you can use ' instead of " as a string

My proposal is exclusively about aliases, not about using of " as a
string delimiter.

1. In 10.1 we add a warning when a single-quoted string is used as a
column alias. Something like this should work:

Single quotes in a select sublist alias are deprecated. Use double
quotes instead.

This would break the specification of what a text strings is in

No, this will only change  what a SELECT expression alias is.

It would also be very confusing that text strings needs to be
specified differently in different places of the SQL syntax.

You're reordering causes and consequences :)

Again, this is not about text strings. It's about
what can be used as an alias.

I think it will be very obvious:

- Expressions support single-quoted and double-quoted strings

- Aliases support regular identifiers, delimited (i.e. double-quoted)
  identifiers and backtick-quoted identifiers.

Expressions and aliases are instances of very different nature.

Expression is a value that has a data type and attributes.
Alias is an identifier, nothing else.

There is nothing common in them at all.
It's quite obvious that they *can* have different syntactic rules.

2. In 10.2 we disallow this syntax by default and add either sql_mode or
old_mode to enable the old behavior.

As you can already get the behavior you want by specifying ANSI MODE,

Supporting typified literals in ANSI MODE only is not a way to go.
We should support them in all modes.

I don't think this change is necessary. I think it's also bad as it
makes it harder to define what a text string is.

This is not about to define what a text string is.
It's the other way around,
to define what are valid expressions and valid aliases.

For example:
In MariaDB, a text string can be delimited with ' or ", except in the
case of alias when you have to use '

Sorry, not sure that I understood. Can you clarify about aliases and '

I think it's better to require, as most ANSI SQL databases does, one
to quote column names like 'date' and 'time' to ensure they are not
mixed with operators or string prefixes.

I am afraid this is not really true.

Both date and time are perfectly valid regular identifiers.
Requiring to put them in quotes will be non-Standard.

I just checked this script:

INSERT INTO t1 VALUES('2001-01-01');
SELECT date FROM t1;

It works perfectly well in SQL Server, PostgreSQL and SQLite3.

It will be pity if we won't be able to support this syntax:

SELECT INET6'ffff::ffff';

unless we deprecate this useless extension that allows string
literals as aliases (i.e. single quoted string literals).

Note, PostgreSQL does support typified literals very well:

bar=> SELECT INET'00f::000f';

bar=> SELECT BIGINT'00001';


Follow ups