← Back to team overview

maria-developers team mailing list archive

GROUP BY with non-grouped column returns NULL in some cases?

 

Hi,

I've noticed a difference in MariaDB's behavior compared to its
documentation and compared to MySQL 5.1 (didn't check with MySQL 5.5
or 5.6 yet), and I wanted to know what's your opinion -- is it a bug
or a conscious difference and documentation is just lagging.

This doc https://mariadb.com/kb/en/select/#group-by says: "If you
select a non-grouped column or a value computed from a non-grouped
column, it is undefined which row the returned value is taken from". I
believe this implies that selected value must be returned from a row
group, not just something random. But I have a scenario when NULL is
returned instead of real values. I understand that this is completely
undefined behavior in SQL standard, but MySQL and MariaDB seem to have
a documented support for this, that's why I'm asking.

Here's the mysql session log:

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.0.8-MariaDB-log Source distribution

Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> create database test;
Query OK, 1 row affected (0.00 sec)

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> CREATE TABLE t (id bigint(20), name varchar(30),
parent bigint(20), primary key (id)) ENGINE=InnoDB;
Query OK, 0 rows affected (0.27 sec)

MariaDB [test]> INSERT INTO t values (1, 'a', NULL), (4, 'b', 1), (5,
'c', 4), (10, 'd', 1);
Query OK, 4 rows affected (0.04 sec)
Records: 4  Duplicates: 0  Warnings: 0

MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT
OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a') GROUP BY
t1.id;
+----+------+------+------+
| id | name | id   | name |
+----+------+------+------+
|  1 | a    | NULL | NULL |
|  4 | b    | NULL | NULL |
|  5 | c    | NULL | NULL |
| 10 | d    | NULL | NULL |
+----+------+------+------+
4 rows in set (0.00 sec)

MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT
OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a');
+----+------+------+------+
| id | name | id   | name |
+----+------+------+------+
|  1 | a    | NULL | NULL |
|  4 | b    | NULL | NULL |
|  5 | c    |    4 | b    |
| 10 | d    | NULL | NULL |
+----+------+------+------+
4 rows in set (0.00 sec)

MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT
OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a') GROUP BY
t1.id, t1.name;
+----+------+------+------+
| id | name | id   | name |
+----+------+------+------+
|  1 | a    | NULL | NULL |
|  4 | b    | NULL | NULL |
|  5 | c    |    4 | b    |
| 10 | d    | NULL | NULL |
+----+------+------+------+
4 rows in set (0.00 sec)

MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT
OUTER JOIN t t2 ON (t1.parent = t2.id) GROUP BY t1.id;
+----+------+------+------+
| id | name | id   | name |
+----+------+------+------+
|  1 | a    | NULL | NULL |
|  4 | b    |    1 | a    |
|  5 | c    |    4 | b    |
| 10 | d    |    1 | a    |
+----+------+------+------+
4 rows in set (0.00 sec)


As you see the first query returns NULLs instead of real values which
can be seen in second query without GROUP BY. And if I add another
column to GROUP BY or remove one condition from JOIN I can see real
values again. I've also tested with the table without PRIMARY KEY and
it also returned real values in the first query.

So is this a bug?


Thank you,
Pavel


Follow ups