← Back to team overview

maria-developers team mailing list archive

Re: GROUP BY with non-grouped column returns NULL in some cases?

 

Hi,

On Thu, Feb 20, 2014 at 10:28:52AM -0800, Pavel Ivanov wrote:
> I've noticed a difference in MariaDB's behavior compared to its
> documentation and compared to MySQL 5.1 (didn't check with MySQL 5.5
> or 5.6 yet), and I wanted to know what's your opinion -- is it a bug
> or a conscious difference and documentation is just lagging.
> 
> This doc https://mariadb.com/kb/en/select/#group-by says: "If you
> select a non-grouped column or a value computed from a non-grouped
> column, it is undefined which row the returned value is taken from". I
> believe this implies that selected value must be returned from a row
> group, not just something random. 

I would agree with this reasoning.

MySQL 5.5, MariaDB 5.5, and MySQL 5.6 will all return (5,c,4,b).

MariaDB 10.0 returns (5,c,NULL,NULL). 

I have investigated where the NULL values come from.  They come from the next
value group. 

=== TL;DR: investigation result ==

Let's run this on 10.0 (I have added a WHERE to remove irrelevant rows):

mysql> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON
(t1.parent = t2.id and t2.name <> 'a') where t1.id > 4 GROUP BY t1.id;
+----+------+------+------+
| id | name | id   | name |
+----+------+------+------+
|  5 | c    | NULL | NULL |
| 10 | d    | NULL | NULL |
+----+------+------+------+

The query has GROUP BY, which is not optimized away (by either MySQL or
MariaDB). GROUP BY is satisified by using "range" access on t1, which produces
records in the order where groups follow one another.

First, end_send_group() sees the row (5,c,...). This is the first row, so it
starts a group.
Then, end_send_group() is invoked for row (10,d,...). It sees that the GROUP BY
expression value has changed, and tries to print the first group. The execution is here:

 #0  Field::is_null
 #1  Protocol_text::store 
 #2  Item_field::send 
 #3  Protocol::send_result_set_row 
 #4  select_send::send_data 
 #5  end_send_group 
 #6  evaluate_null_complemented_join_record 
 #7  sub_select 
 #8  evaluate_join_record 
 #9  sub_select 

The field object at #0 points to the "result field" of table t2  (a TABLE 
object has "current row" fields and "result fields". AFAIU, the latter are 
kind of buffers that are used e.g. for stashing away GROUP BY columns).

Unfortunately, Field::is_null() checks table->null_row, which is 1, because
row combination (10,d, ...) has a NULL-complemented row for t2.  This is 
why we get NULLs instead of some value from (5,c,...) group.

== Where does this come from ==

It seems, MySQL 5.6 had this bug in the past. We've got the code line where
the problem happens in 

revid: monty@xxxxxxxxxxxx-20130325220313-r7oxaau0x0r5ead1, 
  Temporary commit of 10.0-merge


Since then, MySQL 5.6 has fixed it, but their related commits look scary (fix
for another fix which doesn't fully fix it?): 

revid: sergey.glukhov@xxxxxxxxxx-20130619102408-0a0bf6sl5d7tsys
Bug#16620047 INCORRECT QUERY RESULT (AFTER SERVER UPGRADE)
  Regression is introduced in Bug11760517. According to the
  comment for the Bug11760517 fix, there is an additional cleanup
  in field.h(Field::is_null() function):
  Field::is_null() now checks table->null_row before checking
  the NULL bits.

  Order of NULLity checking is changed, firts it checks table->null_row
  and then null_ptr[row_offset]. In our case when we evaluate first
  result record, evaluate_null_complemented_join_record() function
  is called for outer table and it sets table->null_row to true.
  Then second result record field uses table->null_row to check
  if the field is null and result is NULL instead of real value.

  The fix is partial reversion of the field.h change
  (return original order of NULL check). Unfortunately this
  is not sufficient. A lot of test fails in the main suite
  after reversion. It happens because of another additional
  cleanup introduced Bug11760517 fix(see the comment):
     @ sql/table.h
       mark_as_null_row() no longer sets the NULL bits of the buffer
        
  Reversion of this change fixes all the problems except one, test case
  for Bug13541761 which is actually another regression of Bug11760517 fix.
  The fix of this problem is to not touch TABLE::null_flags during
  JOIN::cleanup.

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB | Skype: sergefp | Blog: http://s.petrunia.net/blog




Follow ups

References