← Back to team overview

maria-developers team mailing list archive

Re: GROUP BY with non-grouped column returns NULL in some cases?

 

Sounds like a JIRA issue is in order here. I've created one:
https://mariadb.atlassian.net/browse/MDEV-5719.

On Fri, Feb 21, 2014 at 7:34 PM, Sergey Petrunia <sergey@xxxxxxxxxxx> wrote:
> Hi,
>
> On Thu, Feb 20, 2014 at 10:28:52AM -0800, Pavel Ivanov wrote:
>> I've noticed a difference in MariaDB's behavior compared to its
>> documentation and compared to MySQL 5.1 (didn't check with MySQL 5.5
>> or 5.6 yet), and I wanted to know what's your opinion -- is it a bug
>> or a conscious difference and documentation is just lagging.
>>
>> This doc https://mariadb.com/kb/en/select/#group-by says: "If you
>> select a non-grouped column or a value computed from a non-grouped
>> column, it is undefined which row the returned value is taken from". I
>> believe this implies that selected value must be returned from a row
>> group, not just something random.
>
> I would agree with this reasoning.
>
> MySQL 5.5, MariaDB 5.5, and MySQL 5.6 will all return (5,c,4,b).
>
> MariaDB 10.0 returns (5,c,NULL,NULL).
>
> I have investigated where the NULL values come from.  They come from the next
> value group.
>
> === TL;DR: investigation result ==
>
> Let's run this on 10.0 (I have added a WHERE to remove irrelevant rows):
>
> mysql> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON
> (t1.parent = t2.id and t2.name <> 'a') where t1.id > 4 GROUP BY t1.id;
> +----+------+------+------+
> | id | name | id   | name |
> +----+------+------+------+
> |  5 | c    | NULL | NULL |
> | 10 | d    | NULL | NULL |
> +----+------+------+------+
>
> The query has GROUP BY, which is not optimized away (by either MySQL or
> MariaDB). GROUP BY is satisified by using "range" access on t1, which produces
> records in the order where groups follow one another.
>
> First, end_send_group() sees the row (5,c,...). This is the first row, so it
> starts a group.
> Then, end_send_group() is invoked for row (10,d,...). It sees that the GROUP BY
> expression value has changed, and tries to print the first group. The execution is here:
>
>  #0  Field::is_null
>  #1  Protocol_text::store
>  #2  Item_field::send
>  #3  Protocol::send_result_set_row
>  #4  select_send::send_data
>  #5  end_send_group
>  #6  evaluate_null_complemented_join_record
>  #7  sub_select
>  #8  evaluate_join_record
>  #9  sub_select
>
> The field object at #0 points to the "result field" of table t2  (a TABLE
> object has "current row" fields and "result fields". AFAIU, the latter are
> kind of buffers that are used e.g. for stashing away GROUP BY columns).
>
> Unfortunately, Field::is_null() checks table->null_row, which is 1, because
> row combination (10,d, ...) has a NULL-complemented row for t2.  This is
> why we get NULLs instead of some value from (5,c,...) group.
>
> == Where does this come from ==
>
> It seems, MySQL 5.6 had this bug in the past. We've got the code line where
> the problem happens in
>
> revid: monty@xxxxxxxxxxxx-20130325220313-r7oxaau0x0r5ead1,
>   Temporary commit of 10.0-merge
>
>
> Since then, MySQL 5.6 has fixed it, but their related commits look scary (fix
> for another fix which doesn't fully fix it?):
>
> revid: sergey.glukhov@xxxxxxxxxx-20130619102408-0a0bf6sl5d7tsys
> Bug#16620047 INCORRECT QUERY RESULT (AFTER SERVER UPGRADE)
>   Regression is introduced in Bug11760517. According to the
>   comment for the Bug11760517 fix, there is an additional cleanup
>   in field.h(Field::is_null() function):
>   Field::is_null() now checks table->null_row before checking
>   the NULL bits.
>
>   Order of NULLity checking is changed, firts it checks table->null_row
>   and then null_ptr[row_offset]. In our case when we evaluate first
>   result record, evaluate_null_complemented_join_record() function
>   is called for outer table and it sets table->null_row to true.
>   Then second result record field uses table->null_row to check
>   if the field is null and result is NULL instead of real value.
>
>   The fix is partial reversion of the field.h change
>   (return original order of NULL check). Unfortunately this
>   is not sufficient. A lot of test fails in the main suite
>   after reversion. It happens because of another additional
>   cleanup introduced Bug11760517 fix(see the comment):
>      @ sql/table.h
>        mark_as_null_row() no longer sets the NULL bits of the buffer
>
>   Reversion of this change fixes all the problems except one, test case
>   for Bug13541761 which is actually another regression of Bug11760517 fix.
>   The fix of this problem is to not touch TABLE::null_flags during
>   JOIN::cleanup.
>
> BR
>  Sergei
> --
> Sergei Petrunia, Software Developer
> MariaDB | Skype: sergefp | Blog: http://s.petrunia.net/blog
>
>


References