maria-developers team mailing list archive

Thread
Date

Re: Fwd: [Commits] Rev 3616: Fix for bug in file:///home/tsk/mprog/src/5.3-md641/

To: Timour Katchaounov <timour@xxxxxxxxxxxxxxxx>
From: Igor Babaev <igor@xxxxxxxxxxxx>
Date: Sun, 27 Jan 2013 22:58:51 -0800
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <51014780.6080800@montyprogram.com>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101026 SUSE/3.0.10 Thunderbird/3.0.10

Timour,

It's not clear for me:

1. why group by queries with GROUP BY col1, col2, where col1,col2 are
defined over the same table, are considered as possibly good for
for loose scan optimization.

2. how it comes that unique index is considered as a candidate for
a loose scan (see your test case).

(The second issue, of course, are unrelated to bug mdev-641).

Besides,  didn't you notice that you removes duplicates in group by
under any circumstances.

How it will affect the following query:
select a, sum(c) from t1 group by a, a with rollup; ?

In general, removing duplicates from GROUP BY and ORDER BY lists is a
good optimization, but why it should be done in connection with
this bug?

Regards,
Igor.


On 01/24/2013 06:38 AM, Timour Katchaounov wrote:
> Igor,
> 
> Could you please review the following patch (we discussed it at the team
> call):
> 
> ------------------------------------------------------------
> revno: 3616
> revision-id: timour@xxxxxxxxxxxx-20130124143526-bpaiyushble1ieek
> parent: timour@xxxxxxxxxxxx-20130117140805-4kyoq7azx4v2irhq
> fixes bug: https://mariadb.atlassian.net/browse/MDEV-614
> committer: timour@xxxxxxxxxxxx
> branch nick: 5.3-md641
> timestamp: Thu 2013-01-24 16:35:26 +0200
> message:
>   Fix for bug
>   MDEV-641 LP:1002108 - Wrong result (or crash) from a query with
> duplicated field in the group list and a limit clause
>   Bug#11761078: 53534: INCORRECT 'SELECT SQL_BIG_RESULT...'
>                 WITH GROUP BY ON DUPLICATED FIELDS
> 
>   Analysis
>   Range analysis discoveres that the query can be executed via loose
> index scan for GROUP BY.
>   Later, GROUP BY analysis fails to confirm that the GROUP operation can
> be computed via an
>   index because there is no logic to handle duplicate field references
> in the GROUP clause.
>   As a result the optimizer the optimizer produces an inconsistent plan.
> It constructs a
>   temporary table, but on the other hand the group fields are not set to
> point there.
> 
>   Solution:
>   Make order by analysis be in sync with loose scan analysis for group
> by. Instead of
>   changing the logic of how indexes are analyzed for order by, the patch
> takes the same
>   approach as MySQL - normalize the query by removing duplicate column
> references. The
>   difference from MySQL is that we remove duplicates earlier, by reusing
> remove_const()
>   which already removed unneeded ORDER elements.
> 
>