← Back to team overview

maria-discuss team mailing list archive

Dynamically rewrite query

 

Hello list,

I'm administering a MariaDB (10.0.23) backend used by some proprietary software. I'm having trouble with one specific slow query. The tables look like this (simplified for clarity):

CREATE TABLE `t1` (
  `i` int(11) NOT NULL,
  `j` int(11) NOT NULL,
  `l` int(11) NOT NULL,
  `m` int(11) DEFAULT NULL,
  PRIMARY KEY (`i`,`j`,`l`)
);

CREATE TABLE `t2` (
  `i` int(11) NOT NULL,
  `k` int(11) DEFAULT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The problematic query is this:
   SELECT DISTINCT t2.i, t2.k, t1.j, t1.l, t1.m FROM t2 LEFT JOIN t1 ON t2.i=t1.i WHERE t2.k < 123;


The query is slow because the DISTINCT clause forces the query to use a temporary table. Removing DISTINCT makes the query 100x faster.

Now, since (t1.i, t1.j, t1.l, t1.m) are guaranteed to be unique (and also (t2.i, t2.k) on their own, if the left join doesn't match), I cannot see how "SELECT ..." and "SELECT DISTINCT ..." can possibly produce different output. So I mention that to the developers of said software, and they reply that they forgot why they added the DISTINCT but that they decided it's too risky to remove it (thereby admitting that they don't comment their code, and don't write unit tests for bugs they find, but I knew that already).

I'm aware of "Garbage in, garbage out", but it could also be argued that MariaDB should be smart enough to detect that the DISTINCT is unnecessary and ignore it (should I create a bug report for that?).

In the meantime, is there any way I can rewrite queries like this dynamically to remove the DISTINCT clause? MySQL 5.7 documentation mentions query rewrite plugins, but I don't see that mentioned in MariaDB documentation. Any other suggestions are welcome.

Thanks,
Erik

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail


Follow ups