← Back to team overview

maria-discuss team mailing list archive

Re: Index bug in aria engine?

 

Hi Sergei, thanks for the reply...As you see here:http://screencast.com/t/rjexoWKwFF
http://screencast.com/t/qYdsmRbRbmG7
I'm issuing 2 same queries, still unique and non-unique index if filtering different number of rows (and unique is always worse). I was unable to reproduce the results when more data was added to table using the previous queries so instead od thinking what was exact data set the last time i just modified both queries :)This time i changed the query a little, but still im issuing 2 same queries on 2 same tables. The only difference is that once the same index is non-unique and once it's unique.The point is, depending on the queries used sometimes the unique index is behaving much worse than non-unique and sometimes the unique one isn't used at all. And that's reproducable behaviour. The unique index is always inferior to normal one. Im getting the same result of EXPLAIN every time i run it.About the indexes... Don't I need raw_stats_lookup_id in the first place so the index on raw_stats_other_copyiX_copy can be used in JOIN. And in spite of id being high cardinality field shouldn't it be put there first? The assumption is that this table will be always used in JOIN anyway.
I need as much flexible index as possible because tables will be very large and i don't want to introduce multiple indexes.http://screencast.com/t/nR8Zavjky
http://screencast.com/t/OjyvwyY8
>So, in all, the additional knowledge you mention is not useful for the optimizerWell, im not worried that the additional knowledge isn't used ... but that these unique indexes are (with additional knowledge) behaving much worse than NORMAL indexes. That are, in addition, much smaller. And i don't want to just have 2 same indexes, just one that is not unique.S.
Dnia 7 marca 2014 1:41 Pantelis Theodosiou <ypercube@xxxxxxxxx> napisał(a):
Hello pslawek83,  as Sergei has already told you, the issue is not the uniqueness of the indexes. Some notes after reading your queries and the tables: - I see two versions of the query, one having `attrib=5`, the other without.- on table raw_stats_lookup, you do not have an index on (domain, id) or even just on (domain).- on table raw_stats_other_copy, the 4-column index is not the best for this query. The date (and perhaps attrib) should be promoted at the first positions of the index. If you also include `impressions` on the index, the query will not have to read the table, all the data will be available from the index.So, an index on either (date, attrib, raw_stats_value_id, raw_stats_lookup_id, impressions)  or on (date, raw_stats_lookup_id, attrib, raw_stats_value_id, impressions)  would be much better than what you have now. So, in all, the additional knowledge you mention is not useful for the optimizer, as the index is not really appropriate for this query.  Pantelis 
On Thu, Mar 6, 2014 at 8:06 PM, pslawek83 <pslawek83@xxxxx> wrote:
Ok i wasn't able to reproduce the issue fully, now data is different ... but again, "skipped rows" number changes, and index sizes differ a lot:
Uniq index:
http://screencast.com/t/SrBS1L5xeB8
http://screencast.com/t/3dVMU7b69AH2
 Normal index:
http://screencast.com/t/Zl5Jm2OZ
http://screencast.com/t/P3SA8U6duly
But again, unique index is working much worse than not unique... while it's bigger
http://screencast.com/t/rjexoWKwFF
http://screencast.com/t/qYdsmRbRbmG7
I'll quickly explain how tables were created... i first took part of the data with unique index, and made smaller copy, thats table 1
Then i just copied the table changed the index to non-unique.
After this i copied the tables once again to be sure files are not fragmented, etc.
SHOW WARNINGS returns nothing in both cases. It seems that the additional knowledge (only single row with same data) should make the optimizer and query behave better but it's actually behaving much worse...
--
Query for reference:
EXPLAIN extended SELECT attrib_id, raw_stats_value_id, sum(impressions)FROM raw_stats_other_copyi1_copy INNER JOIN raw_stats_lookupON  raw_stats_lookup_id =  raw_stats_lookup.idWHERE domain = 'mydomain.com'AND date = '2014-02-10'GROUP BY attrib_id, raw_stats_value_id;
--
Thanks,
Slawomir.
Dnia 6 marca 2014 17:46 Sergei Golubchik <serg@xxxxxxxxxxx> napisał(a):
Hi, pslawek83!
On Feb 17, pslawek83 wrote:
> Hi Guys,
> any comments on this issue? It seems that partial unique indexes can't be used in joins.
> https://mariadb.atlassian.net/browse/MDEV-5663
Please, show EXPLAIN EXTENDED (and SHOW WARNINGS after it) for this
query of yours.
> * Why the server is treating same indexes differently depends on if
> they're UNIQUE or not
It doesn't, there're almost no differences between UNIQUE and non-unique
indexes from the optimizer point of view. Optimizer knows that for a
unique index there can be at most one matching row (if all index parts
are known and are NOT NULL), for non-unique indexes there can be more.
That's basically the only difference.
> * That's probably not optimizer issue, as we can't FORCE the index
You cannot force the index if it is not applicable at all.
> * What's internal difference between unique and non-unique index (eg.
> memory / file representation / data structure)
none.
> * What each index type is suitable for, considering query optimization
> (as there's no data i was able to find on topic)
Aria only supports BTREE indexes anyway.
Regards,
Sergei
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp

References