← Back to team overview

maria-developers team mailing list archive

Order by speed optimization


I was talking with a few people about a query such as

select * from table order by X, Y, Z limit G;
As the "ORDER BY" for me could change to any number of 12 values in any
order I am not able to use a multicolumn index as such I was looking at how
to optimize in such case where sorting is dynamic.

On my dataset the query takes ~7 seconds for a full table scan but only .3
seconds when using the column index and returns ~80,000 records for the
query (before the LIMIT)

I was wondering if there is any reason that the order by syntax doesn't use
the following logic, I was thinking of looking at the code but was told
such a simple optimization must have a reason that its not already in the
code, so I am asking here before I spend hours being stupid.

This logic block will be based on the above query.

   1. If index exists for column X and there is a limit on the query then
   use the index and sort the list, then return the first G values, continue
   past G until you hit a different value for X then the value that was at G
   2. Use this smaller subset of data for then ordering the data based on Y
   3. Order the data again based on Z
   4. Return the sorted data like normal

If there is a reason this optimization isn't in the code please someone let
me know, if not I think I will take a look at adding it.


Follow ups