← Back to team overview

drizzle-discuss team mailing list archive

Re: Delayed join improvement

 

I haven't tried it in drizzle yet, these results are from mysql. I point it out to you guys because it could be I assume they'll come out similarly. I'll try to get a full dump into drizzle in the next few days.


Slow:


mysql> EXPLAIN SELECT * FROM user_table WHERE uid < 1000000 ORDER BY id DESC LIMIT 0,12\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: user_table
         type: range
possible_keys: uid
          key: uid
      key_len: 8
          ref: NULL
         rows: 48988
        Extra: Using where; Using filesort
1 row in set (0.00 sec)




Fast:

mysql> EXPLAIN SELECT * FROM user_table INNER JOIN (        SELECT id FROM user_table WHERE uid < 1000000 ORDER BY id DESC LIMIT 0,12 ) as SUBSELECT_TABLE ON user_table.id = SUBSELECT_TABLE.id\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 12
        Extra: 
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: user_table
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: SUBSELECT_TABLE.id
         rows: 1
        Extra: 
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: user_table
         type: index
possible_keys: uid
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 48985
        Extra: 
3 rows in set (0.04 sec)




Aggregation functions are especially ripe for optimization. Here's another (non-limit) ugly query a developer had written which did very well when changing SELECT * to SELECT id + inner join:



Slow:


mysql> EXPLAIN SELECT `tbl`.* FROM (SELECT `user_table`.* FROM `user_table` WHERE (uid IN (1022220528,516975293,610868834,1369254286,678430330,760683910,1101602962,840005,1539880648,536801853,533871879,1505692116,724332436,31803759,506549838,1507890014,593486534,1052379946,503592215,194600927,167800172,22406819,9433878,2716675,1120200072,517782135,729065959,620450950,32800651,672604205,588236068,33606106,1404468345,12633743,200800506,1413614990,585172327,1162999061,1066765095,1028653783,544385233,662588027,1137254943,1267449968,1161740972,760405590,1545259569,691569675,615948798,1189330419,595498800,546053510,1563824045,1160480409,162801352,1232341358,829698629,1486335885,1058605840,1039020025,701572979,513147932,614621086,1292596475,731515619,1058231753,1120200213,646377270,71701242,502763017,1173077531,710880495,85600035,193302738,11603373,1094790282,501483798,733680612,1841388912,1111793151,572545213,645570291,100000437158365,596083147,587246958,724405544,641236986,730519392,1168848663,146900336,1062266947,518095800,596610029,502844349,501817153,100000257743929,100000558823049,100000379524868,25833337,611837485,621991585,9318548,1472361327,1787625275,788507322,100000177520280,146900181,113500364,1379993602,766300262)) AND (`affiliations` IS NOT NULL) AND (`affiliations` <> "[]" ) ORDER BY `id` DESC) AS `tbl` GROUP BY `uid`\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1635
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: user_table
         type: ALL
possible_keys: uid
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2900
        Extra: Using filesort
2 rows in set (0.23 sec)


This query takes about 23 seconds. Compare it to this query, which takes _0.23 seconds_:


Fast:


mysql> EXPLAIN SELECT * FROM user_table INNER JOIN (         SELECT MAX(id) as id FROM `user_table` WHERE (uid IN (1022220528,516975293,610868834,1369254286,678430330,760683910,1101602962,840005,1539880648,536801853,533871879,1505692116,724332436,31803759,506549838,1507890014,593486534,1052379946,503592215,194600927, 167800172,22406819, 9433878,2716675,1120200072, 517782135,729065959,620450950,32800651,672604205,588236068, 33606106,1404468345,12633743,200800506,1413614990,585172327,1162999061,1066765095,1028653783,544385233,662588027,1137254943,1267449968,1161740972, 760405590,1545259569, 691569675,615948798, 1189330419,595498800,546053510,1563824045,1160480409, 162801352,1232341358,829698629,1486335885,1058605840,1039020025, 701572979,513147932,614621086,1292596475,731515619,1058231753, 1120200213,646377270, 71701242,502763017,1173077531,710880495, 85600035,193302738,11603373,1094790282, 501483798,733680612, 1841388912,1111793151,572545213, 645570291,100000437158365,596083147,587246958,724405544, 641236986,730519392,1168848663,146900336,1062266947,518095800, 596610029,502844349,501817153, 100000257743929,100000558823049, 100000379524868,25833337,611837485,621991585, 9318548, 1472361327,1787625275,788507322,100000177520280,146900181,113500364,1379993602,766300262)) AND (`affiliations` IS NOT NULL) AND (`affiliations` != "[]" ) GROUP BY uid  )  as SUBSELECT_TABLE ON user_table.id = SUBSELECT_TABLE.id\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 87
        Extra: 
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: user_table
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: SUBSELECT_TABLE.id
         rows: 1
        Extra: 
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: user_table
         type: range
possible_keys: uid
          key: uid
      key_len: 8
          ref: NULL
         rows: 2900
        Extra: Using where
3 rows in set (0.02 sec)



I've even seen "optimized away" on a few of the explains before. I'm having a hard time finding the query that benefited that much from it. If I do, I'll post it up here.


-Wilfried



On Apr 17, 2010, at 13:51 , Jay Pipes wrote:

> Hi!  Interesting.  Can you post the EXPLAIN SELECT for each of the
> statements?  Also, what version of Drizzle are you using?
> 
> Thanks!
> 
> jay
> 
> On Sat, Apr 17, 2010 at 1:50 PM, Wilfried Schobeiri <ws@xxxxxxxxxxxxxxx> wrote:
>> Hi all! Good hanging out with those of you at MySQLConf & Drizzle Day this past week. Hope you guys got home safe.
>> 
>> I give you a sample query. id is primary key, uid is indexed. The table has about 20 non-indexed columns. on top of taht.
>> 
>> SELECT * FROM user_facebook WHERE uid < 1000000 ORDER BY id DESC LIMIT 0,12
>> 
>> (2 min 24.54 sec)
>> 
>> vs
>> 
>> SELECT * FROM user_facebook INNER JOIN
>> (
>>        SELECT id FROM user_facebook WHERE uid < 1000000 ORDER BY id DESC LIMIT 0,12
>> )
>> as SUBSELECT_TABLE
>> ON user_facebook.id = SUBSELECT_TABLE.id
>> 
>> (1.95 sec)
>> 
>> 
>> Same results, substantial performance boost. Worth noting is that the more complicated or large the where set is (especially if there's an IN() set), or if there's any aggregation going on, the delayed join version gets almost exponentially faster. One could see 10-1000x increases in query time.
>> 
>> I'm not sure what's going on with the first query (or if we can do anything about it), but it seems like it's trying to do the order and limiting on the entire column set instead of using only what's applicable to do the ordering and limiting. If we can optimize around the primary key, perhaps there can be some massive performance gains out of it.
>> 
>> -Wilfried (nphase)
>> 
>> 
>> _______________________________________________
>> Mailing list: https://launchpad.net/~drizzle-discuss
>> Post to     : drizzle-discuss@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~drizzle-discuss
>> More help   : https://help.launchpad.net/ListHelp
>> 




References