maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #12807
Re: Accessing read records during join for condition pushdown
-
To:
Sergey Petrunia <sergey@xxxxxxxxxxx>
-
From:
"Berrocal, Eduardo" <eduardo.berrocal@xxxxxxxxx>
-
Date:
Thu, 22 Jul 2021 00:28:09 +0000
-
Accept-language:
en-US
-
Arc-authentication-results:
i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none
-
Arc-message-signature:
i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=DG68420JqPluqQmkPlJ097/hONIyvuQ0o3id5h2IBdw=; b=CDzXsS6uEvcdWgX4xhW39kPbQ/LnLHhSCwPpA9FgbM5AZMoAlFmiHMDwVVKResNQzXQ6U0QGzgoPg6Uf20HAurtVCRU9zqtx2ofAmVmKo67O/201WY7pOG24lP53xXxtIJ97mY2aErTC8a9P5/HXfRfTqGSTt7RkxKHvyK5sDY1CAdkTwxI/KazhWBox8BZx+VgkvjwkLFI9Hl6+IV7wctDtdLXreavt19deHysdpSAN9wK1AW1Y8TqIdKN0DGkNzzN1XeNGzHJxGKG8C29HmOs0X7ck9pFYbzZko2DvYbkQ6mCt1OHr5W3uy77PsZgKzKpnCUXeoC5TglN1zGSYaA==
-
Arc-seal:
i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=G8YiR6t9yGVMrOXMoGCBvFtB726nR+663UzR/Wk7YtLd2+HCVNE+k8+M0OiLrveJf80i3XfxRbEoqi4JFrpvBndW5PGAf2Npmli+nFx5s+iLf72G/v2pVgkGO2O5CDHZ++z2RHXcN1Xd3+rvm3wJipy7Uv8vHvT2a7m+WOrZGVvTPEREaq72Q8KholNUeo8Ltg7BVfw6LsmNgu3lFlQjHaVc2044Hc0JvFt/yJ9KcFEqXi5w/n5Tt9SqNO9XtTAqtMhV8wx2VcaM8jwSHd1cNWX/AEqSerSN4RYIka9CqspNkOYvTeCDekMz/kagPML06H+ngtHFxn8cU27nPolGTw==
-
Authentication-results:
mariadb.com; dkim=none (message not signed) header.d=none;mariadb.com; dmarc=none action=none header.from=intel.com;
-
Cc:
"maria-developers@xxxxxxxxxxxxxxxxxxx" <maria-developers@xxxxxxxxxxxxxxxxxxx>
-
Dlp-product:
dlpe-windows
-
Dlp-reaction:
no-action
-
Dlp-version:
11.5.1.3
-
In-reply-to:
<20210714092548.GB19790@pslp4>
-
Thread-index:
Addyu53/Di069vVzQH+2/nbHVe69JABEUwuAABbMdoAAtsX4gAAcMSkwAEeQmQABf37j0A==
-
Thread-topic:
[Maria-developers] Accessing read records during join for condition pushdown
Hello Sergei,
Thanks a lot for your response. The way you showed seems to be the way to go.
There is just one problem: all those functions are protected inside the JOIN_CACHE (and JOIN_CACHE_*)
Classes. Which seems to suggest that code changes to the MariaDB core code is a necessity in this case.
Cheers,
Eduardo Berrocal García de Carellán
Senior Software Engineer
Intel Corporation | intel.com
-----Original Message-----
From: Sergey Petrunia <sergey@xxxxxxxxxxx>
Sent: Wednesday, July 14, 2021 2:26 AM
To: Berrocal, Eduardo <eduardo.berrocal@xxxxxxxxx>
Cc: Sergei Golubchik <serg@xxxxxxxxxxx>; maria-developers@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Maria-developers] Accessing read records during join for condition pushdown
Hi Eduardo,
On Mon, Jul 12, 2021 at 11:30:35PM +0000, Berrocal, Eduardo wrote:
> "A question: do you intend to push down and check conditions in arbitrary form?
> In case they are just equalities, it might make sense for storage engine to pretend having an index and then Batched Key
> Access optimization will be used, and the SQL layer will provide batches of keys to lookup."
>
>
> The answer is yes. I would like to check more than just equalities, if possible. The Batched key access is something worth looking into, in case that that can be used as a way to speed up at least conditions with equalities.
>
> A follow up question for you: When you say that traversing the linked
> join buffers is difficult... how difficult exactly? Like a linked list, a tree...
> ? I still think it is worth exploring in our case given that the
> speedup can be huge if we can avoid a full scan of a table with TBs of data.
>
Iterating the join buffer shouldn't be hard. It is just a memory area which is filled with variable-size records which have certain fields.
Debugging a query that uses join buffers, one can see that the iteration over contents of the join buffer is done in JOIN_CACHE::join_matching_records()
using these calls:
prepare_look_for_matches();
while (get_next_candidate_for_match()) {
read_next_candidate_for_match();
}
The code in JOIN_CACHE::join_matching_records() is complex, because it does other things besides just iterating the cache:
* Checking of "first match" flag for queries which only need one matching row from the second table.
* Handling outer JOINs, where we apply the restriction from ON expression always, and apply the restriction from WHERE expression only after we've figured out if LEFT JOIN has a match.
AFAIU, you should ignore these inside the storage engine (produce all matches, don't check the Item_func_trig_cond items). This will make your copy of join_matching_records simpler.
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
Follow ups
References