← Back to team overview

maria-developers team mailing list archive

Semi-join processing with Duplicate Elimination

 

Hello,

The below really should be in documentation, but for now, I'm posting it here:

Semi-join processing with Duplicate Elimination
===============================================
There are three cases:

Semi-join next to outer join 
----------------------------

ot1 left join (ot10, ot11) semi-join (it1, it2)

interesting possible join orders:
 
 ot1 {ot10 ot11} it1 it2  <-- here we have just a "tail" so no problem.

 it1 ot1 {ot10, ot11} it2
  
  Here the range fully covers the inner join with all its 
  NULL-complemented record generation, etc. We need to take 
  care about
  - NULL rowids are provided for NULL-complemented rows
  - being able to jump over all this when doing weedout (can we 
    have a problem with some OJ's match flag not being reset?)

 it1 ot1 it2 {ot10 ot11}
  |                  |
  +------------------+
    dups-w range
  
  This is a special case of the previous case. Need to 
   - check dups-w for NULL-complemented record. 
      (in case of nested outer joins: check for all NULL-complemented records
      generated at this point? well they all call
      evaluate_null_complemented_record, do they?)
   - check if this works with Not-exists optimization.

 it1 ot1 it2 {ot10 ot11}
  |            |
  +------------+
   dups-w range
  
  This case is the most interesting. The question is: should we first 
  check dups-w, or care about outer joins?
  
  - dups-w check can be moved to the right.

   if we take a record combination of 

       {ot1.row it2.row ot10.row}

   we must not discard it, because we'll need to know if this record
   combination extends to
     
     {ot1.row, it2.row ot10.row ot11.row } 
   
   If it does, outer join will not have NULL-complemented record, otherwise it
   will.
   this means: no dups-w checks inside outer joins. Move them to the right
   until we've got the table that's last for all inner joins we're in.

Outer join inside semi-join
---------------------------

 ot1 semi join (it1 left join (iit2, iit3))
   
Possible join orders:

1)  ot1  it1 {iit2, iit3} 
2)  it1 ot1 {iit2, iit3} 
3)  it1 {iit2, iit3}  ot1 
4)  ot1 {iit2, iit3}  it1 

#1 and #2 have an "outside" prefix followed by "tail" of SJ-inner OJ-inner
tables. For iit3, we must first form a record for which both parts of 
WHERE and ON have been checked (or the record could be NULL-complemented)

for #3 and #3, dups-w check is done after OJ processing.
 
Semi-join inside outer join
---------------------------
- Not possible in current scheme.

BR
 Sergey
-- 
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog