maria-developers team mailing list archive

Thread
Date

Re: MDEV-9369 IN operator with ( num, NULL ) gives inconsistent result

To: Sergei Golubchik <serg@xxxxxxxxxxx>
From: Alexander Barkov <bar@xxxxxxxxxxx>
Date: Mon, 21 Mar 2016 11:26:05 +0400
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20160316175128.GA23241@meddwl.fritz.box>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

Hi Sergei,


On 03/16/2016 09:51 PM, Sergei Golubchik wrote:
> Hi, Alexander!
> 
> On Feb 08, Alexander Barkov wrote:
>> Hello Sergei,
>>
>> Please review a patch for MDEV-9369.
>>
>> Briefly, it fixes cmp_item::cmp() to return three possible values: 
>> FALSE, TRUE, UNKNOWN. These values are then recursively pulled to
>> the very top level, to the owner Item_xxx.
>>
>> The owner Item_func_in (as well as Item_func_case and Item_func_equal)
>> then further "translate" these tree-values logic into the traditional
>> val_int()+null_value combination using some additional information.
>>
>> For example, Item_func_in::val_int() uses "negated" as an additional
>> information.
>>
>> The patch looks Ok for me.
> 
> Looks ok to me too. A couple of stylistic comments, see below.
> ok to push afterwards.
> 
>> diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
>> index d5f5087..39003de 100644
>> --- a/sql/item_cmpfunc.cc
>> +++ b/sql/item_cmpfunc.cc
>> @@ -3837,15 +3843,15 @@ void cmp_item_decimal::store_value(Item *item)
>>    /* val may be zero if item is nnull */
>>    if (val && val != &value)
>>      my_decimal2decimal(val, &value);
>> +  set_null_value(item->null_value);
>>  }
>>  
>>  
>>  int cmp_item_decimal::cmp(Item *arg)
>>  {
>>    my_decimal tmp_buf, *tmp= arg->val_decimal(&tmp_buf);
>> -  if (arg->null_value)
>> -    return 1;
>> -  return my_decimal_cmp(&value, tmp);
>> +  return (m_null_value || arg->null_value) ?
> 
> if you access m_null_value directly, you don't need a setter for it.
> it'll be less confusing if you replace set_null_value(X) with m_null_value=X.
> 
> alternatively, you can add a getter is_null_value(). But I personally
> wouldn't do it, cmp_xxx is a small set of simple classes, no need to get
> too enterprisey there.


Removed set_null_value(), changed to use m_null_value directly.


> 
>> +    UNKNOWN : (my_decimal_cmp(&value, tmp) != 0);
>>  }
>>  
>>  
>> @@ -3892,10 +3900,10 @@ cmp_item *cmp_item_datetime::make_same()
>>  }
>>  
>>  
>> -bool Item_func_in::nulls_in_row()
>> +bool Item_func_in::list_contains_null()
> 
> I found the old name clear enough. But ok, whatever...
> 
>>  {
>>    Item **arg,**arg_end;
>> -  for (arg= args+1, arg_end= args+arg_count; arg != arg_end ; arg++)
>> +  for (arg= args + 1, arg_end= args+arg_count; arg != arg_end ; arg++)
>>    {
>>      if ((*arg)->null_inside())
>>        return 1;
>> @@ -4010,6 +4018,32 @@ void Item_func_in::fix_length_and_dec()
>>      }
>>    }
>>  
>> +  /*
>> +    First conditions for bisection to be possible:
>> +     1. All types are similar, and
>> +     2. All expressions in <in value list> are const
>> +  */
>> +  bool bisection_possible=
>> +    type_cnt == 1 &&                                   // 1
>> +    const_itm;                                         // 2
>> +  if (bisection_possible)
>> +  {
>> +    /*
>> +      In the presence of NULLs, the correct result of evaluating this item
>> +      must be UNKNOWN or FALSE. To achieve that:
>> +      - If type is scalar, we can use bisection and the "have_null" boolean.
>> +      - If type is ROW, we will need to scan all of <in value list> when
>> +        searching, so bisection is impossible. Unless:
>> +        3. UNKNOWN and FALSE are equivalent results
>> +        4. Neither left expression nor <in value list> contain any NULL value
>> +      */
>> +
>> +    if (cmp_type == ROW_RESULT &&
>> +        !((is_top_level_item() && !negated) ||              // 3
>> +          (!list_contains_null() && !args[0]->maybe_null))) // 4
>> +      bisection_possible= false;
> 
> I think the condition will be easier to understand if you remove the
> negation of that complex condition in parentheses:
> 
>     if (cmp_type == ROW_RESULT &&
>         ((!is_top_level_item() || negated) &&             // 3
>           (list_contains_null() || args[0]->maybe_null))) // 4
> 
> I find this ^^^ much easier to understand.

Agreed. Fixed to use your version.

Thanks for review! I pushed the patch.

> 
>> +  }
>> +
>>    if (type_cnt == 1)
>>    {
>>      if (cmp_type == STRING_RESULT && 
> 
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
>

References

Re: MDEV-9369 IN operator with ( num, NULL ) gives inconsistent result
From: Sergei Golubchik, 2016-03-16