← Back to team overview

maria-developers team mailing list archive

Re: 734aee3cd3c: MDEV-27018 IF and COALESCE lose "json" property

 

Hi, Alexander!

On Dec 28, Alexander Barkov wrote:
> revision-id: 734aee3cd3c (mariadb-10.5.13-33-g734aee3cd3c)
> parent(s): 2776635cb98
> author: Alexander Barkov
> committer: Alexander Barkov
> timestamp: 2021-12-27 17:54:31 +0400
> message:
> 
> MDEV-27018 IF and COALESCE lose "json" property
> 
> Hybrid functions (IF, COALESCE, etc) did not preserve the JSON property
> from their arguments. The same problem was repeatable for single row subselects.
> 
> The problem happened because the method Item::is_json_type() was inconsistently
> implemented across the Item hierarchy. For example, Item_hybrid_func
> and Item_singlerow_subselect did not override is_json_type().

Okay. The approach is fine, but see some questions below:

> diff --git a/sql/field.cc b/sql/field.cc
> index 2c768527ced..8fa3bbd538c 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -7277,6 +7277,19 @@ bool Field_longstr::send(Protocol *protocol)
>  }
>  
>  
> +const Type_handler *Field_string::type_handler() const
> +{
> +  if (is_var_string())
> +    return &type_handler_var_string;
> +  /*
> +    This is a temporary solution and will be fixed soon (in 10.9?).
> +    Type_handler_string_json will provide its own Field_string_json.

why?

> +  */
> +  if (Type_handler_json_common::has_json_valid_constraint(this))

could you please use names that don't depend on how exactly we detect
json? especially if the goal is to move to FORMAT JSON syntax

> +   return &type_handler_string_json;
> +  return &type_handler_string;
> +}
> +
>  	/* Copy a string and fill with space */
>  
>  int Field_string::store(const char *from, size_t length,CHARSET_INFO *cs)
> diff --git a/sql/sql_type.cc b/sql/sql_type.cc
> index c1801c1ae3e..3b2753d80a6 100644
> --- a/sql/sql_type.cc
> +++ b/sql/sql_type.cc
> @@ -1755,19 +1755,110 @@ const Type_handler *Type_handler_typelib::cast_to_int_type_handler() const
>  
>  /***************************************************************************/
>  
> +class Recursive_type_pair_iterator
> +{
> +  const Type_handler *m_a;
> +  const Type_handler *m_b;
> +  uint m_switched_to_base_count;
> +public:
> +  Recursive_type_pair_iterator(const Type_handler *a,
> +                               const Type_handler *b,
> +                               uint switched_to_base_count= 0)
> +   :m_a(a), m_b(b), m_switched_to_base_count(switched_to_base_count)
> +  { }
> +  const Type_handler *a() const { return m_a; }
> +  const Type_handler *b() const { return m_b; }
> +  Recursive_type_pair_iterator base() const
> +  {
> +    Recursive_type_pair_iterator res(m_a->type_handler_base(),
> +                                     m_b->type_handler_base());
> +    res.m_switched_to_base_count= (res.m_a != NULL) + (res.m_b != NULL);
> +    if (res.m_a == NULL)
> +      res.m_a= m_a;
> +    if (res.m_b == NULL)
> +      res.m_b= m_b;
> +    return res;
> +  }
> +  bool done() const
> +  {
> +    switch (m_switched_to_base_count)
> +    {
> +    case 2:
> +      /*
> +        Expression:
> +          COALESCE(TYPE1, TYPE2)
> +
> +        where both type handlers derive from some other handlers, e.g.
> +          VARCHAR -> TYPE1
> +          TEXT    -> TYPE2
> +
> +        Continue aggregation as:
> +          SELECT COALESCE(VARCHAR, TEXT)
> +
> +        Note, we now have only one level of data type inheritance,
> +          e.g. VARCHAR -> VARCHAR/JSON
> +
> +        This code branch will change when we have longer inheritance chains:
> +          A0 -> A1 -> A2 -> A3
> +          B0 -> B1 -> B2
> +
> +        Functions like COALESE(A2, B2) will need to do full overload resolution:
> +        - iterate through all pairs from the below matrix
> +        - find all existing candidates
> +        - resolve ambiguities in case multiple candidates, etc.

and I don't see how you're going to do that. you need to try
m_a+m_b.base(), m_a.base()+m_b, and m_a.base()+m_b.base().
And that's not what you're doing.

I suggest to add an assert that m_switched_to_base_count < 2
and stop trying to solve complex cases until we have at least one.

> +
> +            A0 A1 A2 A3
> +         B0  .  .  .  .
> +         B1  .  .  .  .
> +         B2  .  .  .  .
> +      */
> +      return false;
> +
> +    case 1:
> +      /*
> +         Expression:
> +           COALESCE(TYPE1, TEXT)
> +
> +         If only one handler derives from some other handler:
> +           VARCHAR-> TYPE1
> +
> +         Continue aggregation as:
> +           SELECT COALESCE(VARCHAR, TEXT)
> +      */
> +      return false;
> +
> +    case 0:
> +    default:
> +      /*
> +        Non of the two handlers are derived from other handlers.
> +        Nothing left to try.
> +      */
> +      return true;
> +    }
> +  }
> +};

> diff --git a/sql/sql_type_json.h b/sql/sql_type_json.h
> index 6c4ee8cb2eb..4a394809a06 100644
> --- a/sql/sql_type_json.h
> +++ b/sql/sql_type_json.h
> @@ -21,18 +21,145 @@
...
> +  bool Item_append_extended_type_info(Send_field_extended_metadata *to,
> +                                      const Item *item) const
> +  {
> +    return set_format_name(to); // Send "format=json" in the protocol
> +  }
> +
> +  bool Item_hybrid_func_fix_attributes(THD *thd,
> +                                       const char *name,
> +                                       Type_handler_hybrid_field_type *hybrid,
> +                                       Type_all_attributes *attr,
> +                                       Item **items, uint nitems)
> +                                       const override
> +  {
> +    if (BASE::Item_hybrid_func_fix_attributes(thd, name, hybrid, attr,
> +                                              items, nitems))
> +      return true;
> +    /*
> +      The above call can change the type handler on "hybrid", e.g.
> +      choose a proper BLOB type handler according to the calculated max_length.
> +      Convert general purpose string type handler to its JSON counterpart.
> +      This makes hybrid functions preserve JSON data types, e.g.:
> +        COALESCE(json_expr1, json_expr2) -> JSON
> +    */
> +    hybrid->set_handler(json_type_handler_from_generic(hybrid->type_handler()));

I don't get it. It's supposed to "fix attributes", like signed/unsigned,
max length, etc. The correct type handler should've been set already,
otherwise this type handler method wouldn't even be called.

> +    return false;
> +  }
>  };
>  
> +
> +class Type_handler_string_json:
> +  public Type_handler_general_purpose_string_to_json<Type_handler_string,
> +                                                     type_handler_string>
> +{ };
> diff --git a/sql/sql_type_json.cc b/sql/sql_type_json.cc
> index a804366ec03..a84f720bcc9 100644
> --- a/sql/sql_type_json.cc
> +++ b/sql/sql_type_json.cc
> @@ -20,17 +20,111 @@
>  #include "sql_class.h"
>  
>  
> -Type_handler_json_longtext  type_handler_json_longtext;
> +Named_type_handler<Type_handler_string_json>
> +  type_handler_string_json("char/json");

do you mean these names are not shown anywhere?

> +Named_type_handler<Type_handler_varchar_json>
> +  type_handler_varchar_json("varchar/json");
> +
> +Named_type_handler<Type_handler_tiny_blob_json>
> +  type_handler_tiny_blob_json("tinyblob/json");
> +
> +Named_type_handler<Type_handler_blob_json>
> +  type_handler_blob_json("blob/json");
> +
> +Named_type_handler<Type_handler_medium_blob_json>
> +  type_handler_medium_blob_json("mediumblob/json");
> +
> +Named_type_handler<Type_handler_long_blob_json>
> +  type_handler_long_blob_json("longblob/json");
...
> +/*
> +  This method ressembles what Field_blob::type_handler()

resembles
(also below)

> +  does for general purpose BLOB type handlers.
> +*/
> +const Type_handler *
> +Type_handler_json_common::json_blob_type_handler_by_length_bytes(uint len)
> +{
> +  switch (len) {
> +  case 1: return &type_handler_tiny_blob_json;
> +  case 2: return &type_handler_blob_json;
> +  case 3: return &type_handler_medium_blob_json;
> +  }
> +  return &type_handler_long_blob_json;
> +}

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx


Follow ups