← Back to team overview

maria-developers team mailing list archive

Re: 8cd65cd: MDEV-9143 JSON_xxx functions.

 

Hi, Alexey!

On Sep 13, Alexey Botchkov wrote:
> revision-id: 8cd65cd3e31aa6129573c6fbba9feb06c714e00c (mariadb-10.1.8-247-g8cd65cd)
> parent(s): 76a0ed2e03c6ae1ff791534e742c812d5e83ba63
> committer: Alexey Botchkov
> timestamp: 2016-09-13 15:48:03 +0400
> message:
> 
> MDEV-9143 JSON_xxx functions.
> 
>         Library added to operate JSON format.
>         SQL functions to handle JSON data added:
>            required by SQL standard:
>                 JSON_VALUE
>                 JSON_QUERY
>                 JSON_EXISTS
>            MySQL functions:
>                 JSON_VALID
>                 JSON_ARRAY
>                 JSON_ARRAY_APPEND
>                 JSON_CONTAINS_PATH
>                 JSON_EXTRACT
>                 JSON_OBJECT
>                 JSON_QUOTE
>                 JSON_MERGE
>          Some MySQL functions still missing, but no specific
>          difficulties to implement them too in same manner.
>          JSON_TABLE of the SQL standard is missing as it's not
>          clean how and whether we'd like to implement it.

Good comment, thanks.

>  include/CMakeLists.txt        |    1 +
>  include/json_lib.h            |  338 ++++++++++
>  mysql-test/r/func_json.result |   99 +++
>  mysql-test/t/func_json.test   |   44 ++
>  sql/CMakeLists.txt            |    2 +-
>  sql/item.h                    |   15 +
>  sql/item_create.cc            |  351 ++++++++++
>  sql/item_jsonfunc.cc          |  898 ++++++++++++++++++++++++++
>  sql/item_jsonfunc.h           |  240 +++++++
>  sql/item_xmlfunc.cc           |   21 +-
>  sql/item_xmlfunc.h            |    5 -
>  sql/sql_yacc.yy               |    4 +-
>  strings/CMakeLists.txt        |    2 +-
>  strings/ctype-ucs2.c          |   19 +-
>  strings/json_lib.c            | 1426 +++++++++++++++++++++++++++++++++++++++++
>  15 files changed, 3433 insertions(+), 32 deletions(-)

Is the json library code the same as in the previous commit?
Or should I look at it again?

And *please* add unit tests for the json library.

> diff --git a/include/json_lib.h b/include/json_lib.h
> new file mode 100644
> index 0000000..592096f
> --- /dev/null
> +++ b/include/json_lib.h
...
> +    do
> +    {
> +      // The parser has read next piece of JSON
> +      // and set fields of j_eng structure accordingly.
> +      // So let's see what we have:

Oops. Didn't notice that earlier, sorry.
No C++ comments in C files, please, some older compilers don't like that.

> diff --git a/mysql-test/r/func_json.result b/mysql-test/r/func_json.result
> new file mode 100644
> index 0000000..1ebe0b2
> --- /dev/null
> +++ b/mysql-test/r/func_json.result
> @@ -0,0 +1,99 @@
> +select json_value('{"key1":[1,2,3]}', '$.key1');
> +json_value('{"key1":[1,2,3]}', '$.key1')
> +NULL

I'd expect an error here, but the standard explictly says:

  5) If <JSON value error behavior> is not specified, then NULL ON ERROR
  is implicit.

that is, returning NULL on errors is correct.
Would be nice to have a comment about it in the code (around
Item_func_json_value::val_str), otherwise someone might change this behavior to
return an error.

> +select json_value('{"key1": [1,2,3], "key1":123}', '$.key1');
> +json_value('{"key1": [1,2,3], "key1":123}', '$.key1')
> +123

Eh... Okay, so you show the *last* key, right?
SQL standard says it's implementation defined, so I suppose that's fine.

> +select json_object("ki", 1, "mi", "ya");
> +json_object("ki", 1, "mi", "ya")
> +{"ki": 1, "mi": "ya"}

add tests for json constructing functions (json_object,
json_array_append, etc) with key/values that need quoting or escaping.

I mean, tests to show that json_object (for example) it more complex
than concat().

> diff --git a/mysql-test/t/func_json.test b/mysql-test/t/func_json.test
> new file mode 100644
> index 0000000..a5b84e0c
> --- /dev/null
> +++ b/mysql-test/t/func_json.test
...
> +
> +select json_merge('string', 123);

how comes this test is not in the result file?

> diff --git a/sql/item_jsonfunc.h b/sql/item_jsonfunc.h
> new file mode 100644
> index 0000000..aff19ee
> --- /dev/null
> +++ b/sql/item_jsonfunc.h
> +class json_path_with_flags
> +{
> +public:
> +  json_path_t p;
> +  bool constant;
> +  bool parsed;
> +  json_path_step_t *cur_step;
> +  void set_constant_flag(bool s_constant)
> +  {
> +    constant= s_constant;
> +    parsed= FALSE;

why set_constant_flag() resets 'parsed'?
Better to set parsed=false in the constructor.

> +  }
> +};
> +
> diff --git a/sql/item_jsonfunc.cc b/sql/item_jsonfunc.cc
> new file mode 100644
> index 0000000..5649989
> --- /dev/null
> +++ b/sql/item_jsonfunc.cc
...
> +/*
> +  Appends ASCII string to the String object taking it's charset in
> +  consideration.
> +*/
> +static int st_append_ascii(String *s, const char *ascii, uint ascii_len)

Why did you do that, instead of using String::append() ?
I'd use String::append(), and if that would happen to be too slow, I'd
implement String::append_ascii(), not something json-specific.

...
> +/*
> +  Appends arbitrary String to the JSON string taking charsets in
> +  consideration.
> +*/
> +static int st_append_escaped(String *s, const String *a)
> +{
> +  /*
> +    In the worst case one character from the 'a' string
> +    turns into '\uXXXX\uXXXX' which is 12.

how comes? add an example, please. Like "for example, character x'1234'
in the charset A becomes '\u1234\u5678' if JSON string is in the charset B"

> +  */
> +  int str_len= a->length() * 12 * s->charset()->mbmaxlen /
> +               a->charset()->mbminlen;
> +  if (!s->reserve(str_len, 1024) &&
> +      (str_len=
> +         json_escape(a->charset(), (uchar *) a->ptr(), (uchar *)a->end(),
> +                     s->charset(),
> +                     (uchar *) s->end(), (uchar *)s->end() + str_len)) > 0)
> +  {
> +    s->length(s->length() + str_len);
> +    return 0;
> +  }
> +
> +  return a->length();
> +}
...
> +longlong Item_func_json_exists::val_int()
> +{
> +  json_engine_t je;
> +  String *js= args[0]->val_str(&tmp_js);
> +
> +  if (!path.parsed)
> +  {
> +    String *s_p= args[1]->val_str(&tmp_path);
> +    if (s_p &&
> +        json_path_setup(&path.p, s_p->charset(), (const uchar *) s_p->ptr(),
> +                        (const uchar *) s_p->ptr() + s_p->length()))

1. why wouldn't you do it in fix_length_and_dec for constant paths?
2. please add tests when path is not constant

> +      goto err_return;
> +    path.parsed= path.constant;
> +  }
> +
> +  if ((null_value= args[0]->null_value || args[1]->null_value))
> +  {
> +    null_value= 1;
> +    return 0;
> +  }
> +
> +  null_value= 0;
> +  json_scan_start(&je, js->charset(),(const uchar *) js->ptr(),
> +                  (const uchar *) js->ptr() + js->length());
> +
> +  path.cur_step= path.p.steps;
> +  if (json_find_value(&je, &path.p, &path.cur_step))
> +  {
> +    if (je.s.error)
> +      goto err_return;
> +    return 0;
> +  }
> +
> +  return 1;
> +
> +err_return:
> +  null_value= 1;
> +  return 0;
> +}
...
> +longlong Item_func_json_contains_path::val_int()
> +{
> +  String *js= args[0]->val_str(&tmp_js);
> +  json_engine_t je;
> +  uint n_arg;
> +  longlong result;
> +
> +  if ((null_value= args[0]->null_value))
> +    return 0;
...
> +  result= !mode_one;
> +  for (n_arg=2; n_arg < arg_count; n_arg++)
> +  {
> +    json_path_with_flags *c_path= paths + n_arg - 2;
> +    if (!c_path->parsed)
> +    {
> +      String *s_p= args[n_arg]->val_str(tmp_paths+(n_arg-2));
> +      if (s_p &&
> +          json_path_setup(&c_path->p,s_p->charset(),(const uchar *) s_p->ptr(),
> +                          (const uchar *) s_p->ptr() + s_p->length()))
> +        goto error;
> +      c_path->parsed= TRUE;


eh? not c_path->parsed= c_path->constant ?

> +    }
> +
> +    json_scan_start(&je, js->charset(),(const uchar *) js->ptr(),
> +                    (const uchar *) js->ptr() + js->length());
> +
> +    c_path->cur_step= c_path->p.steps;
> +    if (json_find_value(&je, &c_path->p, &c_path->cur_step))
> +    {
> +      /* Path wasn't found. */
> +      if (je.s.error)
> +        goto error;
> +
> +      if (!mode_one)
> +      {
> +        result= 0;
> +        break;
> +      }
> +    }
> +    else if (mode_one)
> +    {
> +      result= 1;
> +      break;
> +    }
> +  }
> +
> +
> +  return result;
> +
> +error:
> +  null_value= 1;
> +  return 0;
> +}

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx