maria-developers team mailing list archive

Thread
Date
Input on Optimizer trace code, part #2

To: Varun Gupta <varun.gupta@xxxxxxxxxxx>
From: Sergey Petrunia <sergey@xxxxxxxxxxx>
Date: Thu, 14 Feb 2019 14:30:58 +0300
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.9.4 (2018-02-28)
Hi Varun,

Please find more input below.

> --- 10.4-optimizer-trace-orig/sql/sql_derived.cc	2019-02-10 14:25:59.217015365 +0300
> +++ 10.4-optimizer-trace-cl/sql/sql_derived.cc	2019-02-12 14:04:05.446571305 +0300
> @@ -486,6 +486,24 @@ exit_merge:
>    DBUG_RETURN(res);
>  
>  unconditional_materialization:
> +
> +  if (unlikely(thd->trace_started()))
> +  {
> +    /*
> +     Add to the optimizer trace the change in choice for merged
> +     derived tables/views to materialised ones.
> +    */
> +    Json_writer_object trace_wrapper(thd);
> +    Json_writer_object trace_derived(thd, derived->is_derived() ?
> +                                       "derived" : "view");
> +    trace_derived.add("table", derived->alias.str ? derived->alias.str : "<NULL>")
> +                 .add_select_number(derived->get_unit()->
> +                                    first_select()->select_number)
> +                 .add("initial_choice", "merged")
> +                 .add("final_choice", "materialized")
> +                 .add("cause", cause);

Please add test coverage for this (I cannot find it in the testsuite).


> +select * from v1	{
> +  "steps": [
> +    {
> +      "join_preparation": {
> +        "select_id": 1,
> +        "steps": [
> +          {
> +            "view": {
> +              "table": "v1",
> +              "select_id": 2,
> +              "merged": true

The above looks a bit odd, "view", inside it "table", and then either
"materilized" or "merged". I would prefer 
  "algorithm": "materialize"
or
  "algorithm": "merge"

> diff -urpN '--exclude=.*' 10.4-optimizer-trace-orig/sql/sql_select.cc 10.4-optimizer-trace-cl/sql/sql_select.cc
> --- 10.4-optimizer-trace-orig/sql/sql_select.cc	2019-02-10 14:25:59.225015425 +0300
> +++ 10.4-optimizer-trace-cl/sql/sql_select.cc	2019-02-12 14:04:05.450571322 +0300
> @@ -345,6 +349,40 @@ bool dbug_user_var_equals_int(THD *thd,
>  }
>  #endif
>  
> +static void trace_table_dependencies(THD *thd,
> +                                     JOIN_TAB *join_tabs, uint table_count)
> +{
> +  Json_writer_object trace_wrapper(thd);
> +  Json_writer_array trace_dep(thd, "table_dependencies");
> +  for (uint i = 0; i < table_count; i++)
> +  {
> +    TABLE_LIST *table_ref = join_tabs[i].tab_list;
> +    Json_writer_object trace_one_table(thd);
> +    trace_one_table.add_table_name(&join_tabs[i]);
> +    trace_one_table.add("row_may_be_null",
> +                       (bool)table_ref->table->maybe_null);
> +    const table_map map = table_ref->get_map();
> +    DBUG_ASSERT(map < (1ULL << table_count));
> +    for (uint j = 0; j < table_count; j++)
> +    {
> +      if (map & (1ULL << j))
> +      {
> +        trace_one_table.add("map_bit", static_cast<longlong>(j));
> +        break;
> +      }
> +    }
> +    Json_writer_array depends_on(thd, "depends_on_map_bits");
> +    static_assert(sizeof(table_ref->get_map()) <= 64,
> +                  "RAND_TABLE_BIT may be in join_tabs[i].dependent, so we test "
> +                  "all 64 bits.");
sizeof() is in bytes, not bits, so the above assert actually checks for
whether table_map is 64 bytes large (which is incorrect).

I would suggest to switch to using Table_map_iterator and forget about the
64-byte limitation (and the value of RAND_TABLE_BIT, too)

> +    for (uint j = 0; j < 64; j++)
> +    {
> +      if (join_tabs[i].dependent & (1ULL << j))
> +        depends_on.add(static_cast<longlong>(j));
> +    }

> --- 10.4-optimizer-trace-orig/sql/opt_table_elimination.cc	2019-02-10 14:25:59.209015304 +0300
> +++ 10.4-optimizer-trace-cl/sql/opt_table_elimination.cc	2019-02-12 14:04:05.442571289 +0300
> @@ -522,7 +524,8 @@ eliminate_tables_for_list(JOIN *join,
>                            List<TABLE_LIST> *join_list,
>                            table_map tables_in_list,
>                            Item *on_expr,
> -                          table_map tables_used_elsewhere);
> +                          table_map tables_used_elsewhere,
> +                          Json_writer_array* eliminate_tables);

I think I wrote this already - I think 'eliminate_tables' is a very poor name
for the argument-  it doesn't show that this is just a trace output object. If
something is named 'eliminate_tables' I would expect it to play some active
role (e.g. being a list of tables we are trying to eliminate or something like
that).
Please change to something like trace, trace_eliminated, trace_for_eliminated,
...

> @@ -2340,8 +2377,13 @@ int pull_out_semijoin_tables(JOIN *join)
>  bool optimize_semijoin_nests(JOIN *join, table_map all_table_map)
>  {
>    DBUG_ENTER("optimize_semijoin_nests");
> +  THD *thd= join->thd;
>    List_iterator<TABLE_LIST> sj_list_it(join->select_lex->sj_nests);
>    TABLE_LIST *sj_nest;
> +  Json_writer_object wrapper(thd);
> +  Json_writer_object trace_semijoin_nest(thd,
> +                              "execution_plan_for_potential_materialization");
> +  Json_writer_array trace_steps_array(thd, "steps");
>    while ((sj_nest= sj_list_it++))
>    {
>      /* semi-join nests with only constant tables are not valid */

The most common output this produces is a useless

            "execution_plan_for_potential_materialization": {
              "steps": []
            }

Please add "if (!join->select_lex->sj_nests.elements) \n DBUG_RETURN(FALSE);"
to avoid it.

Please also add a testcase where a non-empty output is produced (it's nice to
have a semi-representative set of outputs in main/opt_trace*result).


> --- 10.4-optimizer-trace-orig/mysql-test/main/opt_trace.result	1970-01-01 03:00:00.000000000 +0300
> +++ 10.4-optimizer-trace-cl/mysql-test/main/opt_trace.result	2019-02-12 14:04:05.438571272 +0300
> +QUERY	TRACE	MISSING_BYTES_BEYOND_MAX_MEM_SIZE	INSUFFICIENT_PRIVILEGES
> +EXPLAIN SELECT DISTINCT a FROM t1	{
...
> +                  "best_group_range_summary": {
> +                    "type": "index_group",
> +                    "index": "a",
> +                    "group_attribute": null,
> +                    "min_aggregate": false,
> +                    "max_aggregate": false,
> +                    "distinct_aggregate": false,
> +                    "rows": 5,
> +                    "cost": 7.5,
> +                    "key_parts_used_for_access": ["a"],
> +                    "ranges": [],
> +                    "chosen": true
> +                  },
How come group_attribute is NULL? Loose Index Scan requires groping on a
column. Are you sure you are printing the right value?

--> - 10.4-optimizer-trace-orig/sql/opt_trace.cc	1970-01-01 03:00:00.000000000 +0300
> +++ 10.4-optimizer-trace-cl/sql/opt_trace.cc	2019-02-12 14:04:05.442571289 +0300
> ...
> +void opt_trace_print_expanded_query(THD *thd, SELECT_LEX *select_lex,
> +                                    Json_writer_object *writer)
> +
> +{
> +  if (!thd->trace_started())
> +    return;
> +  char buff[1024];
> +  String str(buff, sizeof(buff), system_charset_info);
> +  str.length(0);
Please use StringBuffer<1024> instead.

> +  select_lex->print(thd, &str,
> +                    enum_query_type(QT_TO_SYSTEM_CHARSET |
> +                                    QT_SHOW_SELECT_NUMBER |
> +                                    QT_ITEM_IDENT_SKIP_DB_NAMES |
> +                                    QT_VIEW_INTERNAL
> +                                    ));
> +  /*
> +    The output is not very pretty lots of back-ticks, the output
> +    is as the one in explain extended , lets try to improved it here.
> +  */
> +  writer->add("expanded_query", str.c_ptr_safe(), str.length());
> +}
Did you try using QT_EXPLAIN? (if it doesn't work, I'd like to know why)

...

> +    trace->missing_privilege();
> +  return;
> +}
A 'return;' at the end of void function looks very confusing. Please remove it

> +void opt_trace_disable_if_no_stored_proc_func_access(THD *thd, sp_head *sp)
> +{
> +  if (likely(!(thd->variables.optimizer_trace &
> +               Opt_trace_context::FLAG_ENABLED)) ||
> +      thd->system_thread)
> +    return;
> +
> +  Opt_trace_context *const trace = &thd->opt_trace;
> +  if (!thd->trace_started())
> +    return;
> +  bool full_access;
> +  Security_context *const backup_thd_sctx = thd->security_context();
> +  thd->set_security_context(&thd->main_security_ctx);
> +  const bool rc = check_show_routine_access(thd, sp, &full_access) || !full_access;
> +  thd->set_security_context(backup_thd_sctx);
> +  if (rc)
> +    trace->missing_privilege();
> +  return;
> +}
A 'return;' at the end of void function looks very confusing. Please remove
it.
Also, check the assignments in the function: they should follow coding style
and be "x= y" and not "x = y".


> +int fill_optimizer_trace_info(THD *thd, TABLE_LIST *tables, Item *)
> +{
> +  TABLE *table = tables->table;
> +  Opt_trace_info info;
> +
> +  /*  get_values of trace, query , missing bytes and missing_priv
> +
> +      @todo: Need an iterator here to walk over all the traces
> +  */
> +  Opt_trace_context* ctx= &thd->opt_trace;
> +
> +  if (thd->opt_trace.empty())
> +  {

So thd->opt_trace.empty() returns FALSE when there are NO traces.
And it returns TRUE when there are traces.
THIS IS EXTREMELY CONFUSING! Please fix it.

> +    Opt_trace_stmt *stmt= ctx->get_top_trace();
> +    stmt->fill_info(&info);
> +
> +    table->field[0]->store(info.query_ptr, static_cast<uint>(info.query_length),
> +                           info.query_charset);

...

> +void Json_writer::add_str(Item *item)
> +{
> +  if (item)
> +  {
> +    THD *thd= current_thd;
> +    char buff[256];
> +    String str(buff, sizeof(buff), system_charset_info);
> +    str.length(0);

Please use StringBuffer.

> +  ~Opt_trace_stmt()
> +  {
> +    delete current_json;
> +    missing_priv= false;
> +    ctx= NULL;
> +    I_S_disabled= 0;
> +  }
What is this for? Why set member variables in destructor? This either does
nothing, or is a unsuccessful attempt to prevent "reading freed memory"-type
crashes. Please remove this.

 
> +private:
> +  Opt_trace_context *ctx;
> +  String query;  // store the query sent by the user
> +  Json_writer *current_json; // stores the trace
> +  bool missing_priv;  ///< whether user lacks privilege to see this trace
> +  uint I_S_disabled;
> +};

What is I_S_disabled and why is it called like that? Please document as it is
totally confusing.


> +void opt_trace_disable_if_no_security_context_access(THD *thd)
> ...
> +  if (!thd->trace_started())
> +  {
> +    /*
> +      @@optimizer_trace has "enabled=on" but trace is not started.
> +      Either Opt_trace_start ctor was not called for our statement (3), or it
> +      was called but at that time, the variable had "enabled=off" (4).
> +
> +      There are no known cases of (3).
> +
> +      (4) suggests that the user managed to change the variable during
> +      execution of the statement, and this statement is using
> +      view/routine (note that we have not been able to provoke this, maybe
> +      this is impossible). If it happens it is suspicious.
> +
> +      We disable I_S output. And we cannot do otherwise: we have no place to
> +      store a possible "missing privilege" information (no Opt_trace_stmt, as
> +      is_started() is false), so cannot do security checks, so cannot safely
> +      do tracing, so have to disable I_S output. And even then, we don't know
> +      when to re-enable I_S output, as we have no place to store the
> +      information "re-enable tracing at the end of this statement", and we
> +      don't even have a notion of statement here (statements in the optimizer
> +      trace world mean an Opt_trace_stmt object, and there is none here). So
> +      we must disable for the session's life.
> +
> +      COM_FIELD_LIST opens views, thus used to be a case of (3). To avoid
> +      disabling I_S output for the session's life when this command is issued
> +      (like in: "SET OPTIMIZER_TRACE='ENABLED=ON';USE somedb;" in the 'mysql'
> +      command-line client), we have decided to create a Opt_trace_start for
> +      this command. The command itself is not traced though
> +      (SQLCOM_SHOW_FIELDS does not have CF_OPTIMIZER_TRACE).
> +    */
> +    return;
> +  }
As far as I understand, this was copied from MySQL? Is this still true in
MariaDB codebase? It sounds scary, let's discuss it.

> diff -urpN '--exclude=.*' 10.4-optimizer-trace-orig/sql/my_json_writer.cc 10.4-optimizer-trace-cl/sql/my_json_writer.cc
> --- 10.4-optimizer-trace-orig/sql/my_json_writer.cc	2019-02-10 14:25:59.201015244 +0300
> +++ 10.4-optimizer-trace-cl/sql/my_json_writer.cc	2019-02-12 14:04:05.438571272 +0300
> ...
> +Json_writer_array::Json_writer_array(THD *thd) :
> +  Json_writer_struct(thd)

Json_writer_object and Json_writer_array are widely used.  I guess
performance-wise, it is be beneficial if their calls are inlined.

A while ago, compiler would only inline a function if its body is in the .h
file (and so is visible in the compilation unit where inlining should occur).

I'm wondering if this is still true? (if yes, we better move these methods to
the .h file).

Can you look at release-build disassembly and check if there are calls to 
Json_writer_array member functions there or they were inlined?

(the same question for THD::trace_started)
> +{
> +  if (my_writer)
> +    my_writer->start_array();
> +}
> +
> +Json_writer_array::Json_writer_array(THD *thd, const char *str)
> +                                       :Json_writer_struct(thd)
Please change the indentation to be like in the ctor above.
> +{
> +  if (my_writer)

...

> @@ -123,9 +209,18 @@ public:
> +  /*
> +    One can set a limit of how large a JSON document should be.
> +    Writes beyond that size will be counted, but will not be collected.
> +  */
> +  void set_size_limit(size_t mem_size) { output.set_size_limit(mem_size); }
> +
> +  // psergey: return how many bytes would be required to store everything
Remove the comment.

> +  size_t get_truncated_bytes() { return output.get_truncated_bytes(); }
> +
>    Json_writer() : 
>      indent_level(0), document_start(true), element_started(false), 
> -    first_child(true)
> +    first_child(true), allowed_mem_size(0)
>    {
>      fmt_helper.init(this);
>    }
> @@ -140,15 +235,335 @@ private:
>    bool element_started;
>    bool first_child;
>  
> +  /*
> +    True when we are using the optimizer trace
> +    FALSE otherwise
> +  */
> +  size_t allowed_mem_size;
The comment above is wrong, remove it.
Also remove the 'allowed_mem_size' as it is not used anymore.


> +void Opt_trace_context::flush_optimizer_trace()
> +{
> +  inited= false;
> +  if (traces)
> +  {
> +    while (traces->elements())
> +    {
> +      Opt_trace_stmt *prev= traces->at(0);
> +      delete prev;
> +      traces->del(0);
> +    }
> +    delete traces;
> +    traces= NULL;
> +  }
> +}
The name of this function is confusing. It is not clear what is flushed and
where.  How about renaming it to "delete_traces()" ?


> +Opt_trace_context::~Opt_trace_context()
> +{
> +  inited= FALSE;
> +  /*
> +    would be nice to move this to a function
> +  */
And this function is already there - it's currently called flush_optimizer_trace(). 

I am also confused with setting member variables (inited, max_mem_size) in the destructor. 
This object is going away, why do it? Please remove that.

> +  if (traces)
> +  {
> +    while (traces->elements())
> +    {
> +      Opt_trace_stmt *prev= traces->at(0);
> +      delete prev;
> +      traces->del(0);
> +    }
> +    delete traces;
> +    traces= NULL;
> +  }
> +  max_mem_size= 0;
> +}
> +


> --- 10.4-optimizer-trace-orig/sql/opt_range.cc	2019-02-10 14:25:59.209015304 +0300
> +++ 10.4-optimizer-trace-cl/sql/opt_range.cc	2019-02-12 14:04:05.438571272 +0300
> @@ -3214,17 +3431,25 @@ bool calculate_cond_selectivity_for_tabl
>        SEL_ARG *key= tree->keys[idx];
>        if (key)
>        {
...
>            rows= records_in_column_ranges(&param, idx, key);
>            if (rows != DBL_MAX)
> +          {
>              key->field->cond_selectivity= rows/table_records;
> +            selectivity_for_column.add("selectivity_from_histograms",
> +                                       key->field->cond_selectivity);
> +          }
That's not from multiple histogramS, that's from one histogram. Please change
the name.

And I think I have requested this already: at the end of the function, please
print the table->cond_selectivity as the total selectivity of condition for
this table.

...

Coding style: can you search through your patch for instances of 

  if (x) { 
  }

and change them to
  if (x)
  {
  }
?

The same for "x = y;" -> "x= y;"

> --- 10.4-optimizer-trace-orig/sql/opt_trace_context.h	1970-01-01 03:00:00.000000000 +0300
> +++ 10.4-optimizer-trace-cl/sql/opt_trace_context.h	2019-02-10 14:25:38.576860203 +0300
> +  Opt_trace_stmt *current_trace;
> +  /*
> +    TRUE: if we allocate memory for list of traces
> +    FALSE: otherwise
> +  */
> +  bool inited;

'inited' is not necessary, it would be sufficient to check if 'traces' is
NULL. (On the other hand, why traces is a pointer?)

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog