maria-developers team mailing list archive

Thread
Date
Re: Review request: SHOW EXPLAIN, attempt 2

To: Sergei Petrunia <psergey@xxxxxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxxx>
Date: Tue, 7 Aug 2012 13:37:33 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20120717192852.GC6532@pslp2>
User-agent: Mutt/1.5.21 (2010-09-15)
Hi, Sergei!

On Jul 17, Sergei Petrunia wrote:
> > > === modified file 'sql/sql_join_cache.cc'
> > > --- sql/sql_join_cache.cc	2012-03-24 17:21:22 +0000
> > > +++ sql/sql_join_cache.cc	2012-05-16 20:59:03 +0000
> > > @@ -2236,7 +2236,7 @@ enum_nested_loop_state JOIN_CACHE::join_
> > >    
> > >    while (!(error= join_tab_scan->next()))   
> > >    {
> > > -    if (join->thd->killed)
> > > +    if (join->thd->check_killed())
> > 
> > I would recommend to rename thd->killed (for example, to thd->killed_flag)
> > just to make sure no code uses thd->killed the old way,
> > and - more importantly - that we won't get new thd->killed uses in
> > merges from MySQL.  
> 
> psergey:
> I've tried renaming. This required to make changes in 160 places, and was a
> 1600-line patch:
> http://s.petrunia.net/scratch/psergey-rename-killed-to-killed_flag.diff
> 
> Did you really intend to make such a big change?

Yes. Looking at this big patch I see that we have lots of places where
thd->killed is checked directly, instead of using thd->check_killed(),
that is, there can be long delays where APC requests won't be served
(despite the fact that KILL works there).

But if you feel uneasy doing this, let's create a separate jira task for
"refactor KILL to work via APC" and I will do it.

> > > === modified file 'sql/sql_lex.h'
> > > --- sql/sql_lex.h	2012-05-21 13:30:25 +0000
> > > +++ sql/sql_lex.h	2012-07-10 17:23:00 +0000
> > > @@ -2344,7 +2357,8 @@ struct LEX: public Query_tables_list
> > >    char *backup_dir;				/* For RESTORE/BACKUP */
> > >    char* to_log;                                 /* For PURGE MASTER LOGS TO */
> > >    char* x509_subject,*x509_issuer,*ssl_cipher;
> > > -  String *wild;
> > > +  String *wild; /* Wildcard in SHOW {something} LIKE 'wild'*/ 
> > > +  Item *show_explain_for_thread; /* id in SHOW EXPLAIN FOR id */
> > 
> > Why do extend LEX even more?
> > KILL, for example, uses LEX::value_list
> > 
> psergey:
> KILL does NOT use INFORMATION_SCHEMA. This is why it is able to use
> LEX::value_list.

Could you please explain, why using INFORMATION_SCHEMA prevents you
from using LEX::value_list ?

> > > === modified file 'sql/sql_parse.cc'
> > > --- sql/sql_parse.cc	2012-05-21 18:54:41 +0000
> > > +++ sql/sql_parse.cc	2012-07-10 17:23:00 +0000
> > > @@ -2143,6 +2144,32 @@ mysql_execute_command(THD *thd)
> > >      execute_show_status(thd, all_tables);
> > >      break;
> > >    }
> > > +  case SQLCOM_SHOW_EXPLAIN:
> > > +  {
> > > +    if (!thd->security_ctx->priv_user[0] &&
> > > +        check_global_access(thd,PROCESS_ACL))
> > > +      break;
> > > +
> > > +    /*
> > > +      The select should use only one table, it's the SHOW EXPLAIN pseudo-table
> > > +    */
> > > +    if (lex->sroutines.records || lex->query_tables->next_global)
> > > +    {
> > > +      my_error(ER_NOT_SUPPORTED_YET, MYF(0), "Usage of subqueries or stored "
> > > +               "function calls as part of this statement");
> > 
> > Ok. Here I'd suggest to create a new error message
> > 
> > ER_MSG_SUBQUERIES_OR_ROUTINES
> >   "Usage of subqueries or stored function calls as part of this statement"
> > 
> > and use it as
> > 
> >    my_error(ER_NOT_SUPPORTED_YET, MYF(0), ER(ER_MSG_SUBQUERIES_OR_ROUTINES));
> > 
> > also in SQLCOM_KILL and SQLCOM_CREATE_EVENT cases.
> > 
> psergey:
> I've changed SHOW EXPLAIN code not to use this error (it uses
> ER_SET_CONSTANTS_ONLY now). Is that ok?

Yes, thanks.

> > > === modified file 'sql/sql_select.cc'
> > > --- sql/sql_select.cc	2012-06-04 15:26:11 +0000
> > > +++ sql/sql_select.cc	2012-07-05 20:28:30 +0000
> > > @@ -928,6 +979,13 @@ bool JOIN::prepare_stage2()
> > >  }
> > >  
> > >  
> > > +int JOIN::optimize()
> > > +{
> > > +  int res= optimize_inner();
> > > +  if (!res)
> > > +    have_query_plan= QEP_AVAILABLE;
> > 
> > Why wouldn't you simply set have_query_plan from the old JOIN::optimize,
> > I mean, why did you have to move it to optimize_inner() ?
> > 
> > I thought it's because in some cases you need to invoke optimize_inner()
> > directly, without setting have_query_plan. But you don't seem to do it
> > anywhere.
> > 
> psergey:
> There are multiple exits from JOIN::optimize, and I wanted to cover them all
> with one statement.

I wouldn't do it that way, but it's your task.
Okay.

> > > === modified file 'sql/sql_show.cc'
> > > --- sql/sql_show.cc	2012-05-21 18:54:41 +0000
> > > +++ sql/sql_show.cc	2012-07-10 17:23:00 +0000
> > > @@ -1998,6 +1998,124 @@ void mysqld_list_processes(THD *thd,cons
> > >    DBUG_VOID_RETURN;
> > >  }
> > >  
> > > +
> > > +static 
> > > +const char *target_not_explainable_cmd="Target is not running EXPLAINable command";
> > > +
> > > +/*
> > > +  Store the SHOW EXPLAIN output in the temporary table.
> > > +*/
> > > +
> > > +int fill_show_explain(THD *thd, TABLE_LIST *table, COND *cond)
> > > +{
> > > +  const char *calling_user;
> > > +  THD *tmp;
> > > +  my_thread_id  thread_id;
> > > +  DBUG_ENTER("fill_show_explain");
> > > +
> > > +  DBUG_ASSERT(cond==NULL);
> > > +  thread_id= thd->lex->show_explain_for_thread->val_int();
> > > +  calling_user= (thd->security_ctx->master_access & PROCESS_ACL) ?  NullS :
> > > +                 thd->security_ctx->priv_user;
> > > +  /* 
> > > +    Find the thread we need EXPLAIN for. Thread search code was copied from
> > > +    kill_one_thread()
> > 
> > Perhaps, it'd make sense to extract this code to a separate function and
> > reuse it?
> >
> psergey:
> It's a 13-line loop, I don't think it makes sense to extract it into a
> separate function. The separate function will have a complex calling
> convention (it returns with acquired mutex), so I will end up adding 
> more code than I have removed.

I've just tried it in your tree (5.5-show-explain).
26 lines added, 34 lines removed. It adds less code than it removes :)
But more importantly, if something will need to be changed in this loop,
only one place will need to be changed.

Here's the function that I've used (copied from kill_one_thread):

  THD *find_thread_by_id(ulong id)
  {
    THD *tmp;
    mysql_mutex_lock(&LOCK_thread_count); // For unlink from list
    I_List_iterator<THD> it(threads);
    while ((tmp=it++))
    {
      if (tmp->command == COM_DAEMON)
        continue;
      if (tmp->thread_id == id)
      {
        mysql_mutex_lock(&tmp->LOCK_thd_data);    // Lock from delete
        break;
      }
    }
    mysql_mutex_unlock(&LOCK_thread_count);
    return tmp;
  }

> > > +  */
> > > +  mysql_mutex_lock(&LOCK_thread_count); // For unlink from list
> > > +  I_List_iterator<THD> it(threads);
> > > +  while ((tmp=it++))
> > > +  {
> > > === added file 'sql/my_apc.cc'
> > > --- sql/my_apc.cc	1970-01-01 00:00:00 +0000
> > > +++ sql/my_apc.cc	2012-07-10 17:23:00 +0000
...
> > > +/* 
> > > +  Enter ther state where the target is available for serving APC requests
> > > +*/
> > > +void Apc_target::enable()
> > > +{
> > > +  /* Ok to do without getting/releasing the mutex: */
> > > +  enabled++;
> > > +}
> > 
> > My comment in the first review was
> > 
> >   2. I don't see why you need enable/disable calls at all.
> >   You use that to disable SHOW EXPLAIN for certain parts of the code,
> >   but this is wrong. Apc is not SHOW EXPLAIN, you should not disable the
> >   whole apc mechanism only because SHOW EXPLAIN cannot be served.
> >   Instead the show-explain-generating callback function should
> >   detect when it can and cannot be used and react appropriately.
> > 
> > And your reply was
> > 
> >   (as discussed on the phone) will try to
> >   - get the APC mechanism to work always, including ability to wake up
> >   threads that are sleeping/waiting.
> >   - EXPLAIN will check whether the APC is available in the context of the
> >   target thread.
> 
> psergey:
> But haven't we later come to conclusion that it is not currently
> possible to implement? 
...
> Hence, we have decided not to implement always-wake-up system.

My recollection is that we agreed to ignore this limitation for now.
That is, in case of a THD waiting for network and a thread-pool
enabled, SHOW EXPLAIN will time out, but as SHOW EXPLAIN is useless
anyway for connections in the "idle" state, it's not a big deal.
When we'll have more APC users we'll get back to it.

But again, if you'd like, let's keep it your way, with enabled/disabled.
And move this change to the "refactor KILL to work via APC" task, that I
mentioned above.

> Your review is trying to pull stuff in different directions. On one
> hand, INFORMATION_SCHEMA support is added for querying EXPLAINs from
> many threads.  On the other hand, you want APC system to do the basic
> check of whether EXPLAIN can be produced in the context of the target
> thread. This way,
>   
>   SELECT * FROM information_schema.explain
> 
> will need to switch context to every thread on the server. Does this
> make any sense?

But you didn't create the I_S table, right? I mean, it's marked as
hidden and can not be selected from, only SHOW command works?

I was contemplating that when we need to send APC requests to many
threads at once, we could do it in parallel - post all requests, then
wait for all of them to complete. But there's no need to do it now -
it's a future extension, we'll consider it when we'll have a use case.

> > > === added file 'sql/my_apc.h'
> > > --- sql/my_apc.h	1970-01-01 00:00:00 +0000
> > > +++ sql/my_apc.h	2012-07-11 09:39:56 +0000
> > > @@ -0,0 +1,134 @@
... 
> I'm sorry, I do not understand the point of "Why did you not do X"
> questions.  There are nearly infinitely many things I could have done,
> but I needed to produce exactly one patch, and that required me to
> stop at some variant.  I stopped at a C-type function call. Per your
> request, I changed it to a C++ functor. What else do you need?
> 
> Does having four parameters to make_apc_call() function cause any
> defect? If yes, could you please explicitly mention it?

I'm just trying to understand your logic, when it is not obvious from
the code. If I'd wanted you to change something, I'd wrote "please
change this". But here I'm merely asking, I'm not trying to imply that
there are defects of that some other way of doing this piece of code
would be better. No hidden meaning :)

Regards,
Sergei
Follow ups

Re: Review request: SHOW EXPLAIN, attempt 2
From: Sergei Petrunia, 2012-08-08
References

Review request: SHOW EXPLAIN
From: Sergei Petrunia, 2012-06-19
Review request: SHOW EXPLAIN, attempt 2
From: Sergei Petrunia, 2012-07-11
Re: Review request: SHOW EXPLAIN, attempt 2
From: Sergei Golubchik, 2012-07-13
Re: Review request: SHOW EXPLAIN, attempt 2
From: Sergei Petrunia, 2012-07-17