maria-developers team mailing list archive

Thread
Date
Re: Review request: SHOW EXPLAIN, attempt 2

To: Sergei Golubchik <serg@xxxxxxxxxxxx>
From: Sergei Petrunia <psergey@xxxxxxxxxxxx>
Date: Wed, 8 Aug 2012 21:25:37 +0400
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20120807113733.GA28896@meddwl>
User-agent: Mutt/1.5.18 (2008-05-17)
Hi Sergei,

On Tue, Aug 07, 2012 at 01:37:33PM +0200, Sergei Golubchik wrote:
> On Jul 17, Sergei Petrunia wrote:
> > > > === modified file 'sql/sql_join_cache.cc'
> > > > --- sql/sql_join_cache.cc	2012-03-24 17:21:22 +0000
> > > > +++ sql/sql_join_cache.cc	2012-05-16 20:59:03 +0000
> > > > @@ -2236,7 +2236,7 @@ enum_nested_loop_state JOIN_CACHE::join_
> > > >    
> > > >    while (!(error= join_tab_scan->next()))   
> > > >    {
> > > > -    if (join->thd->killed)
> > > > +    if (join->thd->check_killed())
> > > 
> > > I would recommend to rename thd->killed (for example, to thd->killed_flag)
> > > just to make sure no code uses thd->killed the old way,
> > > and - more importantly - that we won't get new thd->killed uses in
> > > merges from MySQL.  
> > 
> > psergey:
> > I've tried renaming. This required to make changes in 160 places, and was a
> > 1600-line patch:
> > http://s.petrunia.net/scratch/psergey-rename-killed-to-killed_flag.diff
> > 
> > Did you really intend to make such a big change?
> 
> Yes. Looking at this big patch I see that we have lots of places where
> thd->killed is checked directly, instead of using thd->check_killed(),
> that is, there can be long delays where APC requests won't be served
> (despite the fact that KILL works there).
> 
> But if you feel uneasy doing this, let's create a separate jira task for
> "refactor KILL to work via APC" and I will do it.

Filed https://mariadb.atlassian.net/browse/MDEV-442

> > > > === modified file 'sql/sql_lex.h'
> > > > --- sql/sql_lex.h	2012-05-21 13:30:25 +0000
> > > > +++ sql/sql_lex.h	2012-07-10 17:23:00 +0000
> > > > @@ -2344,7 +2357,8 @@ struct LEX: public Query_tables_list
> > > >    char *backup_dir;				/* For RESTORE/BACKUP */
> > > >    char* to_log;                                 /* For PURGE MASTER LOGS TO */
> > > >    char* x509_subject,*x509_issuer,*ssl_cipher;
> > > > -  String *wild;
> > > > +  String *wild; /* Wildcard in SHOW {something} LIKE 'wild'*/ 
> > > > +  Item *show_explain_for_thread; /* id in SHOW EXPLAIN FOR id */
> > > 
> > > Why do extend LEX even more?
> > > KILL, for example, uses LEX::value_list
> > > 
> > psergey:
> > KILL does NOT use INFORMATION_SCHEMA. This is why it is able to use
> > LEX::value_list.
> 
> Could you please explain, why using INFORMATION_SCHEMA prevents you
> from using LEX::value_list ?
>
Ok, we've figured there's no valid reason. Started to use LEX::value_list.

> > > > === modified file 'sql/sql_parse.cc'
> > > > --- sql/sql_parse.cc	2012-05-21 18:54:41 +0000
> > > > +++ sql/sql_parse.cc	2012-07-10 17:23:00 +0000
> > > > @@ -2143,6 +2144,32 @@ mysql_execute_command(THD *thd)
> > > >      execute_show_status(thd, all_tables);
> > > >      break;
> > > >    }
> > > > +  case SQLCOM_SHOW_EXPLAIN:
> > > > +  {
> > > > +    if (!thd->security_ctx->priv_user[0] &&
> > > > +        check_global_access(thd,PROCESS_ACL))
> > > > +      break;
> > > > +
> > > > +    /*
> > > > +      The select should use only one table, it's the SHOW EXPLAIN pseudo-table
> > > > +    */
> > > > +    if (lex->sroutines.records || lex->query_tables->next_global)
> > > > +    {
> > > > +      my_error(ER_NOT_SUPPORTED_YET, MYF(0), "Usage of subqueries or stored "
> > > > +               "function calls as part of this statement");
> > > 
> > > Ok. Here I'd suggest to create a new error message
> > > 
> > > ER_MSG_SUBQUERIES_OR_ROUTINES
> > >   "Usage of subqueries or stored function calls as part of this statement"
> > > 
> > > and use it as
> > > 
> > >    my_error(ER_NOT_SUPPORTED_YET, MYF(0), ER(ER_MSG_SUBQUERIES_OR_ROUTINES));
> > > 
> > > also in SQLCOM_KILL and SQLCOM_CREATE_EVENT cases.
> > > 
> > psergey:
> > I've changed SHOW EXPLAIN code not to use this error (it uses
> > ER_SET_CONSTANTS_ONLY now). Is that ok?
> 
> Yes, thanks.
> 
> > > > === modified file 'sql/sql_select.cc'
> > > > --- sql/sql_select.cc	2012-06-04 15:26:11 +0000
> > > > +++ sql/sql_select.cc	2012-07-05 20:28:30 +0000
> > > > @@ -928,6 +979,13 @@ bool JOIN::prepare_stage2()
> > > >  }
> > > >  
> > > >  
> > > > +int JOIN::optimize()
> > > > +{
> > > > +  int res= optimize_inner();
> > > > +  if (!res)
> > > > +    have_query_plan= QEP_AVAILABLE;
> > > 
> > > Why wouldn't you simply set have_query_plan from the old JOIN::optimize,
> > > I mean, why did you have to move it to optimize_inner() ?
> > > 
> > > I thought it's because in some cases you need to invoke optimize_inner()
> > > directly, without setting have_query_plan. But you don't seem to do it
> > > anywhere.
> > > 
> > psergey:
> > There are multiple exits from JOIN::optimize, and I wanted to cover them all
> > with one statement.
> 
> I wouldn't do it that way, but it's your task.
> Okay.
> 
> > > > === modified file 'sql/sql_show.cc'
> > > > --- sql/sql_show.cc	2012-05-21 18:54:41 +0000
> > > > +++ sql/sql_show.cc	2012-07-10 17:23:00 +0000
> > > > @@ -1998,6 +1998,124 @@ void mysqld_list_processes(THD *thd,cons
> > > >    DBUG_VOID_RETURN;
> > > >  }
> > > >  
> > > > +
> > > > +static 
> > > > +const char *target_not_explainable_cmd="Target is not running EXPLAINable command";
> > > > +
> > > > +/*
> > > > +  Store the SHOW EXPLAIN output in the temporary table.
> > > > +*/
> > > > +
> > > > +int fill_show_explain(THD *thd, TABLE_LIST *table, COND *cond)
> > > > +{
> > > > +  const char *calling_user;
> > > > +  THD *tmp;
> > > > +  my_thread_id  thread_id;
> > > > +  DBUG_ENTER("fill_show_explain");
> > > > +
> > > > +  DBUG_ASSERT(cond==NULL);
> > > > +  thread_id= thd->lex->show_explain_for_thread->val_int();
> > > > +  calling_user= (thd->security_ctx->master_access & PROCESS_ACL) ?  NullS :
> > > > +                 thd->security_ctx->priv_user;
> > > > +  /* 
> > > > +    Find the thread we need EXPLAIN for. Thread search code was copied from
> > > > +    kill_one_thread()
> > > 
> > > Perhaps, it'd make sense to extract this code to a separate function and
> > > reuse it?
> > >
> > psergey:
> > It's a 13-line loop, I don't think it makes sense to extract it into a
> > separate function. The separate function will have a complex calling
> > convention (it returns with acquired mutex), so I will end up adding 
> > more code than I have removed.
> 
> I've just tried it in your tree (5.5-show-explain).
> 26 lines added, 34 lines removed. It adds less code than it removes :)
> But more importantly, if something will need to be changed in this loop,
> only one place will need to be changed.
> 
> Here's the function that I've used (copied from kill_one_thread):
> 
>   THD *find_thread_by_id(ulong id)
>   {
>     THD *tmp;
>     mysql_mutex_lock(&LOCK_thread_count); // For unlink from list
>     I_List_iterator<THD> it(threads);
>     while ((tmp=it++))
>     {
>       if (tmp->command == COM_DAEMON)
>         continue;
>       if (tmp->thread_id == id)
>       {
>         mysql_mutex_lock(&tmp->LOCK_thd_data);    // Lock from delete
>         break;
>       }
>     }
>     mysql_mutex_unlock(&LOCK_thread_count);
>     return tmp;
>   }
>

Ok, thanks. I have put the function in.

> > > > +  */
> > > > +  mysql_mutex_lock(&LOCK_thread_count); // For unlink from list
> > > > +  I_List_iterator<THD> it(threads);
> > > > +  while ((tmp=it++))
> > > > +  {
> > > > === added file 'sql/my_apc.cc'
> > > > --- sql/my_apc.cc	1970-01-01 00:00:00 +0000
> > > > +++ sql/my_apc.cc	2012-07-10 17:23:00 +0000
> ...
> > > > +/* 
> > > > +  Enter ther state where the target is available for serving APC requests
> > > > +*/
> > > > +void Apc_target::enable()
> > > > +{
> > > > +  /* Ok to do without getting/releasing the mutex: */
> > > > +  enabled++;
> > > > +}
> > > 
> > > My comment in the first review was
> > > 
> > >   2. I don't see why you need enable/disable calls at all.
> > >   You use that to disable SHOW EXPLAIN for certain parts of the code,
> > >   but this is wrong. Apc is not SHOW EXPLAIN, you should not disable the
> > >   whole apc mechanism only because SHOW EXPLAIN cannot be served.
> > >   Instead the show-explain-generating callback function should
> > >   detect when it can and cannot be used and react appropriately.
> > > 
> > > And your reply was
> > > 
> > >   (as discussed on the phone) will try to
> > >   - get the APC mechanism to work always, including ability to wake up
> > >   threads that are sleeping/waiting.
> > >   - EXPLAIN will check whether the APC is available in the context of the
> > >   target thread.
> > 
> > psergey:
> > But haven't we later come to conclusion that it is not currently
> > possible to implement? 
> ...
> > Hence, we have decided not to implement always-wake-up system.
> 
> My recollection is that we agreed to ignore this limitation for now.
> That is, in case of a THD waiting for network and a thread-pool
> enabled, SHOW EXPLAIN will time out, but as SHOW EXPLAIN is useless
> anyway for connections in the "idle" state, it's not a big deal.
> When we'll have more APC users we'll get back to it.
> 
> But again, if you'd like, let's keep it your way, with enabled/disabled.
> And move this change to the "refactor KILL to work via APC" task, that I
> mentioned above.

Done, added to task.

> > Your review is trying to pull stuff in different directions. On one
> > hand, INFORMATION_SCHEMA support is added for querying EXPLAINs from
> > many threads.  On the other hand, you want APC system to do the basic
> > check of whether EXPLAIN can be produced in the context of the target
> > thread. This way,
> >   
> >   SELECT * FROM information_schema.explain
> > 
> > will need to switch context to every thread on the server. Does this
> > make any sense?
> 
> But you didn't create the I_S table, right? I mean, it's marked as
> hidden and can not be selected from, only SHOW command works?
> 
Yes. The table is currently marked as hidden.

> I was contemplating that when we need to send APC requests to many
> threads at once, we could do it in parallel - post all requests, then
> wait for all of them to complete. But there's no need to do it now -
> it's a future extension, we'll consider it when we'll have a use case.
> 
> > > > === added file 'sql/my_apc.h'
> > > > --- sql/my_apc.h	1970-01-01 00:00:00 +0000
> > > > +++ sql/my_apc.h	2012-07-11 09:39:56 +0000
> > > > @@ -0,0 +1,134 @@
> ... 
> > I'm sorry, I do not understand the point of "Why did you not do X"
> > questions.  There are nearly infinitely many things I could have done,
> > but I needed to produce exactly one patch, and that required me to
> > stop at some variant.  I stopped at a C-type function call. Per your
> > request, I changed it to a C++ functor. What else do you need?
> > 
> > Does having four parameters to make_apc_call() function cause any
> > defect? If yes, could you please explicitly mention it?
> 
> I'm just trying to understand your logic, when it is not obvious from
> the code. If I'd wanted you to change something, I'd wrote "please
> change this". But here I'm merely asking, I'm not trying to imply that
> there are defects of that some other way of doing this piece of code
> would be better. No hidden meaning :)
> 

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
References

Review request: SHOW EXPLAIN
From: Sergei Petrunia, 2012-06-19
Review request: SHOW EXPLAIN, attempt 2
From: Sergei Petrunia, 2012-07-11
Re: Review request: SHOW EXPLAIN, attempt 2
From: Sergei Golubchik, 2012-07-13
Re: Review request: SHOW EXPLAIN, attempt 2
From: Sergei Petrunia, 2012-07-17
Re: Review request: SHOW EXPLAIN, attempt 2
From: Sergei Golubchik, 2012-08-07