
maria-developers team mailing list archive

Re: MDEV-5019 - THREADPOOL - Create Information Schema Table for Threadpool

 

 
 
From: Roberto Spadim [mailto:roberto@xxxxxxxxxxxxx] 
Sent: Friday, 20 September 2013 21:12
To: Vladislav Vaintroub
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Maria-developers] MDEV-5019 - THREADPOOL - Create Information
Schema Table for Threadpool
 
Hi!
Some read/study i'm doing about thread pool to understand what could be
exposed with information_schema tables...
check if i'm grouping ("creating tables") with the right information...
i'm reading threadgroup_unix.cc and others threadgroup* files to start this
work
 
I strongly suggest reading the worklog, before jumping straight to the code.
It is not trivial, so reading first can save time.
http://worklog.askmonty.org/worklog/Server-BackLog/?tid=246 
 
-----
from your mail Vladislav:
"Global all_groups array contains  all thread groups ."
 
=>      static thread_group_t all_groups[MAX_THREAD_GROUPS];
=>      static uint group_count;  
                  "group_count" is the max value of a loop using all_groups
var?
                  something related to "threadpool_max_threads", with a
limit of group_count<MAX_THREAD_GROUPS?
          
"Every thread_group_t has list of 
          waiting threads , called "waiting_threads",  
          and queue of not yet handled requests, called "queue" 
          (request is represented by connection_t ), 
          a listener  etc."
a request is a "php mysql_connect"? each new tcp/ip connection create a new
request?
 
Usually, request is an SQL query . More strictly , request in this context
is a network packet from client (it can be an SQL query, QUIT packet that
informs server that connection is about to be terminated, one of the
handshake packets during connection establishment, etc)
 
 
 
=>      struct thread_group_t
          {
           mysql_mutex_t mutex;
           connection_queue_t queue;
           worker_list_t waiting_threads;
           worker_thread_t *listener;       (listener thread? maybe it have
a THD and we could show the query_id/thread_id or some information?)
No it does not have THD. It is an OS thread waiting for network events.
 
           pthread_attr_t *pthread_attr;   (what is pthread_attr? i didn't
found in threadpool*)
It is not important for the discussion. You can find it in Unix man pages
 
           int  pollfd;                             (a fd to kevent and
others libs?)
Yes. Kevent, epoll, etc all have a special file descriptor. Listener thread
waits on it.
 
           int  thread_count;                  (number of threads in this
thread group?!)
Yes
           int  active_thread_count;        (active threads running this
thread group?!)
Yes
           int  connection_count;           (active connections in this
thread group?!)
No, all connections, idle or active. Connection is bound to thread group.
 
           int io_event_count;                (io event count? what is this?
network io?)
Yes. It is used to avoid stalls. Please look in the code to see how it is
used.
 
            int queue_event_count;          (queue event count? maybe the
COUNT(*) for connection_queue_t queue ?)
Also used to avoid stalls. It is the number of connections that were
explicitly added to the "queue". This only happens during the connect
phase, when the polling thread (a dedicated MySQL thread that only handles
new connections) adds a new connection to the queue.
 
           ulonglong last_thread_creation_time;   (last thread create time,
it's a unixtimestamp * 1.000.000 (us)? )
Internal statistics, look how it is used. The idea is not to create too many
threads too quickly, and last thread creation time tells you if you're
creating threads too quickly.
 
           int  shutdown_pipe[2];           (maybe rpc to call a server
shutdown?)
This is a pipe, to wake listener thread for shut down. 
 
           bool shutdown;                     (shutdown information?)
true if group is shutdown
 
           bool stalled;                         (hum... stall (nice name),
well i must study about threadpool stall yet, but i think it's a nice
information to report to DBA =] )
used to determine stalls.
           
          } MY_ALIGNED(512);
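
To make the shutdown_pipe answer above more concrete, here is a minimal
sketch of the usual "self-pipe" wake-up pattern. It is not the server's
code: it uses plain poll() instead of epoll/kevent, and the names
listener_group, group_init, request_shutdown and listener_wait are made up
for this illustration.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical, stripped-down stand-in for the relevant part of
// thread_group_t.
struct listener_group
{
  int shutdown_pipe[2]; // [0] = read end, [1] = write end
};

// Create the pipe. Returns false if pipe() fails.
bool group_init(listener_group *g)
{
  assert(g != nullptr);
  return pipe(g->shutdown_pipe) == 0;
}

// Called by whoever wants the listener to stop: writing one byte is
// enough to make the read end readable.
bool request_shutdown(listener_group *g)
{
  assert(g != nullptr);
  char c= 0;
  return write(g->shutdown_pipe[1], &c, 1) == 1;
}

// What a listener does: block until either the client descriptor or the
// shutdown pipe becomes readable (or the timeout passes).
// Returns true if it was woken for shutdown.
// A negative client_fd is ignored by poll(), which is handy for testing.
bool listener_wait(listener_group *g, int client_fd, int timeout_ms)
{
  assert(g != nullptr);
  struct pollfd fds[2];
  fds[0].fd= g->shutdown_pipe[0];
  fds[0].events= POLLIN;
  fds[0].revents= 0;
  fds[1].fd= client_fd;
  fds[1].events= POLLIN;
  fds[1].revents= 0;
  int n= poll(fds, 2, timeout_ms);
  return n > 0 && (fds[0].revents & POLLIN) != 0;
}
```

The point of the pipe is that the listener only ever sleeps on file
descriptors, so the shutdown request has to arrive as a readable
descriptor too.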
 
 
 
I don't know yet how   I_P_List<> and I_P_List_adapter<> work, but i will
search about it in code... it's like a C++ object with ->length and others
easy to use tools? something like "foreach" (connection_queue_t as xxx) and
interact with each connection_t inside connection_queue_t using xxx?
 
You can traverse the list with an iterator. Something like 
 
connection_queue_t::Iterator it(group->queue);
connection_t *con;
while((con= it++)) {
   // use con somehow, e.g con->thd
}
 
 
 
          typedef I_P_List<connection_t,
                     I_P_List_adapter<connection_t,
                                      &connection_t::next_in_queue,
                                      &connection_t::prev_in_queue>,
                     I_P_List_null_counter,
                     I_P_List_fast_push_back<connection_t> >
=>      connection_queue_t;
          
          typedef I_P_List<worker_thread_t,
I_P_List_adapter<worker_thread_t,
                 &worker_thread_t::next_in_list,
                 &worker_thread_t::prev_in_list> 
                 >
=>      worker_list_t;
 
---
=>      struct worker_thread_t
          {
           ulonglong  event_count; /* number of request handled by this
thread */
           thread_group_t* thread_group;   (it's point to thread_group
inside all_groups? does it have an ID about what index of all_groups we
are?)
You can calculate its offset in the all_groups array, e.g.
(thread_group - all_groups). Pointer subtraction already yields the element
index, so no division by sizeof(thread_group_t) is needed.
 
           worker_thread_t *next_in_list;
           worker_thread_t **prev_in_list;
           
           mysql_cond_t  cond;  (what is this?)
condition variable. A waiting thread waits on condition variable
           bool          woken;       (what is this?)
avoiding spurious wakeups, so if thread wakes up and "woken" is set, then it
is really woken
 
          };
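
The cond/woken pair described above is the standard way to guard against
spurious wakeups. A minimal sketch, using std::condition_variable and
std::mutex as stand-ins for mysql_cond_t and mysql_mutex_t (the names
worker_sketch, wait_until_woken and wake are invented here, and resetting
the flag after the wait is my assumption):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for worker_thread_t: one condition variable and
// one flag per waiting thread.
struct worker_sketch
{
  std::condition_variable cond;
  bool woken= false;
};

// Waiting side: loop until the flag is set. If wait() returns although
// nobody signalled (a spurious wakeup), the flag is still false and the
// thread simply goes back to sleep.
void wait_until_woken(worker_sketch *w, std::mutex *group_mutex)
{
  assert(w != nullptr && group_mutex != nullptr);
  std::unique_lock<std::mutex> lock(*group_mutex);
  while (!w->woken)
    w->cond.wait(lock);
  w->woken= false; // reset for the next wait (assumption of this sketch)
}

// Waking side: set the flag under the same mutex, then signal.
void wake(worker_sketch *w, std::mutex *group_mutex)
{
  assert(w != nullptr && group_mutex != nullptr);
  {
    std::lock_guard<std::mutex> lock(*group_mutex);
    w->woken= true;
  }
  w->cond.notify_one();
}
```

Because the flag is checked under the mutex, it does not matter whether
wake() runs before or after the waiter goes to sleep.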
 
=>      struct connection_t
          {
 
           THD *thd;
           thread_group_t *thread_group;
           connection_t *next_in_queue;
           connection_t **prev_in_queue;
           ulonglong abs_wait_timeout;    (what is this?)
 
Look at how wait_timeout is handled; it needs special handling in the
threadpool. There is a timer thread that wakes up periodically, or when the
first timer expires. Then all connections are examined to see whether their
wait timeout has expired. If so, the "expired" connection is shut down.
 
           bool logged_in;     (hummm waiting password?)
Sorta, waiting for the handshake response from client.
 
           bool bound_to_poll_descriptor;    (what is this?)
Internal stuff, ignore
 
           bool waiting;                              (connection in waiting
queue state?)
Waiting for some internal mutex (row lock, table lock, stuff like that). Or
inside SELECT SLEEP(N)
          };
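
The timer logic described for abs_wait_timeout can be sketched roughly as
a single scan over the connections. This is only an illustration: the
types conn_sketch and scan_result, the function scan_timeouts, and the
choice of microseconds as the unit are assumptions, not taken from the
server source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for connection_t; only the field discussed above.
struct conn_sketch
{
  int id;
  uint64_t abs_wait_timeout; // absolute deadline (microseconds assumed)
};

struct scan_result
{
  std::vector<int> expired; // ids whose deadline has passed
  uint64_t next_wakeup;     // earliest remaining deadline,
                            // UINT64_MAX if there is none
};

// One pass of the timer thread: collect expired connections and work out
// when the timer needs to wake up next.
scan_result scan_timeouts(const std::vector<conn_sketch> &conns,
                          uint64_t now)
{
  scan_result r;
  r.next_wakeup= UINT64_MAX;
  for (const conn_sketch &c : conns)
  {
    if (c.abs_wait_timeout <= now)
      r.expired.push_back(c.id);
    else if (c.abs_wait_timeout < r.next_wakeup)
      r.next_wakeup= c.abs_wait_timeout;
  }
  return r;
}
```

The caller would shut down every connection in "expired" and sleep until
next_wakeup (or its regular period, whichever comes first).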
 
=>      pthread_attr_t ??? (didn't found in threadpool* files, i will search
with time about it, maybe something to linux/unix pthread lib?)
 
Not interesting in our discussion. It is OS structure, opaque to its users.
Used to set thread stack size.
          
============================================================================
 
well now i'm thinking right about information_schema tables using the
variables from the top of this email...
<skip>
Here is what I could think of 
 
Threadpool_Threads (combined from all waiting_lists in groups and "thread"
list):  
 thread id , group id,THD id (0 if currently idle), event_count,
is_listener, is_waiting
(we do not store OS thread id btw, because there was no need, you could use
address of worker_thread_t struct at least temporarily)
 
Threadpool_pending_requests (combined from all "queue" lists in groups): THD
id, group id
 
Threadpool_Group_info (from thread_group_t struct) : group id, thread_count,
active_thread_count, connection count, microseconds since last thread
creation
 
I do not think  there is much more interesting info to show. 
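
As a rough idea of how the rows for Threadpool_Group_info could be
collected, here is a sketch that snapshots each group's counters under its
mutex. Everything in it is hypothetical: group_counters stands in for
thread_group_t, group_info_row and collect_group_info are invented names,
and the real information_schema fill function would look different.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for thread_group_t with only the counters the
// proposed table would expose.
struct group_counters
{
  std::mutex mutex;
  int thread_count= 0;
  int active_thread_count= 0;
  int connection_count= 0;
  uint64_t last_thread_creation_time= 0; // microseconds (assumed)
};

// One row of the proposed Threadpool_Group_info table.
struct group_info_row
{
  unsigned group_id;
  int thread_count;
  int active_thread_count;
  int connection_count;
  uint64_t usec_since_last_thread_creation;
};

std::vector<group_info_row> collect_group_info(group_counters *groups,
                                               unsigned group_count,
                                               uint64_t now_usec)
{
  assert(groups != nullptr || group_count == 0);
  std::vector<group_info_row> rows;
  rows.reserve(group_count);
  for (unsigned i= 0; i < group_count; i++)
  {
    group_counters &g= groups[i];
    // Hold the group mutex so the counters in one row are consistent.
    std::lock_guard<std::mutex> lock(g.mutex);
    group_info_row row;
    row.group_id= i;
    row.thread_count= g.thread_count;
    row.active_thread_count= g.active_thread_count;
    row.connection_count= g.connection_count;
    // Clamp to 0 if the clock reading is older than the stored time.
    row.usec_since_last_thread_creation=
      now_usec > g.last_thread_creation_time
        ? now_usec - g.last_thread_creation_time : 0;
    rows.push_back(row);
  }
  return rows;
}
```

Each group is locked only while its own row is copied, so the table is
consistent per row but not across groups.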
 
something that i didn't understand yet...
for example... if i got a waiting_thread, how i know what THD or what
'worker' thread will "do the job" of this query? i know it's too fast that i
will not get it with information_schema, but it's a deterministic function
right? 
No, you do not know that for sure. The logic is rather complicated, and you
should read the worklog to understand how it works. But threads are woken in
LIFO order, and requests are processed in FIFO order, so you can imagine
that the last thread in waiting_threads will be woken and take the first
request from queue. More often than not though, listener thread will do the
job
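
The LIFO/FIFO pairing can be pictured with a toy model. This is only a
sketch with plain integers instead of worker_thread_t and connection_t,
and with invented names (pool_sketch, park_thread, enqueue_request,
dispatch_one). It does not model the listener picking up the request
itself, which is the more common case.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy model of one thread group.
struct pool_sketch
{
  std::vector<int> waiting_threads; // back = most recently parked thread
  std::deque<int> queue;            // front = oldest pending request
};

void park_thread(pool_sketch *p, int thread_id)
{
  p->waiting_threads.push_back(thread_id);
}

void enqueue_request(pool_sketch *p, int request_id)
{
  p->queue.push_back(request_id);
}

struct dispatch
{
  int thread_id;
  int request_id;
};

// Wake the last parked thread (LIFO) and hand it the oldest request
// (FIFO). Returns false if there is no waiting thread or no request.
bool dispatch_one(pool_sketch *p, dispatch *out)
{
  assert(p != nullptr && out != nullptr);
  if (p->waiting_threads.empty() || p->queue.empty())
    return false;
  out->thread_id= p->waiting_threads.back();
  p->waiting_threads.pop_back();
  out->request_id= p->queue.front();
  p->queue.pop_front();
  return true;
}
```

With threads 1, 2, 3 parked in that order and requests 10, 11 queued in
that order, the first dispatch pairs thread 3 with request 10, the second
pairs thread 2 with request 11.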
 
 
about the thread pool work... if the query is attached to one thread_group,
it could 'jump' to another thread group? the "worker thread" is outside from
thread group and at end of the execution it will realloc the connection to
another thread_group?
 
A connection does not usually change its group, but it can when someone
changes thread_pool_size online. The connection with id N belongs to
group N % thread_pool_size.
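
A small sketch of that mapping, together with the reverse step of getting
a group id back from a group pointer. It assumes the modulus is the current
number of thread groups (the thread_pool_size setting). The names
group_sketch, all_groups_sketch, group_count_sketch, group_for_connection
and group_id_of are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for thread_group_t.
struct group_sketch
{
  int connection_count;
};

const unsigned MAX_GROUPS_SKETCH= 8;
static group_sketch all_groups_sketch[MAX_GROUPS_SKETCH];
// Plays the role of the current number of groups (thread_pool_size).
static unsigned group_count_sketch= 4;

// Connection id N is served by group N % (number of groups).
group_sketch *group_for_connection(unsigned long connection_id)
{
  assert(group_count_sketch > 0 &&
         group_count_sketch <= MAX_GROUPS_SKETCH);
  return &all_groups_sketch[connection_id % group_count_sketch];
}

// Subtracting two pointers into the same array already counts elements,
// so the result is the index in the array.
std::ptrdiff_t group_id_of(const group_sketch *g)
{
  return g - all_groups_sketch;
}
```

With 4 groups, connection 5 lands in group 1. If the number of groups is
changed to 8, connection 7 maps to group 7 instead of 3, which is why a
connection may have to move when the setting changes online.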
 
if anyone have more ideas about this information tables, and where i could
get information about each column, please reply with the answer or with
ideas =)
thanks guys!
