maria-discuss team mailing list archive

Re: Doubts about Thread Pool

 

Hi Michael!

2013/9/16 Michael Paulini <michael.paulini@xxxxxxxxxxxx>

> Hi Roberto,
>
> I do believe the idea of the thread pool was to get rid of the one
> thread/connection paradigm, so all connections will be served by
> potentially all threads.
>

Yes, the idea of the thread pool described at
https://mariadb.com/kb/en/thread-pool-in-mariadb-51-53/ and
https://mariadb.com/kb/en/threadpool-in-55/ is fine: handle many connections
with fewer threads and make better use of hardware resources (with some
problems around metadata locks, deadlocks, and other locks).
But let me explain my doubts...
*1)* What is the internal name of this "task selector"? In the source (and in
worklog 246) I see a scheduler.cc file, so should I call it the thread scheduler?
From that file there are three kinds of schedulers: "one-thread-per-connection",
"pool-of-threads" and "no-threads", right?
I think "scheduler" is the right name for the part of the code that selects
which query will run, isn't it? I will use that name in this email.

*2)* What is the internal name, or maybe the complete name, of the ID column
of the processlist?
Internally I know it is the "->thread_id" member of the THD class, from
sql_show.cc:

      /* ID */
      table->field[0]->store((longlong) tmp->thread_id, TRUE);

I know that the processlist is very old code, maybe from MySQL 3.23, and at
that time we had only threads and no thread pool.
Now we have three schedulers, and maybe ->thread_id is an old variable name
that should really be called "connection_id".
But renaming it would be a big patch with no reward, and with the cost of
breaking plugins and other external tools that use the THD class.
Am I right?

If yes, maybe we could add more information to
information_schema.processlist, with a column comment giving the real
meaning, just to remove the wrong idea about "ID". Something like:

CREATE TABLE `PROCESSLIST` (
  `ID` BIGINT(4) NOT NULL DEFAULT '0'
    COMMENT "INTERNAL CONNECTION ID or something better?",
  `QUERY_ID` BIGINT(4) NOT NULL DEFAULT '0',
  `USER` VARCHAR(128) NOT NULL DEFAULT '',
  `HOST` VARCHAR(64) NOT NULL DEFAULT '',
  `DB` VARCHAR(64) NULL DEFAULT NULL,
  `COMMAND` VARCHAR(16) NOT NULL DEFAULT '',
  `TIME` INT(7) NOT NULL DEFAULT '0',
  `STATE` VARCHAR(64) NULL DEFAULT NULL,
  `INFO` LONGTEXT NULL,
  `TIME_MS` DECIMAL(22,3) NOT NULL DEFAULT '0.000',
  `STAGE` TINYINT(2) NOT NULL DEFAULT '0',
  `MAX_STAGE` TINYINT(2) NOT NULL DEFAULT '0',
  `PROGRESS` DECIMAL(7,3) NOT NULL DEFAULT '0.000',
  `MEMORY_USED` INT(7) NOT NULL DEFAULT '0',
  `EXAMINED_ROWS` INT(7) NOT NULL DEFAULT '0'
) COLLATE='utf8_general_ci' ENGINE=Aria;


*3)* Once ID is better explained in the processlist, could we show more
information?
I don't know if the processlist table is the right place; maybe another
table (a thread_pool table) is better. Here is what I mean...

I need information about the thread pool groups described in the "Threadpool
implementation on Unix." worklog (maybe the Windows implementation too, but
let's look at Unix for now), so I can see what each thread pool group is doing.

From the worklog we have this information (I will mark what I think is important):

Each *group* in itself is a complete small pool with a *listener
thread* (the one waiting for network events), work queue (is it a
thread or a shared memory?) and worker threads (see the "s"? threads,
more than one worker thread).

A group has the responsibility of keeping one thread running, if there
is a work to be done. More than one thread in a group can be running,
depending on circumstances (more about this later).
Clients are *assigned to the groups in a round-robin fashion* *(here is
my doubt: which client is running in which thread pool group?)*. This
will keep (statistically) about the same ratio of clients per group.
Listener and worker roles are dynamically assigned. Listener can
become worker, after waiting for network events; it can pick an event
and handle thus it becoming a worker. Vice versa, once worker is
finished processing a query, it can become listener.
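
To make the round-robin part of the worklog text concrete, here is a minimal sketch of how new connections could be spread over a fixed number of groups. It is only an illustration under assumptions: GroupAssigner, assign() and counts are hypothetical names invented for this example and are not taken from the MariaDB source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of round-robin group assignment.
// This is NOT MariaDB's actual threadpool code, only an illustration.
struct GroupAssigner {
    explicit GroupAssigner(std::size_t group_count)
        : counts(group_count, 0), next(0) {
        assert(group_count > 0);  // at least one group is required
    }

    // Returns the group index for a new connection, records it in the
    // per-group counter, and advances the cursor to the following group.
    std::size_t assign() {
        std::size_t group = next;
        next = (next + 1) % counts.size();
        ++counts[group];
        return group;
    }

    std::vector<std::uint64_t> counts;  // connections assigned per group
    std::size_t next;                   // group the next connection goes to
};
```

With 4 groups, ten connections would land in groups 0, 1, 2, 3, 0, 1, ... so each group ends up with about the same number of clients. The group chosen for a connection is exactly the piece of information that is not visible from the processlist today.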

wow *-*! It's very interesting code; schedulers are something that I
really like =)
Now... what is the pool doing? I'm thinking of something like an
information schema table to help here. Check:

CREATE TABLE information_schema.THREAD_POOL (
  thread_pool_group BIGINT NOT NULL DEFAULT 0,
  -- not the THD->thread_id variable; this is the real system thread, maybe a
  -- listener, a worker or a work queue (is the work queue a thread?)
  thread_id BIGINT NOT NULL DEFAULT 0,
  -- shows what this thread does in this thread pool group
  work ENUM('listener','worker','queue') NOT NULL DEFAULT 'queue',
  -- this is the THD->thread_id variable
  connection_id BIGINT DEFAULT 0

  -- other columns? maybe timeouts and other information from the internal
  -- scheduler? maybe timers for Windows or other variables... I must check the
  -- scheduler code
);


Reading Mikael's post:
http://mikaelronstrom.blogspot.com.br/2011/10/mysql-thread-pool-information-schema.html
I think I'm not wrong to propose new tables...


thanks guys


-- 
Roberto Spadim
SPAEmpresarial
