← Back to team overview

maria-developers team mailing list archive

Re: Query Stuck in "closing table" State

 

I'm not closing them explicitly.

The process running on the machine does the following (summarized):

Management event:
STOP SLAVE 'slave-name';
FLUSH TABLES replicated_table;
DROP TABLE replicated_table_batch;
ALTER TABLE replicated_table RENAME TO replicated_table_batch;
CREATE TABLE replicated_table LIKE replicated_table_batch;
START SLAVE 'slave-name';

Processing thread (the one that gets stuck):
INSERT INTO accumulating_table
SELECT ... FROM replicated_table_batch ...
ON DUPLICATE KEY UPDATE ...

With a few triggers on accumulating_table running to update
a few other summary tables.

There is a status table that maintains state and makes sure the
replicated_table/replicated_table_batch doesn't get rotated while
the process is running.

The management event NEVER runs while the processing event is
running, so the table flushes it is doing shouldn't be
related (and if it was happening I'd expect to see in in the
show full processlist output, which is not the case).

As an aside, there are interesting side-effects of replication on
transactions when you start renaming tables. If you use an
InnoDB table, stopping the slave stops mid-transaction. After
the table rename, the outcome becomes highly "undefined". I
have not been able to establish if the transaction continues
on the old renamed table or on a new table with the same name
that has been created in the meantime. This throws the data
consistency out of sync if there are multiple processes running
on the rename target table as some transactions appear to arrive
after the rename.

When using a non-transactional table type (e.g. MyISAM) this
doesn't happen - slave thread knows to continue committing until
the end of transaction before returning. Arguably, this should
also happen for transaction-capable tables (or at least the
behaviour should be configurable in this regard. Similar for
ALTER TABLE ... RENAME TO ... - it should probably ensure a
clean break in the transaction commit charge so that pre-rename
transactions don't arrive after the rename has completed.

This may could be considered to be a bug, but perhaps it is
worthy of a discussion WRT what the correct behaviour _should_
be.

As for the variables - any particular variables you're
interested in? The full list is quite large.

Gordan

On Wed, 20 Mar 2013 14:47:29 +0100, Jesper Staun Hansen <jesper@xxxxxxxxxxxxxx> wrote:
Hello Gordan,

What does your system variables look like?
What I am wondering about myself is why you're even closing the tables.

Jesper

On 20-03-2013 14:25, Gordan Bobic wrote:
Just as an additional data point, the query does seem to eventually
complete (and the transaction it is in commits) after about 2 hours.
But 2 hours seems an excessively long time for "closing tables". It
is a simple INSERT ... SELECT ... ON DUPLICATE KEY UPDATE, with a few
triggers on the table being inserted into firing off. The SELECT on
it's own takes only seconds to complete. The UPDATEs fired off by the
triggers complete in milliseconds.

Gordan

On Wed, 20 Mar 2013 12:51:50 +0000, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
On Wed, 20 Mar 2013 13:20:28 +0100, Sergei Golubchik
<serg@xxxxxxxxxxxx> wrote:
Hi, Gordan!

On Mar 20, Gordan Bobic wrote:
> On Mar 20, Gordan Bobic wrote:
>> I have a situation with MariaDB 10.0.0 where an INSERT INTO ...
>>  SELECT query that takes 3 seconds to run ends up stuck in the
>> closing tables state for 15+ minutes while burning 100% of CPU.
>>
 Stack trace attached in a text file.
Does this provide enough information for a worthwhile bug report?

Was this stack trace created when you had your problem with a stuck
query?

Yes. And the same query is still stuck in the same state 35 minutes later.

I have attached a fresh stack trace.

Because it doesn't look like there's any thread stuck in the "closing
table" state.

In which case the state shown in "show full processlist" is incorrect.
Note that I think that helps.

There's one event running, a couple of idle connection threads, a couple of dead connection thread (waiting in the thread cache for resurrection),
and internal service threads, aria, innodb, signal thread, etc.

The one event that is running must be it, then. I specifically disabled everything else on the box (slaves, event schduler) to get a clean trace.

Gordan




_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp



References