maria-developers team mailing list archive

Re: crash in TC_LOG_MMAP::log_one_transaction in maria-10.0.0

 

Rich Prohaska <prohaska7@xxxxxxxxx> writes:

> I made a 10.0.0 branch on launchpad with tokudb in it.  The following
> can be used to hit the assert in TC_LOG_MMAP::log_one_transaction.  If

With the test case you supplied, the problem was easy to reproduce, thanks a
lot!

The attached patch fixes the issue for me. Please try it and let me know
whether it also solves the problem for you.

This bug (actually two bugs) is present in all versions of MariaDB. It may
only be possible to trigger in 10.0, or at least easier to trigger there,
since it depends on the order in which things happen during commit.

Thanks for finding and reporting this. As you have learned the hard way,
multi-engine XA transactions with no binlog are one of the less well-tested
parts of the server, but if you keep the bug reports coming we will do our
best to fix them ASAP.

 - Kristian.

=== modified file 'sql/log.cc'
--- sql/log.cc	2012-11-03 11:28:51 +0000
+++ sql/log.cc	2012-11-17 13:35:38 +0000
@@ -7398,8 +7398,9 @@ int TC_LOG_MMAP::open(const char *opt_na
 
   syncing= 0;
   active=pages;
+  DBUG_ASSERT(npages >= 2);
   pool=pages+1;
-  pool_last=pages+npages-1;
+  pool_last_ptr= &((pages+npages-1)->next);
   commit_ordered_queue= NULL;
   commit_ordered_queue_busy= false;
 
@@ -7432,8 +7433,8 @@ void TC_LOG_MMAP::get_active_from_pool()
   do
   {
     best_p= p= &pool;
-    if ((*p)->waiters == 0) // can the first page be used ?
-      break;                // yes - take it.
+    if ((*p)->waiters == 0 && (*p)->free > 0) // can the first page be used ?
+      break;                                  // yes - take it.
 
     best_free=0;            // no - trying second strategy
     for (p=&(*p)->next; *p; p=&(*p)->next)
@@ -7450,10 +7451,10 @@ void TC_LOG_MMAP::get_active_from_pool()
   mysql_mutex_assert_owner(&LOCK_active);
   active=*best_p;
 
-  if ((*best_p)->next)              // unlink the page from the pool
-    *best_p=(*best_p)->next;
-  else
-    pool_last=*best_p;
+  /* Unlink the page from the pool. */
+  if (!(*best_p)->next)
+    pool_last_ptr= best_p;
+  *best_p=(*best_p)->next;
   mysql_mutex_unlock(&LOCK_pool);
 
   mysql_mutex_lock(&active->lock);
@@ -7617,8 +7618,8 @@ int TC_LOG_MMAP::sync()
 
   /* page is synced. let's move it to the pool */
   mysql_mutex_lock(&LOCK_pool);
-  pool_last->next=syncing;
-  pool_last=syncing;
+  (*pool_last_ptr)=syncing;
+  pool_last_ptr=&(syncing->next);
   syncing->next=0;
   syncing->state= err ? PS_ERROR : PS_POOL;
   mysql_cond_signal(&COND_pool);           // in case somebody's waiting

=== modified file 'sql/log.h'
--- sql/log.h	2012-09-30 23:30:44 +0000
+++ sql/log.h	2012-11-16 12:40:36 +0000
@@ -145,7 +145,7 @@ class TC_LOG_MMAP: public TC_LOG
   my_off_t file_length;
   uint npages, inited;
   uchar *data;
-  struct st_page *pages, *syncing, *active, *pool, *pool_last;
+  struct st_page *pages, *syncing, *active, *pool, **pool_last_ptr;
   /*
     note that, e.g. LOCK_active is only used to protect
     'active' pointer, to protect the content of the active page
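
For readers unfamiliar with the idiom, here is a minimal stand-alone sketch of
the tail-link technique the patch switches to: instead of remembering the last
page (pool_last), the pool remembers the address of the null link that ends
the list (pool_last_ptr). The names Page, Pool, head, last_ptr, append and
unlink are hypothetical and exist only for this illustration; they are not
taken from sql/log.cc, and locking and page state are left out entirely.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for st_page; only the link field matters here.
struct Page {
  int id;
  Page *next;
};

// Hypothetical pool: 'head' plays the role of 'pool' in the patch and
// 'last_ptr' the role of 'pool_last_ptr'.
struct Pool {
  Page *head = nullptr;
  Page **last_ptr = &head;  // always points at the null link ending the list

  // Append at the tail. Works unchanged when the list is empty, because
  // last_ptr then points at 'head' itself.
  void append(Page *p) {
    p->next = nullptr;
    *last_ptr = p;
    last_ptr = &p->next;
  }

  // Unlink the page that the link 'pp' refers to ('pp' is either &head or
  // &prev->next). The page is always removed; if it was the last one, the
  // tail link moves back to 'pp'.
  Page *unlink(Page **pp) {
    Page *p = *pp;
    if (p->next == nullptr)
      last_ptr = pp;
    *pp = p->next;
    return p;
  }
};
```

With this shape, taking the only remaining page leaves head == nullptr and
last_ptr == &head, so a later append simply relinks the list from the head. As
far as can be read from the removed lines in get_active_from_pool() above, the
old code did not unlink the chosen page when it was the last one in the pool,
which is the case the new unconditional *best_p=(*best_p)->next handles.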

