maria-developers team mailing list archive

Re: [Merge] lp:~paul-mccullagh/maria/maria-pbxt-rc3 into lp:maria

 

Hi Sergei,

I have tested this, and it all works as you specified, but now I have found a problem with PBXT.

The problem stems from code in PBXT (in functions ha_pbxt::write_row, ha_pbxt::update_row, ha_pbxt::delete_row) which I have pasted below. The code is a hack!

According to the comment I had a problem with calling: trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton), to register the start of a transaction statement.

When I did this unconditionally at the start of a statement, ha_commit_trans() was not always called (and MySQL became confused, thinking a transaction was still running). On reading the MySQL code (a long time ago), I noticed that ha_commit_trans() was only guaranteed to be called when an update was performed.

So I delayed the trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton) call until write_row(), etc. was called.

The problem now is that MySQL fails to recognize PBXT transactions as XA (see ha_write_row below), because mark_trx_read_write() is called before write_row() (i.e. before the trans_register_ha() call).

I noticed that InnoDB and NDB call trans_register_ha() unconditionally at the start of a statement.

Could it be that the problem that led to my "hack" has been fixed?

Best regards,

Paul


PBXT code in ha_pbxt::write_row, etc.
------------------------------------------------------------
	/* GOTCHA: I have a huge problem with the transaction statement.
	 * It is not ALWAYS committed (I mean ha_commit_trans() is
	 * not always called - for example in SELECT).
	 *
	 * If I call trans_register_ha() but ha_commit_trans() is not called
	 * then MySQL thinks a transaction is still running (while
	 * I have committed the auto-transaction in ha_pbxt::external_lock()).
	 *
	 * This causes all kinds of problems, like transactions
	 * are killed when they should not be.
	 *
	 * To prevent this, I only inform MySQL that a transaction
	 * has been started when an update is performed. I have determined that
	 * ha_commit_trans() is only guaranteed to be called if an update is done.
	 */
	if (!pb_open_tab->ot_thread->st_stat_trans) {
		trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton);
		XT_PRINT0(pb_open_tab->ot_thread, "ha_pbxt::write_row trans_register_ha all=FALSE\n");
		pb_open_tab->ot_thread->st_stat_trans = TRUE;
	}


-------------------------------
int handler::ha_write_row(uchar *buf)
{
  int error;
  Log_func *log_func= Write_rows_log_event::binlog_row_logging_function;
  DBUG_ENTER("handler::ha_write_row");

  mark_trx_read_write();

  if (unlikely(error= write_row(buf)))
    DBUG_RETURN(error);
  if (unlikely(error= binlog_log_row(table, 0, buf, log_func)))
    DBUG_RETURN(error); /* purecov: inspected */
  DBUG_RETURN(0);
}
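
The ordering issue in the two snippets above can be illustrated with a toy model. This is only a sketch: ToyStatement, toy_register(), toy_mark_read_write() and toy_run_insert() are hypothetical stand-ins, not server or PBXT code, and it assumes that mark_trx_read_write() only has an effect on an engine that has already registered for the statement.

```cpp
#include <cassert>

// Toy model only: none of these types are the real MySQL/MariaDB classes.
struct ToyStatement {
    bool engine_registered = false;  // set by the stand-in for trans_register_ha()
    bool marked_read_write = false;  // set by the stand-in for mark_trx_read_write()
};

// Stand-in for trans_register_ha(thd, FALSE, hton).
inline void toy_register(ToyStatement &st) { st.engine_registered = true; }

// Stand-in for mark_trx_read_write().
// ASSUMPTION: it only affects an engine that is already registered.
inline void toy_mark_read_write(ToyStatement &st) {
    if (st.engine_registered)
        st.marked_read_write = true;
}

enum class RegisterAt { StatementStart, InsideWriteRow };

// Mirrors the call order in handler::ha_write_row():
// mark_trx_read_write() first, then write_row().
inline ToyStatement toy_run_insert(RegisterAt when) {
    ToyStatement st;
    if (when == RegisterAt::StatementStart)
        toy_register(st);            // registration at statement start
    toy_mark_read_write(st);         // called before write_row()
    if (when == RegisterAt::InsideWriteRow)
        toy_register(st);            // delayed registration, as in ha_pbxt::write_row
    return st;
}
```

In this model, toy_run_insert(RegisterAt::InsideWriteRow) ends with the engine registered but never marked read-write, which corresponds to the symptom described above; registering at statement start avoids it.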




On Dec 2, 2009, at 12:26 PM, Sergei Golubchik wrote:

Hi, Paul!

On Dec 02, Paul McCullagh wrote:

Now, if a binlog doesn't indicate a crash (last binlog file was
closed properly, showing a normal shutdown procedure) or, for
example, there're no binlog files at all, the server cannot perform
the recovery - it doesn't know what transactions should be committed
and what transactions should be rolled back. A transaction may be
prepared in one engine and already committed (or rolled back) in
another. In this case the server asks the user to make the
decision: the user has to request an explicit commit or rollback with
the --tc-heuristic-recover command-line switch.

So if I understand this correctly, tc-heuristic-recover handles a
situation that should actually never occur.

Yes, that's mainly to account for a user mistake - deleting binlog after
a crash or copying the live datadir from master to a slave without
binlog. Or something similar.

But if there's only one XA-capable storage engine, it's always safe to
roll back all prepared transactions, because there can be no other engine
that has them committed. That's what the ifdef-ed code does - if InnoDB
is the only XA-capable storage engine on recovery without binlog, it
forces a rollback of all uncommitted transactions, preserving the pre-XA
InnoDB behavior.

So, in fact, we could change this code:

to:

if (total_ha_2pc == (ulong) opt_bin_log+1) {
  tc_heuristic_recover= TC_HEURISTIC_RECOVER_ROLLBACK; // forcing ROLLBACK
  info.dry_run=FALSE;
}

I intentionally did it with ifdef - the idea was that if there could
*possibly* be another XA-engine, we need to go the safe way. Otherwise
one could, for example, restart the server with disabled pbxt and
unknowingly enable auto-rollback mode, causing inconsistent data.
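
The run-time check proposed above can be sketched as a small stand-alone function. This is only an illustration under stated assumptions: RecoveryAction and decide_recovery() are hypothetical names, and the sketch assumes that the count of 2PC participants includes the binlog itself when it is enabled, which is how the opt_bin_log+1 term in the condition reads.

```cpp
#include <cassert>

// Hypothetical names for illustration; not part of the server code.
enum class RecoveryAction {
    AskUser,        // user must decide via --tc-heuristic-recover
    ForceRollback   // safe to roll back all prepared transactions
};

// total_2pc:      number of 2PC participants known to the server
//                 (ASSUMPTION: includes the binlog when it is enabled)
// binlog_enabled: stand-in for opt_bin_log
inline RecoveryAction decide_recovery(unsigned long total_2pc, bool binlog_enabled) {
    const unsigned long binlog_count = binlog_enabled ? 1UL : 0UL;
    // Exactly one XA-capable storage engine besides the binlog:
    // no other engine can have committed the prepared transactions.
    if (total_2pc == binlog_count + 1)
        return RecoveryAction::ForceRollback;
    // More than one XA-capable engine is possible: take the safe path.
    return RecoveryAction::AskUser;
}
```

For example, decide_recovery(2, true) models InnoDB plus the binlog and yields ForceRollback, while decide_recovery(3, true) models a second XA-capable engine such as PBXT and yields AskUser. The sketch also shows the risk described above: disabling one engine at restart changes the count and silently switches the outcome.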

Regards / Mit vielen Grüßen,
Sergei

--
  __  ___     ___ ____  __
 /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <serg@xxxxxxx>
/ /|_/ / // /\ \/ /_/ / /__  Principal Software Engineer/Server Architect
/_/  /_/\_, /___/\___\_\___/  Sun Microsystems GmbH, HRB München 161028
       <___/                  Sonnenallee 1, 85551 Kirchheim-Heimstetten
Managing Directors: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Chairman of the Supervisory Board: Martin Häring



--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com





