[Bug 941176] Re: slave error doesn't change sys_replication.applier_state
** Changed in: drizzle
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of UBUNTU -
AL - BR, which is subscribed to Drizzle.
https://bugs.launchpad.net/bugs/941176
Title:
slave error doesn't change sys_replication.applier_state
Status in A Lightweight SQL Database for Cloud Infrastructure and Web Applications:
Fix Released
Bug description:
sys_replication.applier_state is not always correct; it can show
RUNNING even though a slave error has broken replication.
To reproduce:
1. Start a master and create a schema called "crash".
2. Start a slave with max-commit-id set to the tx id of that event on
the master, so the slave does *not* create the crash schema.
3. DROP SCHEMA crash; on the master (the steps are sketched below).
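For concreteness, a minimal sketch of those three steps; the
max-commit-id option name is taken from this report, so its exact
spelling in your slave config may differ:
-- on the master
drizzle> CREATE SCHEMA crash;
-- note the commit id of that transaction, then start the slave with
-- max-commit-id set to that id so it never applies the CREATE
drizzle> DROP SCHEMA crash;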
Replication on the slave will break. Its error log shows:
(SQLSTATE 00000) Can't drop schema 'crash'; schema doesn't exist
Failure while executing:
COMMIT
DROP SCHEMA `crash`
UPDATE `sys_replication`.`applier_state` SET `last_applied_commit_id` = 12, `originating_server_uuid` = '9908C6AA-A982-4763-B9BA-4EF5F933D219' , `originating_commit_id` = 12 WHERE `master_id` = 1
But sys_replication.applier_state implies that replication is ok:
drizzle> select * from sys_replication.applier_state\G
*************************** 1. row ***************************
master_id: 1
last_applied_commit_id: 12
originating_server_uuid: 9908C6AA-A982-4763-B9BA-4EF5F933D219
originating_commit_id: 12
status: RUNNING
error_msg:
Worse, it implies that the slave actually applied tx id 12, the one
that caused the error. That tx is in fact still in the queue:
drizzle> select * from sys_replication.queue\G
*************************** 1. row ***************************
trx_id: 925
seg_id: 1
commit_order: 12
originating_server_uuid: 9908C6AA-A982-4763-B9BA-4EF5F933D219
originating_commit_id: 12
msg: transaction_context {
server_id: 1
transaction_id: 925
start_timestamp: 1330211976689868
end_timestamp: 1330211976689874
}
statement {
type: DROP_SCHEMA
start_timestamp: 1330211976689872
end_timestamp: 1330211976689873
drop_schema_statement {
schema_name: "crash"
}
}
segment_id: 1
end_segment: true
master_id: 1
Suggested fix: when the slave encounters an error, update
sys_replication.applier_state accordingly. I have seen
sys_replication.applier_state get updated on an error before, but in
this case it does not happen. Perhaps the slave detects some errors
but not others?
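For illustration, the state I would expect the applier to record when
it hits this error is roughly the following; the 'ERROR' status value
is an assumption, since the output above only ever shows RUNNING:
UPDATE sys_replication.applier_state
SET status = 'ERROR',
    error_msg = 'Can''t drop schema ''crash''; schema doesn''t exist'
WHERE master_id = 1;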
Workaround: delete the offending transactions from
sys_replication.queue and restart the slave.
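Roughly, using the master_id and commit_order values from the queue
output above (a sketch; verify the row before deleting it):
drizzle> DELETE FROM sys_replication.queue WHERE master_id = 1 AND commit_order = 12;
-- then restart the slave so it resumes with the remaining queue entries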
To manage notifications about this bug go to:
https://bugs.launchpad.net/drizzle/+bug/941176/+subscriptions