
linux-traipu team mailing list archive

[Bug 941176] [NEW] slave error doesn't change sys_replication.applier_state

 

Public bug reported:

sys_replication.applier_state is not always correct; it can show RUNNING
even though a slave error has broken replication.

To reproduce:

1. Start a master and create a schema called "crash".

2. Start a slave with max-commit-id set to the tx id of that event on the
master, so the slave does *not* create the crash schema.

3. DROP SCHEMA crash; on the master.

Replication on the slave will break.  Its error log shows:

(SQLSTATE 00000) Can't drop schema 'crash'; schema doesn't exist
Failure while executing:
COMMIT
DROP SCHEMA `crash`
UPDATE `sys_replication`.`applier_state` SET `last_applied_commit_id` = 12, `originating_server_uuid` = '9908C6AA-A982-4763-B9BA-4EF5F933D219' , `originating_commit_id` = 12 WHERE `master_id` = 1

But sys_replication.applier_state implies that replication is ok:

drizzle> select * from sys_replication.applier_state\G
*************************** 1. row ***************************
              master_id: 1
 last_applied_commit_id: 12
originating_server_uuid: 9908C6AA-A982-4763-B9BA-4EF5F933D219
  originating_commit_id: 12
                 status: RUNNING
              error_msg: 

Worse, last_applied_commit_id implies that the slave actually applied
tx id 12, the one that caused the error.  This tx is still in the queue:


drizzle> select * from sys_replication.queue\G
*************************** 1. row ***************************
                 trx_id: 925
                 seg_id: 1
           commit_order: 12
originating_server_uuid: 9908C6AA-A982-4763-B9BA-4EF5F933D219
  originating_commit_id: 12
                    msg: transaction_context {
  server_id: 1
  transaction_id: 925
  start_timestamp: 1330211976689868
  end_timestamp: 1330211976689874
}
statement {
  type: DROP_SCHEMA
  start_timestamp: 1330211976689872
  end_timestamp: 1330211976689873
  drop_schema_statement {
    schema_name: "crash"
  }
}
segment_id: 1
end_segment: true

              master_id: 1
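
The mismatch between the two result sets above can be checked for
mechanically.  The following is only a rough sketch, not part of the slave
plugin: it assumes a transaction is removed from sys_replication.queue once
it has been applied (which this report implies but does not confirm), and
that the rows of both tables have already been fetched into dictionaries.
The function name find_inconsistent_masters is hypothetical.

```python
def find_inconsistent_masters(applier_rows, queue_rows):
    """Return master_ids whose applier_state row says RUNNING while the
    queue still holds a transaction at or below last_applied_commit_id.

    Assumption (not confirmed by the source): applied transactions are
    deleted from the queue, so such a row should never be left behind.
    """
    # Lowest commit_order still queued, per master.
    oldest_queued = {}
    for row in queue_rows:
        master = row["master_id"]
        order = row["commit_order"]
        if master not in oldest_queued or order < oldest_queued[master]:
            oldest_queued[master] = order

    suspect = []
    for row in applier_rows:
        master = row["master_id"]
        if (row["status"] == "RUNNING"
                and master in oldest_queued
                and oldest_queued[master] <= row["last_applied_commit_id"]):
            suspect.append(master)
    return suspect


# Values copied from the two query results in this report.
applier_state = [{"master_id": 1, "last_applied_commit_id": 12,
                  "status": "RUNNING", "error_msg": ""}]
queue = [{"master_id": 1, "trx_id": 925, "commit_order": 12}]

print(find_inconsistent_masters(applier_state, queue))
```

With the values from this report the check flags master_id 1, because
commit_order 12 is still queued even though last_applied_commit_id is
already 12.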

Suggested fix: when the slave encounters an error, update
sys_replication.applier_state.  I have seen
sys_replication.applier_state updated after other errors, but in this
case it is not.  Perhaps the slave detects some errors but not others?

** Affects: drizzle
     Importance: Undecided
         Status: Confirmed


** Tags: replication slave-plugin

-- 
You received this bug notification because you are a member of UBUNTU -
AL - BR, which is subscribed to Drizzle.
https://bugs.launchpad.net/bugs/941176

Title:
  slave error doesn't change sys_replication.applier_state

Status in A Lightweight SQL Database for Cloud Infrastructure and Web Applications:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/drizzle/+bug/941176/+subscriptions

