linux-traipu team mailing list archive

[Bug 867866] Re: multi-master replication test failing - duplicate trx message error/ first master rpl failure.

 

It looks like the remerge of the multi-source replication code went wrong
somewhere: the primary key on the sys_replication.queue table is not
correct because it is missing the master_id column. I'm not sure how this
code was remerged, so there may be other hidden issues like this, but
fixing the PK seems to resolve this particular bug.
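
To illustrate why a missing master_id in the primary key would produce the
"Duplicate entry '772-1'" error, here is a minimal sketch. It uses SQLite
through Python's standard library rather than Drizzle, and the table is a
simplified, hypothetical stand-in for sys_replication.queue (the real
schema is not shown in this report). It also assumes that both masters
happened to produce a transaction with trx_id 772 and seg_id 1, which the
error message suggests but does not prove.

```python
import sqlite3


def try_inserts(pk_columns):
    """Create a simplified queue table with the given primary key, then
    insert the same (trx_id, seg_id) pair arriving from two masters."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queue ("
        " master_id INTEGER NOT NULL,"
        " trx_id INTEGER NOT NULL,"
        " seg_id INTEGER NOT NULL,"
        " msg TEXT,"
        f" PRIMARY KEY ({', '.join(pk_columns)}))"
    )
    # Hypothetical rows: each master independently reaches trx_id 772.
    rows = [
        (1, 772, 1, "from master1"),
        (2, 772, 1, "from master2"),
    ]
    try:
        conn.executemany("INSERT INTO queue VALUES (?, ?, ?, ?)", rows)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"failed: {exc}"
    finally:
        conn.close()


# Primary key without master_id: the second master's row collides.
print(try_inserts(["trx_id", "seg_id"]))

# Primary key including master_id: both rows are accepted.
print(try_inserts(["master_id", "trx_id", "seg_id"]))
```

With the key limited to (trx_id, seg_id), the second insert is rejected as a
duplicate; once master_id is part of the key, rows from different masters
no longer collide even when their transaction ids overlap.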

I think there is also an issue with the timing of the test case, though. I
told Patrick about it and he's got a fix for that.

** Changed in: drizzle/fremont
     Assignee: David Shrewsbury (dshrews) => Patrick Crews (patrick-crews)

-- 
You received this bug notification because you are a member of UBUNTU -
AL - BR, which is subscribed to Drizzle.
https://bugs.launchpad.net/bugs/867866

Title:
  multi-master replication test failing - duplicate trx message error/
  first master rpl failure.

Status in A Lightweight SQL Database for Cloud Infrastructure and Web Applications:
  Confirmed
Status in Drizzle fremont series:
  Confirmed

Bug description:
  So, I have created the infrastructure to set up multi-master topologies for testing.
  The gist of the test is that we spin up 3 servers: 2 masters and one slave replicating from both.
  master1 = we create test.t1 and some records
  master2 = we create test.t2 and some records
  When we run:
  ./dbqp --suite=slave --record multi_master_basic
  the test fails as follows (the slave crashes):
  $ cat workdir/bot0/s2/var/log/s2.err 
  InnoDB: Doublewrite buffer not found: creating new
  InnoDB: Doublewrite buffer created
  InnoDB: 127 rollback segment(s) active.
  InnoDB: Creating foreign key constraint system tables
  InnoDB: Foreign key constraint system tables created
  (SQLSTATE 00000) Duplicate entry '772-1' for key 'PRIMARY'
  Failure while executing:
  INSERT INTO `sys_replication`.`queue` (`master_id`, `trx_id`, `seg_id`, `commit_order`,  `originating_server_uuid`, `originating_commit_id`, `msg`) VALUES (2, 772, 1, 1, 'ac9c8ac0-8f10-474b-9bbd-b61d2cdb2b93' , 1, 'transaction_context {
    server_id: 1
    transaction_id: 772
    start_timestamp: 1317760732106016
    end_timestamp: 1317760732106017
  }
  event {
    type: STARTUP
  }
  segment_id: 1
  end_segment: true
  ')

  
  Replication slave: Unable to insert into queue.
  Replication slave: drizzle_state_read:lost connection to server (EOF)
  Lost connection to master. Reconnecting.
  Replication slave: drizzle_state_connect:could not connect
  111004 16:39:05  InnoDB: Starting shutdown...

  Testing the setup with --start-and-exit shows that we only seem to be replicating from master2, not master1.
  The config file is as follows:
  ignore-errors

  [master1]
  master-host=127.0.0.1
  master-port=9306
  master-user=root
  master-pass=''

  
  [master2]
  master-host=127.0.0.1
  master-port=9312
  master-user=root
  master-pass=''

To manage notifications about this bug go to:
https://bugs.launchpad.net/drizzle/+bug/867866/+subscriptions

