← Back to team overview

maria-developers team mailing list archive

Re: Mdev-10664 Add statuses about optimistic parallel replication stalls.

 

Hi Kristian!

Sorry for late reply , I was on vacations.

On Tue, Jan 10, 2017 at 7:31 PM, Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
wrote:

> Sachin Setiya <sachin.setiya@xxxxxxxxxxx> writes:
>
> >> Right, but you still need _some_ way to synchronise multiple threads
>
> > I used atomic builtins my_atomic_add64_explicit.
>
> That seems fine (but see comment below).
>
> > Documentation:-
> >
> > Slave_DDL_Events:-
> >
> > Description: Number of DDL statements slave have executed in
> > replication channel X.
> > Scope: Per Replication Channel
> > Data Type: numeric
> > Introduced: MariaDB 10.1
> >
> > Slave_Non_Transactional_Events:-
> >
> > Description: Number of Non Transactional "no idea what should be the
> > next word " slave have executed in replication channel X.
> > Scope: Per Replication Channel
> > Data Type: numeric
> > Introduced: MariaDB 10.1
>
> I checked, the term "event group" is used already multiple places in the
> documentation, so I think it can be used here also. An event group can be a
> transaction (possibly with multiple transactional DML statements), or a DDL
> or non-transactional statement.
>
> What you are doing is introducing status variables to help understand
> optimistic (and aggressive) parallel replication modes.
>
> In optimistic parallel replication, a non-transactional event group is not
> safe to run in parallel with other non-transactional events. So that event
> is made to wait for all prior event groups to reach commit state, before
> being allowed to execute. However, other transactional events are not
> delayed by the non-transactional event group.
>
> Slave_Non_Transactional_Events counts the occurences of this.
>
> Then in case of a DDL statement, this can potentially be unsafe to run in
> parallel even with other transactional events. So for a DDL, the DDL has to
> wait for all prior event groups, _and_ subsequent event groups have to wait
> for the DDL.
>
> Slave_DDL_Events counts these occurences.
>
> This is what the documentation, and commit messages, should explain, I
> think.
>
> I think the status variables describe the values for
> @@default_master_connection?, this should also be explained in the
> documentation.
>
> Why did you decide to put this information into a status variable?
> Normally,
> slave status is seen in SHOW SLAVE STATUS, which supports showing status
> for
> a particular connection without using @@default_master_connection.
>
> Sorry for this , I guess It was a confusion I was looking at Slave_Running
this is avaliable in show status as well as
show slave status. So I thought for if I want to show some information in
SHOW slave status this has to be in show status.
And this is wrong. Now we can see non trans events / ddl events in show
slave status. Show I remove this vars from Show Status, or it is okay ?

> > diff --git a/sql/log_event.h b/sql/log_event.h
> > index 90900f6..579607f 100644
> > --- a/sql/log_event.h
> > +++ b/sql/log_event.h
> > @@ -50,6 +50,7 @@
> >  #endif
> >
> >  #include "rpl_gtid.h"
> > +#include "my_atomic.h"
>
> Why include my_atomic.h in the header file, when you only use it in the .cc
> file?
>
Changed.

>
> > +  mysql_mutex_lock(&LOCK_active_mi);
> > +  if (master_info_index)
> > +  {
> > +    mi= master_info_index->
> > +      get_master_info(&thd->variables.default_master_connection,
> > +                    Sql_condition::WARN_LEVEL_NOTE);
> > +    if (mi)
> > +      tmp= mi->total_ddl_events;
>
> > +    if (mi)
> > +      tmp= mi->total_non_trans_events;
>
> Better use an atomic primitive here to read the value. Taking a mutex does
> not help synchronise against my_atomic_add64_explicit().
>
> Changed.

> > +  }
> > +  mysql_mutex_unlock(&LOCK_active_mi);
> > +  if (mi)
> > +    *((longlong *)buff)= tmp;
> > +  else
> > +    var->type= SHOW_UNDEF;
> > +
> > +  return 0;
> > +}
> >  #endif /* HAVE_REPLICATION */
> >
> >  static int show_open_tables(THD *thd, SHOW_VAR *var, char *buff,
>
> > index 9365c06..a474aa5 100644
> > --- a/sql/rpl_mi.h
> > +++ b/sql/rpl_mi.h
> > @@ -303,6 +303,12 @@ class Master_info : public
> Slave_reporting_capability
> >
> >    /* The parallel replication mode. */
> >    enum_slave_parallel_mode parallel_mode;
> > +
> > +  /* No of DDL statements */
> > +  long total_ddl_events;
> > +
> > +  /* No of non-transactional statements*/
> > +  long total_non_trans_events;
>
> "long" is not the appropriate type here. Certainly not since you are using
> my_atomic_add64_explicit(). Use a type that is the same size on all
> platforms (and why use a signed type?).
>
> Changed to uint64.

>  - Kristian.
>


Attached a newer version of patch , please review it.
-- 
Regards
Sachin
diff --git a/mysql-test/include/check-testcase.test b/mysql-test/include/check-testcase.test
index 083f44c..9d8c0ec 100644
--- a/mysql-test/include/check-testcase.test
+++ b/mysql-test/include/check-testcase.test
@@ -67,10 +67,12 @@ if ($tmp)
   --echo Replicate_Do_Domain_Ids	
   --echo Replicate_Ignore_Domain_Ids	
   --echo Parallel_Mode	conservative
+  --echo Slave_DDL_Events	#
+  --echo Slave_Non_Transactional_Events	#
 }
 if (!$tmp) {
   # Note: after WL#5177, fields 13-18 shall not be filtered-out.
-  --replace_column 4 # 5 # 6 # 7 # 8 # 9 # 10 # 13 # 14 # 15 # 16 # 17 # 18 # 22 # 23 # 24 # 25 # 26 # 40 # 41 # 42 # 44 #
+  --replace_column 4 # 5 # 6 # 7 # 8 # 9 # 10 # 13 # 14 # 15 # 16 # 17 # 18 # 22 # 23 # 24 # 25 # 26 # 40 # 41 # 42 # 44 # 48 # 49 #
   query_vertical
   SHOW SLAVE STATUS;
 }
diff --git a/mysql-test/suite/multi_source/multi_parallel.result b/mysql-test/suite/multi_source/multi_parallel.result
new file mode 100644
index 0000000..5cbbaaf
--- /dev/null
+++ b/mysql-test/suite/multi_source/multi_parallel.result
@@ -0,0 +1,130 @@
+set global slave_parallel_threads=10;
+change master 'master1' to
+master_port=MYPORT_1,
+master_host='127.0.0.1',
+master_user='root';
+change master 'master2' to
+master_port=MYPORT_2,
+master_host='127.0.0.1',
+master_user='root';
+start all slaves;
+Warnings:
+Note	1937	SLAVE 'master2' started
+Note	1937	SLAVE 'master1' started
+set default_master_connection = 'master1';
+include/wait_for_slave_to_start.inc
+set default_master_connection = 'master2';
+include/wait_for_slave_to_start.inc
+## Slave status variable
+set default_master_connection = 'master1';
+show status like 'slave_running';
+Variable_name	Value
+Slave_running	ON
+set default_master_connection = 'master2';
+show status like 'slave_running';
+Variable_name	Value
+Slave_running	ON
+#master 1
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+create table t1(a int primary key);
+insert into t1 values(1);
+insert into t1 values(2);
+drop table t1;
+set default_master_connection = 'master1';
+show status like 'Slave_DDL_Events';
+Variable_name	Value
+Slave_ddl_events	20
+show status like 'Slave_Non_Transactional_Events';
+Variable_name	Value
+Slave_non_transactional_events	40
+#master 2
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+create table t2(a int primary key);
+insert into t2 values(1);
+insert into t2 values(2);
+drop table t2;
+set default_master_connection = 'master2';
+show status like 'Slave_DDL_Events';
+Variable_name	Value
+Slave_ddl_events	18
+show status like 'Slave_Non_Transactional_Events';
+Variable_name	Value
+Slave_non_transactional_events	36
+stop all slaves;
+Warnings:
+Note	1938	SLAVE 'master2' stopped
+Note	1938	SLAVE 'master1' stopped
+set default_master_connection = 'master1';
+include/wait_for_slave_to_stop.inc
+set default_master_connection = 'master2';
+include/wait_for_slave_to_stop.inc
+set global slave_parallel_threads=0;
+include/reset_master_slave.inc
+include/reset_master_slave.inc
+include/reset_master_slave.inc
diff --git a/mysql-test/suite/multi_source/multi_parallel.test b/mysql-test/suite/multi_source/multi_parallel.test
new file mode 100644
index 0000000..b4f60e3
--- /dev/null
+++ b/mysql-test/suite/multi_source/multi_parallel.test
@@ -0,0 +1,111 @@
+#multi source parallel replication
+
+--source include/not_embedded.inc
+--let $rpl_server_count= 0
+
+--connect (master1,127.0.0.1,root,,,$SERVER_MYPORT_1)
+--connect (master2,127.0.0.1,root,,,$SERVER_MYPORT_2)
+--connect (slave,127.0.0.1,root,,,$SERVER_MYPORT_3)
+
+#save state
+--let $par_thd= `select @@slave_parallel_threads;`
+
+set global slave_parallel_threads=10;
+
+--replace_result $SERVER_MYPORT_1 MYPORT_1
+eval change master 'master1' to
+master_port=$SERVER_MYPORT_1,
+master_host='127.0.0.1',
+master_user='root';
+
+--replace_result $SERVER_MYPORT_2 MYPORT_2
+eval change master 'master2' to
+master_port=$SERVER_MYPORT_2,
+master_host='127.0.0.1',
+master_user='root';
+
+
+#start all slaves
+
+start all slaves;
+
+set default_master_connection = 'master1';
+--source include/wait_for_slave_to_start.inc
+
+set default_master_connection = 'master2';
+--source include/wait_for_slave_to_start.inc
+
+--echo ## Slave status variable
+
+set default_master_connection = 'master1';
+show status like 'slave_running';
+
+set default_master_connection = 'master2';
+show status like 'slave_running';
+
+
+--echo #master 1
+--connection master1
+
+--let $counter=10
+while ($counter)
+{
+  #DDL statement
+  create table t1(a int primary key);
+
+  #non trans update statement
+  insert into t1 values(1);
+  insert into t1 values(2);
+
+  drop table t1;
+  --dec $counter
+}
+--save_master_pos
+
+--connection slave
+
+--sync_with_master 0,'master1'
+show slave 'master1' status like 'Slave_DDL_Events';
+show slave 'master1' status like 'Slave_Non_Transactional_Events';
+
+--echo #master 2
+--connection master2
+
+--let $counter=9
+while ($counter)
+{
+  #DDL statement
+  create table t2(a int primary key);
+
+  #non trans update statement
+  insert into t2 values(1);
+  insert into t2 values(2);
+
+  drop table t2;
+  --dec $counter
+}
+--save_master_pos
+
+--connection slave
+--sync_with_master 0,'master2'
+show slave 'master2' status like 'Slave_DDL_Events';
+show slave 'master2' status like 'Slave_Non_Transactional_Events';
+
+# Cleanup
+stop all slaves;
+set default_master_connection = 'master1';
+--source include/wait_for_slave_to_stop.inc
+
+set default_master_connection = 'master2';
+--source include/wait_for_slave_to_stop.inc
+
+--eval set global slave_parallel_threads=$par_thd
+
+--source include/reset_master_slave.inc
+--disconnect slave
+--connection master1
+--source include/reset_master_slave.inc
+--disconnect master1
+--connection master2
+--source include/reset_master_slave.inc
+--disconnect master2
diff --git a/sql/log_event.cc b/sql/log_event.cc
index ced2626..038c7a7 100644
--- a/sql/log_event.cc
+++ b/sql/log_event.cc
@@ -51,6 +51,7 @@
 #include "rpl_utility.h"
 #include "rpl_constants.h"
 #include "sql_digest.h"
+#include "my_atomic.h"
 
 #define my_b_write_string(A, B) my_b_write((A), (uchar*)(B), (uint) (sizeof(B) - 1))
 
@@ -6780,6 +6781,13 @@ Gtid_log_event::do_apply_event(rpl_group_info *rgi)
   }
 
   DBUG_ASSERT((bits & OPTION_GTID_BEGIN) == 0);
+
+  Master_info *mi=rgi->rli->mi;
+  if (flags2 & FL_DDL)
+    my_atomic_add64_explicit(&mi->total_ddl_events, 1, MY_MEMORY_ORDER_RELAXED);
+  if (!(flags & FL_TRANSACTIONAL))
+    my_atomic_add64_explicit(&mi->total_non_trans_events, 1, MY_MEMORY_ORDER_RELAXED);
+
   if (flags2 & FL_STANDALONE)
     return 0;
 
diff --git a/sql/mysqld.cc b/sql/mysqld.cc
index adb1b27..00d39fc 100644
--- a/sql/mysqld.cc
+++ b/sql/mysqld.cc
@@ -51,6 +51,7 @@
 #include "sql_manager.h"  // stop_handle_manager, start_handle_manager
 #include "sql_expression_cache.h" // subquery_cache_miss, subquery_cache_hit
 #include "sys_vars_shared.h"
+#include "my_atomic.h"
 
 #include <m_ctype.h>
 #include <my_dir.h>
@@ -7805,7 +7806,59 @@ static int show_heartbeat_period(THD *thd, SHOW_VAR *var, char *buff,
   return 0;
 }
 
+static long show_slave_ddl_events(THD *thd, SHOW_VAR *var, char *buff,
+                                 enum enum_var_type scope)
+{
+  Master_info *mi= NULL;
+  longlong UNINIT_VAR(tmp);
+
+  var->type= SHOW_LONGLONG;
+  var->value= buff;
+
+  mysql_mutex_lock(&LOCK_active_mi);
+  if (master_info_index)
+  {
+    mi= master_info_index->
+      get_master_info(&thd->variables.default_master_connection,
+                    Sql_condition::WARN_LEVEL_NOTE);
+    if (mi)
+      tmp= my_atomic_load64_explicit(&mi->total_ddl_events, MY_MEMORY_ORDER_RELAXED);
+  }
+  mysql_mutex_unlock(&LOCK_active_mi);
+  if (mi)
+    *((longlong *)buff)= tmp;
+  else
+    var->type= SHOW_UNDEF;
+
+  return 0;
+}
 
+static long show_slave_non_trans_events(THD *thd, SHOW_VAR *var, char *buff,
+                                 enum enum_var_type scope)
+{
+  Master_info *mi= NULL;
+  longlong UNINIT_VAR(tmp);
+
+  var->type= SHOW_LONGLONG;
+  var->value= buff;
+
+  mysql_mutex_lock(&LOCK_active_mi);
+
+  {
+    mi= master_info_index->
+      get_master_info(&thd->variables.default_master_connection,
+                    Sql_condition::WARN_LEVEL_NOTE);
+    if (mi)
+      tmp= my_atomic_load64_explicit(&mi->total_non_trans_events, MY_MEMORY_ORDER_RELAXED);
+  }
+  mysql_mutex_unlock(&LOCK_active_mi);
+  if (mi)
+    *((longlong *)buff)= tmp;
+  else
+    var->type= SHOW_UNDEF;
+
+  return 0;
+}
 #endif /* HAVE_REPLICATION */
 
 static int show_open_tables(THD *thd, SHOW_VAR *var, char *buff,
@@ -8473,6 +8526,8 @@ SHOW_VAR status_vars[]= {
   {"Slave_retried_transactions",(char*)&slave_retried_transactions, SHOW_LONG},
   {"Slave_running",            (char*) &show_slave_running,     SHOW_SIMPLE_FUNC},
   {"Slave_skipped_errors",     (char*) &slave_skipped_errors, SHOW_LONGLONG},
+  {"Slave_DDL_Events",     (char*) &show_slave_ddl_events, SHOW_SIMPLE_FUNC},
+  {"Slave_Non_Transactional_Events", (char *) &show_slave_non_trans_events, SHOW_SIMPLE_FUNC},
 #endif
   {"Slow_launch_threads",      (char*) &slow_launch_threads,    SHOW_LONG},
   {"Slow_queries",             (char*) offsetof(STATUS_VAR, long_query_count), SHOW_LONG_STATUS},
diff --git a/sql/rpl_mi.cc b/sql/rpl_mi.cc
index 6048d26..85a5e66 100644
--- a/sql/rpl_mi.cc
+++ b/sql/rpl_mi.cc
@@ -40,7 +40,8 @@ Master_info::Master_info(LEX_STRING *connection_name_arg,
    sync_counter(0), heartbeat_period(0), received_heartbeats(0),
    master_id(0), prev_master_id(0),
    using_gtid(USE_GTID_NO), events_queued_since_last_gtid(0),
-   gtid_reconnect_event_skip_count(0), gtid_event_seen(false)
+   gtid_reconnect_event_skip_count(0), gtid_event_seen(false),
+   total_ddl_events(0), total_non_trans_events(0)
 {
   host[0] = 0; user[0] = 0; password[0] = 0;
   ssl_ca[0]= 0; ssl_capath[0]= 0; ssl_cert[0]= 0;
diff --git a/sql/rpl_mi.h b/sql/rpl_mi.h
index 9365c06..7e28d4d 100644
--- a/sql/rpl_mi.h
+++ b/sql/rpl_mi.h
@@ -303,6 +303,12 @@ class Master_info : public Slave_reporting_capability
 
   /* The parallel replication mode. */
   enum_slave_parallel_mode parallel_mode;
+
+  /* No of DDL statements */
+  uint64 total_ddl_events;
+
+  /* No of non-transactional statements*/
+  uint64 total_non_trans_events;
 };
 
 int init_master_info(Master_info* mi, const char* master_info_fname,
diff --git a/sql/slave.cc b/sql/slave.cc
index 1846938..0feace4 100644
--- a/sql/slave.cc
+++ b/sql/slave.cc
@@ -2823,6 +2823,15 @@ void show_master_info_get_fields(THD *thd, List<Item> *field_list,
                                             gtid_pos_length),
                           mem_root);
   }
+  field_list->push_back(new (mem_root)
+                       Item_return_int(thd, "Slave_DDL_Events", 20,
+                                       MYSQL_TYPE_LONGLONG),
+                       mem_root);
+
+  field_list->push_back(new (mem_root)
+                       Item_return_int(thd, "Slave_Non_Transactional_Events", 20,
+                                       MYSQL_TYPE_LONGLONG),
+                       mem_root);
   DBUG_VOID_RETURN;
 }
 
@@ -3017,6 +3026,11 @@ static bool send_show_master_info_data(THD *thd, Master_info *mi, bool full,
       protocol->store((double)    mi->heartbeat_period, 3, &tmp);
       protocol->store(gtid_pos->ptr(), gtid_pos->length(), &my_charset_bin);
     }
+    uint64 events;
+    events= my_atomic_load64_explicit(&mi->total_ddl_events, MY_MEMORY_ORDER_RELAXED);
+    protocol->store(events);
+    events= my_atomic_load64_explicit(&mi->total_non_trans_events, MY_MEMORY_ORDER_RELAXED);
+    protocol->store(events);
 
     mysql_mutex_unlock(&mi->rli.err_lock);
     mysql_mutex_unlock(&mi->err_lock);

Follow ups

References