← Back to team overview

maria-developers team mailing list archive

Re: multi-source replication, first (untested) version

 

Michael Widenius <monty@xxxxxxxxxxxx> writes:

> I have now got the first version of multi source patch ported to
> MariadB.

As requested, I did a quick read through of the patch. I took the patch from
your 10.0-mdev253 tree, this revision:

    revid:monty@xxxxxxxxxxxx-20120930233044-e0m6u6t9q3jeqx03

Below are my comments. Overall, it was nice to see the relatively modest
changes needed to get this working.

 - Kristian.

-----------------------------------------------------------------------

> === modified file 'sql/sys_vars.cc'
> --- sql/sys_vars.cc	2012-09-22 12:30:24 +0000
> +++ sql/sys_vars.cc	2012-10-01 09:55:23 +0000

> +static Sys_var_multi_source_ulong
> +Sys_slave_skip_counter("sql_slave_skip_counter",
> +                       "Skip the next N events from the master log",
> +                       SESSION_VAR(slave_skip_counter),
> +                       NO_CMD_LINE,
> +                       offsetof(Master_info, rli.slave_skip_counter),

This use of offsetof() on a C++ class (non-POD) is incorrect. Even if it works
for you with your current version of GCC and current compiler options, it is
likely to break at some point on some future compiler. And it will be hell to
track down.

Look, I understand that you do not like some of the ways that the C++ language
is defined, like here that offsetof() is not allowed on non-POD data
structures. But we have to live with it. Surely, it is better to have a
standard language that compiler writers and compiler users agree on how works
- even if that language is less convenient than what we would like.

So let us please change this to be correct C++. I can change it for you if you
want.


> === modified file 'sql/slave.cc'
> --- sql/slave.cc	2012-09-13 12:31:29 +0000
> +++ sql/slave.cc	2012-10-01 09:55:23 +0000

> +bool show_all_master_info(THD* thd)
> +{
> +  uint i, elements;
> +  Master_info **tmp;
> +  DBUG_ENTER("show_master_info");
> +
> +  if (send_show_master_info_header(thd, active_mi, 1))
> +    DBUG_RETURN(TRUE);
> +
> +  if (!(elements= master_info_index->master_info_hash.records))
> +    goto end;
> +
> +  /*
> +    Sort lines to get them into a predicted order
> +    (needed for test cases and to not confuse users)
> +  */
> +  if (!(tmp= (Master_info**) thd->alloc(sizeof(Master_info*) * elements)))
> +    DBUG_RETURN(TRUE);
> +
> +  for (i= 0; i < elements; i++)
> +  {
> +    tmp[i]= (Master_info *) my_hash_element(&master_info_index->
> +                                            master_info_hash, i);
> +  }

Isn't there a race condition here? What happens if the number of elements
changes between looking up number of elements and the loop?

If there is a lock protecting this, please add a safe_mutex_assert_owner()
and/or a comment.

The call of show_all_master_info() is like this:

   mysql_mutex_lock(&LOCK_active_mi);
   if (lex->verbose)
     res= show_all_master_info(thd);

Does LOCK_active_mi protect all changes to multi-master state? If so, I think
it should be renamed - "active_mi" is rather confusing in this case.


> === modified file 'sql/sql_class.h'
> --- sql/sql_class.h	2012-09-22 14:11:40 +0000
> +++ sql/sql_class.h	2012-10-01 09:55:23 +0000

> +  /**
> +     Place holders to store Multi-source variables in sys_var.cc during
> +     update and show of variables.
> +  */
> +  ulong slave_skip_counter;
> +  ulong max_relay_log_size;

> +  /* Names. These will be allocated in buffers in thd */
> +  LEX_STRING default_master_connection;
> +

> +  LEX_STRING connection_name;                   /* If slave */
> +  char       default_master_connection_buff[MAX_CONNECTION_NAME+1];

Is there not some other way to do this?

THD seems to have become some kind of garbage dump for all kinds of
special-purpose data. This is not good, it pollutes the CPU caches for all
threads, and generally makes everything a mess :-/

I would like to see us move towards having less things in THD that are not
relevant to all threads, not more.

Could we add a single void * somewhere in THD for special-purpose threads?
And then replication threads could use that for their special-purpose stuff.
Or maybe sub-class THD.

Then later we could start moving other stuff out of THD that is not relevant
to all threads...


> === modified file 'sql/rpl_rli.h'
> --- sql/rpl_rli.h	2012-08-27 16:13:17 +0000
> +++ sql/rpl_rli.h	2012-10-01 09:55:23 +0000

> +  volatile ulong slave_skip_counter;    /* Must be ulong */

Such comments are extremely unhelpful. Instead explain why it must be ulong.


> === modified file 'sql/item_create.cc'
> --- sql/item_create.cc	2012-01-13 14:50:02 +0000
> +++ sql/item_create.cc	2012-10-01 09:55:23 +0000
> @@ -4393,27 +4393,36 @@ Create_func_master_pos_wait::create_nati

> +  thd->lex->safe_to_cache_query= 0;

>    case 3:

>      thd->lex->safe_to_cache_query= 0;

> +  case 4:

> +    thd->lex->safe_to_cache_query= 0;

Isn't it redundant to set safe_to_cache_query twice here?


> === modified file 'sql/rpl_mi.cc'
> --- sql/rpl_mi.cc	2012-03-27 23:04:46 +0000
> +++ sql/rpl_mi.cc	2012-10-01 09:55:23 +0000

> +  /* Store connection name and lower case connection name */
> +  connection_name.length= cmp_connection_name.length=
> +    connection_name_arg->length;
> +  if ((connection_name.str= (char*) my_malloc(connection_name_arg->length*2+2,
> +                                              MYF(MY_WME))))
> +  {
> +    cmp_connection_name.str= (connection_name.str +
> +                              connection_name_arg->length+1);
> +    strmake(connection_name.str, connection_name_arg->str,
> +            connection_name.length);
> +    memcpy(cmp_connection_name.str, connection_name_arg->str,
> +           connection_name.length+1);
> +    my_casedn_str(system_charset_info, cmp_connection_name.str);
> +  }

Is it really correct to ignore out-of-memory error here, and just do nothing?

Ah, I think this is handled by checking Master_info::error() in other places?
If so, then please add a comment here explaining this.


> +/**
> +   Create a log file with a signed suffix.

I do not know what is meant with the term "signed suffix". This is not
something that people will understand.

Does it mean "Create a log file name by appending a dash "-" followed by a
suffix" ?

Also rename the file create_signed_file_name() - again, the "signed" is
confusing. For example "create_augmented_file_name", or
"create_name_with_dash_and_suffix" or something.


> +   It's valid if it's a valid system name, is less than
> +   MAX_CONNECTION_NAME.

"It is valid if it is a valid system name of length less than
MAX_CONNECTION_NAME".


> === modified file 'client/mysqltest.cc'
> --- client/mysqltest.cc	2012-08-31 21:54:54 +0000
> +++ client/mysqltest.cc	2012-10-01 09:55:23 +0000

> +  if (buff)
> +    my_free(start);

Better use my_free(buff) here.


> === modified file 'sql/rpl_mi.h'
> --- sql/rpl_mi.h	2012-03-27 23:04:46 +0000
> +++ sql/rpl_mi.h	2012-10-01 09:55:23 +0000

> +  char master_log_name[FN_REFLEN+6]; /* Place for multi-*/

"place for" -> "room for".


> === modified file 'sql/mysqld.cc'
> --- sql/mysqld.cc	2012-09-22 14:11:40 +0000
> +++ sql/mysqld.cc	2012-10-01 09:55:23 +0000

> +  /*
> +    We must have LOCK_open before LOCK_global_system_variables because
> +    LOCK_open is hold while sql_plugin.c::intern_sys_var_ptr() is called.

"is hold" -> "is held"


> === modified file 'sql/item_create.cc'
> --- sql/item_create.cc	2012-01-13 14:50:02 +0000
> +++ sql/item_create.cc	2012-10-01 09:55:23 +0000
> @@ -4393,27 +4393,36 @@ Create_func_master_pos_wait::create_nati
>    if (item_list != NULL)
>      arg_count= item_list->elements;
>  
> +  if (arg_count < 2 || arg_count >4)

Missing space before "4"


Follow ups