← Back to team overview

maria-developers team mailing list archive

Re: Incorrect format description event skipping in queue_event()


On Fri, Aug 30, 2013 at 1:44 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
> Pavel Ivanov <pivanof@xxxxxxxxxx> writes:
>> The attached patch doesn't work properly because the format
>> description event it saves is different from what master has sent. It
>> has different number_of_event_types and potentially it has garbage (or
>> incomplete data) in post_header_len.
> Aha, so it's because of this, in Format_description_log_event::write() ?
>   size_t rec_size= sizeof(buff);
>   memcpy((char*) buff+ST_COMMON_HEADER_LEN_OFFSET + 1, (uchar*) post_header_len,
>          LOG_EVENT_TYPES);
>   ret= (write_header(file, rec_size) ||
>         wrapper_my_b_safe_write(file, buff, rec_size) ||
> So it uses the static size of own server version, not the one of the event
> that was read. And we get the effect you describe.
> So then it seems the fix is to modify the created and/or artificial (log_pos)
> fields manually in the buffer containing the master's description event, fix
> the checksum, and then write it out normally (not skip)? Or do you have a
> better suggestion?

Why not change Format_description_log_event::write()? It seems
actually that it will be more correct if we change it. I made the
following change (line numbers may be off) and it worked:

@@ -4922,16 +4947,15 @@ bool Format_description_log_event::write(IO_CACHE* file)
     We don't call Start_log_event_v3::write() because this would make 2
-  size_t rec_size= sizeof(buff);
+  uchar buff[START_V3_HEADER_LEN+1];
+  size_t rec_size= sizeof(buff) + BINLOG_CHECKSUM_ALG_DESC_LEN +
+                   number_of_event_types;
   int2store(buff + ST_BINLOG_VER_OFFSET,binlog_version);
   memcpy((char*) buff + ST_SERVER_VER_OFFSET,server_version,ST_SERVER_VER_LEN);
   if (!dont_set_created)
     created= get_time();
   int4store(buff + ST_CREATED_OFFSET,created);
   buff[ST_COMMON_HEADER_LEN_OFFSET]= common_header_len;
-  memcpy((char*) buff+ST_COMMON_HEADER_LEN_OFFSET + 1, (uchar*)
-         LOG_EVENT_TYPES);
     if checksum is requested
     record the checksum-algorithm descriptor next to
@@ -4944,7 +4968,7 @@ bool Format_description_log_event::write(IO_CACHE* file)
 #ifndef DBUG_OFF
   data_written= 0; // to prepare for need_checksum assert
-  buff[FORMAT_DESCRIPTION_HEADER_LEN]= need_checksum() ?
+  uchar checksum_byte= need_checksum() ?
     checksum_alg : (uint8) BINLOG_CHECKSUM_ALG_OFF;
      FD of checksum-aware server is always checksum-equipped, (V) is in,
@@ -4964,7 +4988,9 @@ bool Format_description_log_event::write(IO_CACHE* file)
     checksum_alg= BINLOG_CHECKSUM_ALG_CRC32;  // Forcing (V) room to
fill anyway
   ret= (write_header(file, rec_size) ||
-        wrapper_my_b_safe_write(file, buff, rec_size) ||
+        wrapper_my_b_safe_write(file, buff, sizeof(buff)) ||
+        wrapper_my_b_safe_write(file, (uchar*) post_header_len,
number_of_event_types) ||
+        wrapper_my_b_safe_write(file, &checksum_byte, sizeof(checksum_byte)) ||
   if (no_checksum)
     checksum_alg= BINLOG_CHECKSUM_ALG_OFF;

> What I do not understand is why we do not have the exact same bug when the
> slave rotates the relay log due to max_relay_log_size? It seems to me it does
> the exact same thing? It writes out description_event_for_queue which can have
> different number_of_event_types.

Probably I just didn't test this properly yet. Looking at the code,
yes it shouldn't work, but the write() fix above should fix both

>> quite well at the moment to do that. So apparently all peek*()
>> functions should have Format_description added as parameter and it's
>> easy to pass one from queue_event(), but then e.g. mysql_binlog_send()
>> will need description event too to pass it to send_event_to_slave()
>> and is_until_reached() so that those could pass it to peek*(). For
>> correctness this format description should be read from binlog, right?
>> That would be pretty non-trivial to do.
> Isn't it just a matter of reading in each format_description_log_event as it
> is seen and holding on to it in some description_event_for_binlog_send? Should
> be easy, right?

Well, mysql_binlog_send() doesn't start to read the binlog file from
the beginning. And if slave connects not using GTID but by binlog
position no one reads the beginning of the binlog file before
mysql_binlog_send() too. So that means that we have to add scanning of
binlogs from the beginning inside mysql_binlog_send(). Of course, it
will have an additional benefit of issuing sane error message if the
binlog position slave requests is in the middle of event. But it's
still a non-trivial change...

> If you are interested in helping with testing this, I can write a patch later,
> but I need a way to test it without having to spend unreasonable amounts of
> time on it.

Sure, I'll be happy to test.


Follow ups