← Back to team overview

maria-developers team mailing list archive

Re: 9b453c1ebf4: MDEV-10963 Fragmented BINLOG query

 

Hi, Andrei!

On Sep 07, andrei.elkin@xxxxxxxxxx wrote:
> revision-id: 9b453c1ebf47a20ea1436de3284f8810d0e64c4c (mariadb-10.1.34-8-g9b453c1ebf4)
> parent(s): c09a8b5b36edb494e2bcc93074c06e26cd9f2b92
> author: Andrei Elkin
> committer: Andrei Elkin
> timestamp: 2018-09-07 20:36:16 +0300
> message:
> 
> MDEV-10963 Fragmented BINLOG query
> 
> The problem was originally stated in
>   http://bugs.mysql.com/bug.php?id=82212
> The size of an base64-encoded Rows_log_event exceeds its
> vanilla byte representation in 4/3 times.
> When a binlogged event size is about 1GB mysqlbinlog generates
> a BINLOG query that can't be send out due to its size.
> 
> It is fixed with fragmenting the BINLOG argument C-string into
> (approximate) halves when the base64 encoded event is over 1GB size.
> The mysqlbinlog in such case puts out
> 
>     SET @binlog_fragment_0='base64-encoded-fragment_0';
>     SET @binlog_fragment_1='base64-encoded-fragment_1';
>     BINLOG 2, 'binlog_fragment';

No, that's totally ridiculous. A statement that internally walks the
list of user variables and collects values from variables named
according to a specific pattern, seriously?

Please, make it

  BINLOG CONCAT(@binlog_fragment_0, @binlog_fragment_1)

that'll work with no questions asked, everybody understands what it
means. The parser doesn't need to accept an arbitrary expression here,
it'd be simpler and safer to hard-code the syntax as above.

A couple of minor generic comments below, but I've stopped reviewing,
because the code will change anyway.

> to represent a big BINLOG 'base64-encoded-"total"'.
> Two more statements are composed to promptly release memory
>     SET @binlog_fragment_0=NULL;
>     SET @binlog_fragment_1=NULL;
> 
> diff --git a/client/mysqlbinlog.cc b/client/mysqlbinlog.cc
> index 9753125dd67..f0689631cfb 100644
> --- a/client/mysqlbinlog.cc
> +++ b/client/mysqlbinlog.cc
> @@ -56,7 +56,13 @@ Rpl_filter *binlog_filter= 0;
>  
>  #define BIN_LOG_HEADER_SIZE	4
>  #define PROBE_HEADER_LEN	(EVENT_LEN_OFFSET+4)
> -
> +/*
> +  2 fragments can always represent near 1GB row-based base64-encoded event as
> +  two strings each of size less than max(max_allowed_packet).
> +  Bigger number of fragments does not safe from potential need to tune (increase)
> +  @@max_allowed_packet before to process the fragments. So 2 is safe and enough.
> +*/
> +#define BINLOG_ROWS_EVENT_ENCODED_FRAGMENTS 2
>  
>  #define CLIENT_CAPABILITIES	(CLIENT_LONG_PASSWORD | CLIENT_LONG_FLAG | CLIENT_LOCAL_FILES)
>  
> @@ -71,6 +77,9 @@ ulong bytes_sent = 0L, bytes_received = 0L;
>  ulong mysqld_net_retry_count = 10L;
>  ulong open_files_limit;
>  ulong opt_binlog_rows_event_max_size;
> +#ifndef DBUG_OFF
> +ulong opt_binlog_rows_event_max_encoded_size;
> +#endif

Try to avoid unnecessary ifdefs. Here you could've defined the variable
unconditionally and suppressed the warning with __attribute__((unused)),
of you could've used it unconditionally as a splitting threshold below
and only hidden the command line option.

>  uint test_flags = 0; 
>  static uint opt_protocol= 0;
>  static FILE *result_file;
> @@ -813,7 +822,14 @@ write_event_header_and_base64(Log_event *ev, FILE *result_file,
>  
>    /* Write header and base64 output to cache */
>    ev->print_header(head, print_event_info, FALSE);
> -  ev->print_base64(body, print_event_info, FALSE);
> +
> +  /* the assert states the only current use case for the function */
> +  DBUG_ASSERT(print_event_info->base64_output_mode ==
> +              BASE64_OUTPUT_ALWAYS);

1. don't document the obvious
2. don't wrap the line if it fits in 80-char line

> +
> +  ev->print_base64(body, print_event_info,
> +                   print_event_info->base64_output_mode !=
> +                   BASE64_OUTPUT_DECODE_ROWS);
>  
>    /* Read data from cache and write to result file */
>    if (copy_event_cache_to_file_and_reinit(head, result_file) ||
> @@ -1472,6 +1490,15 @@ that may lead to an endless loop.",
>     "This value must be a multiple of 256.",
>     &opt_binlog_rows_event_max_size, &opt_binlog_rows_event_max_size, 0,
>     GET_ULONG, REQUIRED_ARG, UINT_MAX,  256, ULONG_MAX,  0, 256,  0},
> +#ifndef DBUG_OFF
> +  {"binlog-row-event-max-encoded-size", 0,

all debug command-line options must start with "debug-", that is,
either rename it to "debug-binlog-row-event-max-encoded-size" or
make it non-debug.

> +   "The maximum size of base64-encoded rows-event in one BINLOG pseudo-query "
> +   "instance. When the computed actual size exceeds the limit "
> +   "the BINLOG's argument string is fragmented in two.",
> +   &opt_binlog_rows_event_max_encoded_size,
> +   &opt_binlog_rows_event_max_encoded_size, 0,
> +   GET_ULONG, REQUIRED_ARG, UINT_MAX/4,  256, ULONG_MAX,  0, 256,  0},
> +#endif
>    {"verify-binlog-checksum", 'c', "Verify checksum binlog events.",
>     (uchar**) &opt_verify_binlog_checksum, (uchar**) &opt_verify_binlog_checksum,
>     0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0},

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx


Follow ups