maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #11470
Re: Correct parsing of BINLOG 'XXX' with comment inside
Alexander, Sachin, hello.
> Hi Andrei,
>
> We need your help on:
>
> MDEV-10362 mysqlbinlog verbose output cannot be decoded
>
>
> The problem is that "mysqlbinlog -vvv" writes
> comments right inside the BINLOG statement,
> between base64 chunks:
>
>
> BINLOG '
> kxiEVxMBAAAALQAAAMUBAAAAABMAAAAAAAEABHRlc3QAAnQxAAMIDwMCIAAG
> ...
> AAAAAAAAAQAAAPo/AAAAAAAAAAAAAAD6QAAAAAAAAAABAAAA+kAAAAAAAAAAAAAAAA==
> ### UPDATE `test`.`t1`
> ### WHERE
> ### @1=1 /* LONGINT meta=0 nullable=0 is_null=0 */
> ### @2=NULL /* LONGINT meta=32 nullable=1 is_null=1 */
> ### @3=1 /* INT meta=0 nullable=1 is_null=0 */
> AAAAAAD63QUAAAAAAAABAAAA+t0FAAAAAAAAAAAAAPreBQAAAAAAAAEAAAD63gUAAAAAAAAAAAAA
> '/*!*/;
Right, the comments get embedded into base64 stream for no reason.
>
>
> So when later we try to load this output to "mysql", it fails,
> because when mysql_client_binlog_statement() calls base64_decode(),
> the latter fails as it does not expect any comments.
>
> Sachin proposes to add a new flag to base64_decode(), to make
> skip comments when this flags is on.
>
> I'm not sure that we should do this.
>
> Would it be possible to fix mysqlbinlog instead, to print:
> - a BINLOG statement with only base64 data, without any comments inside
> - followed by comments, fully outside of the BINLOG statement.
> ?
A sort of what you both are suggesting, let's add up one more cache,
pass it to print_base64(body, vcom, print_event_info,...) [ actually,
do we really need the cache args as long as they represent
`print_event_info' members? - to check out closely]
void Rows_log_event::print_helper(FILE *file,
PRINT_EVENT_INFO *print_event_info,
char const *const name)
{
IO_CACHE *const head= &print_event_info->head_cache;
IO_CACHE *const body= &print_event_info->body_cache;
+ IO_CACHE ... vcom= ... // vcom after Verbose COMments
bool do_print_encoded=
print_event_info->base64_output_mode != BASE64_OUTPUT_DECODE_ROWS &&
!print_event_info->short_form;
if (!print_event_info->short_form)
{
bool const last_stmt_event= get_flags(STMT_END_F);
print_header(head, print_event_info, !last_stmt_event);
my_b_printf(head, "\t%s: table id %lu%s\n",
name, m_table_id,
last_stmt_event ? " flags: STMT_END_F" : "");
- print_base64(body, print_event_info, do_print_encoded);
+ print_base64(body, vcom, print_event_info, do_print_encoded);
so inside
void Log_event::print_base64(IO_CACHE* file,
+ IO_CACHE* vcom,
PRINT_EVENT_INFO* print_event_info,
bool do_print_encoded)
...
if (print_event_info->verbose)
{
...
if (ev)
{
- ev->print_verbose(file, print_event_info);
+ ev->print_verbose(vcom, print_event_info);
Finally, empty up the vcom cache in a correct order.
Should it be before BINLOG '...' I think, right?
if (get_flags(STMT_END_F))
{
...
copy_event_cache_to_file_and_reinit(head, file);
+ copy_event_cache_to_file_and_reinit(vcom, file);
copy_event_cache_to_file_and_reinit(body, file);
Sachin, regardless whether you like this elaboration,
you got to check out bb-10.1-MDEV-10963 or the latest
patch in the commit list to find some refactoring in this area.
There's a chance it would be pushed before the current task's one.
Cheers,
Andrei
>
> Or even, should not we split such BINLOG statement with multiple
> base64 chunks into individual BINLOG statements?
>
>
> Thanks.
>
>
> On 10/16/2018 04:03 PM, Sachin Setiya wrote:
>> Hi Bar!
>> On Tue, Oct 16, 2018 at 7:33 AM Alexander Barkov <bar@xxxxxxxxxxx> wrote:
>>>
>>> Hi Sachin,
>>>
>>>
>>> On 10/16/2018 03:49 AM, Sachin Setiya wrote:
>>>> Hi Bar!
>>>> On Tue, Sep 11, 2018 at 1:47 PM Alexander Barkov <bar@xxxxxxxxxxx> wrote:
>>>>>
>>>>> Hi Sachin,
>>>>>
>>>>>
>>>>> On 09/10/2018 03:47 PM, Sachin Setia wrote:
>>>>>> Hi Bar,
>>>>>>
>>>>>> Currently if we have generated the sql file using mysqlbinlog -v and
>>>>>> if we have big statement binlog
>>>>>> , then mysqlbinlog add comment in between of Binlog 'XXX' statement ,
>>>>>> but unfortunately base64_decode
>>>>>> does not understands it, and it through error since # is not a base64 char
>>>>>>
>>>>>> This patches solves this.
>>>>>
>>>>> Note, base64_decode() is used in at least two places:
>>>>> - for the binary log
>>>>> - for the SQL function FROM_BASE64()
>>>>>
>>>>>
>>>>> I don't like that your patch changes the behavior of the SQL function
>>>>> FROM_BASE64(). It should not recognize any '#' inside the base64 data
>>>>> as comments.
>>>> Right , So I have creates a flag MY_BASE64_DECODE_ALLOW_COMMENTS it is
>>>> kind of similar to as MY_BASE64_DECODE_ALLOW_MULTIPLE_CHUNKS flag is used
>>>> So no impact on FROM_BASE64
>>>>
>>>>>
>>>>> So perhaps this should be fixed in some other place, not in base64_decode().
>>>>>
>>>>>
>>>>> I checked the output of these commands:
>>>>>
>>>>> ./bin/mysqlbinlog ./data/retsina-bin.000003
>>>>> ./bin/mysqlbinlog -vvv ./data/retsina-bin.000003
>>>>>
>>>>> It looks suspicious for me.
>>>>>
>>>>> Can you please remind why mysqlbinlog prints multiple base64 chunks
>>>>> inside the same BINLOG statement?
>>>> It is beacuse if we have long row we can have more then one
>>>> ROWS_LOG_EVENT after TABLE_MAP_EVENT
>>>
>>> Do you know where in the code the server decides to end
>>> the current chunk and start a new one?
>> Here In binlog_prepare_pending_rows_event
>> if (!pending ||
>> pending->server_id != serv_id ||
>> pending->get_table_id() != table->s->table_map_id ||
>> pending->get_general_type_code() != general_type_code ||
>> pending->get_data_size() + needed > opt_binlog_rows_event_max_size ||
>> ^^^ this line
>> pending->get_width() != colcnt ||
>> !bitmap_cmp(pending->get_cols(), cols))
>> {
>>
>>>
>>> Is there some limit (in bytes, or in number or records)?
>>>
>>>>>
>>>>>
>>>>> From my understanding, it is "mysqlbinlog -vvv" who should be fixed
>>>>> to print good base64 data inside the string literal that follows the
>>>>> BINLOG keyword.
>>>>>
>>>>> At least the additional comments printed by -vvv should be outside of
>>>>> the BINLOG statement (presumably before), not inside.
>>>>>
>>>> Idk , this can be lot of work , plus it will slow mysqlbinlog also ,
>>>> And I think with new solution FROM_BASE64 behavior is not changed
>>>> I have added test for FROM_BASE64
>>>> #No change in BASE64_FROM behaviour
>>>> SELECT FROM_BASE64('TWFy
>>>> #Comment 344
>>>> #
>>>> aWE=') AS 'Output';
>>>> this will generate Maria in earlier patch which is wrong , with new
>>>> patch it gives NULL
>>>
>>> If we really go this way, I suggest to cover all these scenarios too:
>>>
>>> SELECT FROM_BASE64('TWFyaWE=# comment1');
>>> SELECT FROM_BASE64('TWFyaWE=\n# comment1');
>>> SELECT FROM_BASE64('TWFyaWE=\n# comment1\nTWFyaWE=');
>>> SELECT FROM_BASE64('TWFyaWE=\n# comment1\nTWFyaWE=\n# comment2\n');
>>>
>>> They all should give NULL.
>> They all are giving null
>>>
>>>>> Thanks.
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~maria-developers
>>>>> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>>>>> Unsubscribe : https://launchpad.net/~maria-developers
>>>>> More help : https://help.launchpad.net/ListHelp
>>>>
>>>>
>>>>
>>
>>
>>
Follow ups
References