← Back to team overview

maria-developers team mailing list archive

Re: Correct parsing of BINLOG 'XXX' with comment inside

 

Hi Andrei,

We need your help on:

MDEV-10362 mysqlbinlog verbose output cannot be decoded


The problem is that "mysqlbinlog -vvv" writes
comments right inside the BINLOG statement,
between base64 chunks:


BINLOG '
kxiEVxMBAAAALQAAAMUBAAAAABMAAAAAAAEABHRlc3QAAnQxAAMIDwMCIAAG
...
AAAAAAAAAQAAAPo/AAAAAAAAAAAAAAD6QAAAAAAAAAABAAAA+kAAAAAAAAAAAAAAAA==
### UPDATE `test`.`t1`
### WHERE
###   @1=1 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=NULL /* LONGINT meta=32 nullable=1 is_null=1 */
###   @3=1 /* INT meta=0 nullable=1 is_null=0 */
AAAAAAD63QUAAAAAAAABAAAA+t0FAAAAAAAAAAAAAPreBQAAAAAAAAEAAAD63gUAAAAAAAAAAAAA
'/*!*/;


So when later we try to load this output to "mysql", it fails,
because when mysql_client_binlog_statement() calls base64_decode(),
the latter fails as it does not expect any comments.

Sachin proposes to add a new flag to base64_decode(), to make
skip comments when this flags is on.

I'm not sure that we should do this.

Would it be possible to fix mysqlbinlog instead, to print:
- a BINLOG statement with only base64 data, without any comments inside
- followed by comments, fully outside of the BINLOG statement.
?

Or even, should not we split such BINLOG statement with multiple
base64 chunks into individual BINLOG statements?


Thanks.


On 10/16/2018 04:03 PM, Sachin Setiya wrote:
> Hi Bar!
> On Tue, Oct 16, 2018 at 7:33 AM Alexander Barkov <bar@xxxxxxxxxxx> wrote:
>>
>> Hi Sachin,
>>
>>
>> On 10/16/2018 03:49 AM, Sachin Setiya wrote:
>>> Hi Bar!
>>> On Tue, Sep 11, 2018 at 1:47 PM Alexander Barkov <bar@xxxxxxxxxxx> wrote:
>>>>
>>>> Hi Sachin,
>>>>
>>>>
>>>> On 09/10/2018 03:47 PM, Sachin Setia wrote:
>>>>> Hi Bar,
>>>>>
>>>>> Currently if we have generated the sql file using mysqlbinlog -v and
>>>>> if we have big statement binlog
>>>>> , then mysqlbinlog add comment in between of Binlog 'XXX' statement ,
>>>>> but unfortunately base64_decode
>>>>> does not understands it, and it through error since # is not a base64 char
>>>>>
>>>>> This patches solves this.
>>>>
>>>> Note, base64_decode() is used in at least two places:
>>>> - for the binary log
>>>> - for the SQL function FROM_BASE64()
>>>>
>>>>
>>>> I don't like that your patch changes the behavior of the SQL function
>>>> FROM_BASE64(). It should not recognize any '#' inside the base64 data
>>>> as comments.
>>> Right , So I have creates a flag MY_BASE64_DECODE_ALLOW_COMMENTS it is
>>> kind of similar to as MY_BASE64_DECODE_ALLOW_MULTIPLE_CHUNKS flag is used
>>> So no impact on FROM_BASE64
>>>
>>>>
>>>> So perhaps this should be fixed in some other place, not in base64_decode().
>>>>
>>>>
>>>> I checked the output of these commands:
>>>>
>>>> ./bin/mysqlbinlog      ./data/retsina-bin.000003
>>>> ./bin/mysqlbinlog -vvv ./data/retsina-bin.000003
>>>>
>>>> It looks suspicious for me.
>>>>
>>>> Can you please remind why mysqlbinlog prints multiple base64 chunks
>>>> inside the same BINLOG statement?
>>> It is beacuse if we have long row we can have more then one
>>> ROWS_LOG_EVENT after TABLE_MAP_EVENT
>>
>> Do you know where in the code the server decides to end
>> the current chunk and start a new one?
> Here In binlog_prepare_pending_rows_event
>   if (!pending ||
>       pending->server_id != serv_id ||
>       pending->get_table_id() != table->s->table_map_id ||
>       pending->get_general_type_code() != general_type_code ||
>       pending->get_data_size() + needed > opt_binlog_rows_event_max_size ||
>       ^^^ this line
>       pending->get_width() != colcnt ||
>       !bitmap_cmp(pending->get_cols(), cols))
>   {
> 
>>
>> Is there some limit (in bytes, or in number or records)?
>>
>>>>
>>>>
>>>> From my understanding, it is "mysqlbinlog -vvv" who should be fixed
>>>> to print good base64 data inside the string literal that follows the
>>>> BINLOG keyword.
>>>>
>>>> At least  the additional comments printed by -vvv should be outside of
>>>> the BINLOG statement (presumably before), not inside.
>>>>
>>> Idk , this can be lot of work , plus it will slow mysqlbinlog also ,
>>> And I think with new solution FROM_BASE64 behavior is not changed
>>> I have added test for FROM_BASE64
>>> #No change in BASE64_FROM behaviour
>>> SELECT FROM_BASE64('TWFy
>>> #Comment 344
>>> #
>>> aWE=') AS 'Output';
>>> this will generate Maria in earlier patch which is wrong , with new
>>> patch it gives NULL
>>
>> If we really go this way, I suggest to cover all these scenarios too:
>>
>> SELECT FROM_BASE64('TWFyaWE=# comment1');
>> SELECT FROM_BASE64('TWFyaWE=\n# comment1');
>> SELECT FROM_BASE64('TWFyaWE=\n# comment1\nTWFyaWE=');
>> SELECT FROM_BASE64('TWFyaWE=\n# comment1\nTWFyaWE=\n# comment2\n');
>>
>> They all should give NULL.
> They all are giving null
>>
>>>> Thanks.
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~maria-developers
>>>> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~maria-developers
>>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>>>
> 
> 
> 


Follow ups

References