maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #06307
Re: Review of base64.diff and german2.diff
Guys,
i think it's important to explain that we use RFC 2045 maybe at DOCS and
maybe at SOURCE too
just to make things more clear, php users and others users that only know
about base64_encode/decode, don't know about many RFCs for each case...
2013/9/23 Roberto Spadim <roberto@xxxxxxxxxxxxx>
> Just one information for everybody it's a veeeeery verrrrry old bug at
> mysql reported by me :)
>
> http://bugs.mysql.com/bug.php?id=18861
> Add base64_encode, base64_decode functions
> 6 Apr 2006 18:00
>
> thanks mariadb team! =) 7 years old bug resolved =)
>
>
>
> 2013/9/23 Roberto Spadim <roberto@xxxxxxxxxxxxx>
>
>> thanks alexander! :D i will read and test it as soon as possible :)
>>
>>
>> 2013/9/23 Alexander Barkov <bar@xxxxxxxxxxx>
>>
>>> Hi Roberto,
>>>
>>> the patch has been pushed to the 10.0 tree:
>>>
>>> lp:~maria-captains/maria/10.0
>>>
>>>
>>> See here for more details about the 10.0 tree.
>>> https://launchpad.net/maria/**10.0 <https://launchpad.net/maria/10.0>
>>>
>>>
>>> On 09/17/2013 07:42 PM, Roberto Spadim wrote:
>>>
>>>> nice =)
>>>> after this patch commited, could report where i could get it? i will try
>>>> to rewrite to base32 if possible :) and send at jira
>>>> thanks!
>>>>
>>>>
>>>> 2013/9/17 Alexander Barkov <bar@xxxxxxxxxxx <mailto:bar@xxxxxxxxxxx>>
>>>>
>>>>
>>>> Hello Roberto,
>>>>
>>>>
>>>>
>>>> On 09/17/2013 06:38 PM, Roberto Spadim wrote:
>>>>
>>>> hi guys!
>>>> one question, as a mariadb user...
>>>> base64 will be exposed to sql layer? could i use TO_BASE64() and
>>>> FROM_BASE64() at mysql client?
>>>> there's a MDEV in JIRA about it, if you could attach it to this
>>>> patch
>>>> could be nice: MDEV-4387
>>>> <https://mariadb.atlassian.__**net/browse/MDEV-4387
>>>>
>>>> <https://mariadb.atlassian.**net/browse/MDEV-4387<https://mariadb.atlassian.net/browse/MDEV-4387>
>>>> >>
>>>>
>>>>
>>>> and with time, implement base32 and base16: MDEV-4800
>>>> <https://mariadb.atlassian.__**net/browse/MDEV-4800
>>>>
>>>> <https://mariadb.atlassian.**net/browse/MDEV-4800<https://mariadb.atlassian.net/browse/MDEV-4800>
>>>> >>
>>>>
>>>>
>>>> base64 is very usefull, and many php users use it to
>>>> "send/receive"
>>>> files from database
>>>> base32 from what i know is usefull only for OTP and other
>>>> password apps
>>>> base16 is a hexadecimal function HEX() and UNHEX()
>>>>
>>>>
>>>> Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level,
>>>> so one case use for example things like this:
>>>>
>>>> INSERT INTO t1 VALUES (FROM_BASE64('base64-string'))**__;
>>>>
>>>>
>>>> or
>>>>
>>>> SELECT TO_BASE64(column) FROM t1;
>>>>
>>>>
>>>> There are no plans to implement base32.
>>>>
>>>>
>>>> thanks guys
>>>>
>>>>
>>>> 2013/9/17 Alexander Barkov <bar@xxxxxxxxxxx
>>>> <mailto:bar@xxxxxxxxxxx> <mailto:bar@xxxxxxxxxxx
>>>>
>>>> <mailto:bar@xxxxxxxxxxx>>>
>>>>
>>>>
>>>> Hi Monty,
>>>>
>>>> thanks for review.
>>>>
>>>> I have addressed most of your suggestions. See the new
>>>> version attached,
>>>> and the detailed comments inline:
>>>>
>>>>
>>>>
>>>> On 09/12/2013 04:32 PM, Michael Widenius wrote:
>>>>
>>>>
>>>> Hi!
>>>>
>>>> Here is the review for the code that we should put into
>>>> 10.0
>>>>
>>>> First the base64:
>>>>
>>>> === modified file 'mysql-test/t/func_str.test'
>>>> --- mysql-test/t/func_str.test 2013-05-07 11:05:09
>>>> +0000
>>>> +++ mysql-test/t/func_str.test 2013-08-28 13:14:24
>>>> +0000
>>>> @@ -1555,3 +1555,118 @@ drop table t1,t2;
>>>> --echo # End of 5.5 tests
>>>> --echo #
>>>>
>>>> +
>>>> +--echo #
>>>> +--echo # Start of 5.6 tests
>>>> +--echo #
>>>>
>>>>
>>>> Shouldn't this be start of 10.0 tests ?
>>>> (I know that this code is from MySQL 5.6, but still for
>>>> us this
>>>> is 10.0...)
>>>>
>>>>
>>>> This code is (almost) copy-and-paste from MySQL-5.6.
>>>> I think it's a good idea when looking inside a test file
>>>> to be able to see which tests are coming from MySQL-5.6,
>>>> and which tests are coming from MariaDB-10.0.
>>>>
>>>> I'd suggest to keep "Start of 5.6 tests" in this particular
>>>> case,
>>>> and also when merging the tests for the other merged
>>>> MySQL-5.6 features.
>>>>
>>>>
>>>>
>>>>
>>>> === modified file 'mysys/base64.c'
>>>> --- mysys/base64.c 2011-06-30 15:46:53 +0000
>>>> +++ mysys/base64.c 2013-03-09 06:22:59 +0000
>>>> @@ -1,5 +1,4 @@
>>>> -/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun
>>>> Microsystems,
>>>> Inc.
>>>> - Use is subject to license terms.
>>>> +/* Copyright (c) 2003, 2010, Oracle and/or its
>>>> affiliates.
>>>> All rights reserved.
>>>>
>>>>
>>>> Removed 'all rights reserved'. You can't have that for
>>>> GPL code.
>>>>
>>>> If there is any new code from us, please add a
>>>> copyright message
>>>> for the
>>>> MariaDB foundation too!
>>>>
>>>>
>>>> Removed 'all rights reserved' and added "MariaDB", as there
>>>> are some
>>>> our own changes.
>>>>
>>>>
>>>>
>>>>
>>>> This program is free software; you can
>>>> redistribute it
>>>> and/or modify
>>>> it under the terms of the GNU General Public
>>>> License
>>>> as published by
>>>> @@ -25,6 +24,28 @@ static char base64_table[] =
>>>> "ABCDEFGHIJ
>>>>
>>>> "abcdefghijklmnopqrstuvwxyz"
>>>> "0123456789+/";
>>>>
>>>> +/**
>>>> + * Maximum length base64_needed_encoded_length()
>>>> + * can handle without overflow.
>>>> + */
>>>> +int
>>>> +base64_encode_max_arg_length(**____)
>>>>
>>>>
>>>> +{
>>>> +#if (SIZEOF_INT == 8)
>>>> + /*
>>>> + 6827690988321067803 -> 9223372036854775805
>>>> + 6827690988321067804 -> -9223372036854775807
>>>> + */
>>>>
>>>>
>>>> Please describe from where the following number comes
>>>> from
>>>> (so that anyone can recalculate and check the number if
>>>> needed):
>>>>
>>>>
>>>> I added more comments where these numbers come from.
>>>>
>>>>
>>>>
>>>> + return 0x5EC0D4C77B03531BLL
>>>> +#else
>>>> + /*
>>>> + 1589695686 -> 2147483646 <tel:2147483646>
>>>> <tel:2147483646 <tel:2147483646>>
>>>>
>>>> + 1589695687 -> -2147483645 <tel:2147483645>
>>>> <tel:2147483645 <tel:2147483645>>
>>>>
>>>>
>>>> + */
>>>> + return 0x5EC0D4C6;
>>>> +#endif
>>>> +}
>>>> +
>>>>
>>>> int
>>>> base64_needed_encoded_length(_**___int
>>>> length_of_data)
>>>> @@ -39,6 +60,21 @@
>>>> base64_needed_encoded_length(_**___int length_
>>>>
>>>>
>>>> }
>>>>
>>>>
>>>> +/**
>>>> + * Maximum length base64_needed_decoded_length()
>>>> + * can handle without overflow.
>>>> + */
>>>> +int
>>>> +base64_decode_max_arg_length(**____)
>>>>
>>>>
>>>> +{
>>>>
>>>>
>>>> The following also need a comment.
>>>>
>>>>
>>>> Also added comments.
>>>>
>>>>
>>>>
>>>> +#if (SIZEOF_INT == 8)
>>>> + return 0x2AAAAAAAAAAAAAAALL;
>>>> +#else
>>>> + return 0x2AAAAAAA;
>>>> +#endif
>>>> +}
>>>> +
>>>> +
>>>>
>>>>
>>>> <cut>
>>>>
>>>> +/**
>>>> + * Get the next character from a base64 encoded
>>>> stream.
>>>> + * Skip spaces, then scan the next base64
>>>> character or a
>>>> pad character
>>>> + * and collect bits into decoder->c.
>>>> + *
>>>> + * @param decoder Pointer to MY_BASE64_DECODER
>>>> + * @return
>>>> + * FALSE on success (a valid base64 encoding
>>>> character found)
>>>> + * TRUE on error (unexpected character or
>>>> unexpected
>>>> end-of-input found)
>>>> + */
>>>>
>>>>
>>>> Add an empty line here.
>>>>
>>>>
>>>> Done.
>>>>
>>>>
>>>>
>>>> It would also be good to have a good description of the
>>>> base64
>>>> format
>>>> in this file so that one can understand what the
>>>> functions are
>>>> supposed to do (or at least a link to the
>>>> specification).
>>>>
>>>> Some thing I figured out by asking bar on IRC:
>>>>
>>>> - We never generate spaces
>>>> - When decoding, we allow spaces anywhere.
>>>> - The end of an 64 base decoded string is always
>>>> padded with '='
>>>> so that the final length is dividable with 4.
>>>>
>>>>
>>>> I added a reference to
>>>> http://en.wikipedia.org/wiki/_**___Base64<http://en.wikipedia.org/wiki/____Base64>
>>>> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
>>>> >
>>>>
>>>> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
>>>> <http://en.wikipedia.org/wiki/**Base64<http://en.wikipedia.org/wiki/Base64>
>>>> >>,
>>>>
>>>> as well all added these your comments in proper places of
>>>> the code.
>>>>
>>>>
>>>>
>>>>
>>>> +static inline my_bool
>>>> +my_base64_decoder_getch(MY___**__BASE64_DECODER
>>>>
>>>> *decoder)
>>>>
>>>>
>>>>
>>>> Remove inline from the above; It's likely to make the
>>>> code
>>>> slower, not
>>>> faster as this is a big function with lot of if's.
>>>>
>>>>
>>>> Removed "inline".
>>>>
>>>>
>>>>
>>>> {
>>>> - char b[3];
>>>> - size_t i= 0;
>>>> - char *dst_base= (char *)dst;
>>>> - char const *src= src_base;
>>>> - char *d= dst_base;
>>>> - size_t j;
>>>> + if (my_base64_decoder_skip_____**
>>>> spaces(decoder))
>>>>
>>>>
>>>> + return TRUE; /* End-of-input */
>>>>
>>>> - while (i < len)
>>>> + if (!my_base64_add(decoder)) /* Valid base64
>>>> character
>>>> found */
>>>> {
>>>> - unsigned c= 0;
>>>> - size_t mark= 0;
>>>> + if (decoder->mark)
>>>> + {
>>>> + /* If we have scanned '=' already, then only
>>>> '=' is
>>>> valid */
>>>> + DBUG_ASSERT(decoder->state == 3);
>>>> + decoder->error= 1;
>>>> + decoder->src--;
>>>> + return TRUE; /* expected '=', but encoding
>>>> character
>>>> found */
>>>> + }
>>>> + decoder->state++;
>>>> + return FALSE;
>>>> + }
>>>>
>>>>
>>>> If you want to have the funtion inline, then move the
>>>> following to
>>>> another not inline function; We don't need to generate
>>>> code for
>>>> this
>>>> for every function call:
>>>>
>>>> +
>>>> + /* Process error */
>>>> + switch (decoder->state)
>>>> + {
>>>> + case 0:
>>>> + case 1:
>>>> + decoder->src--;
>>>> + return TRUE; /* base64 character expected */
>>>> + break;
>>>> +
>>>> + case 2:
>>>> + case 3:
>>>> + if (decoder->src[-1] == '=')
>>>> + {
>>>> + decoder->error= 0; /* Not an error - it's a
>>>> pad
>>>> character */
>>>> + decoder->mark++;
>>>> + }
>>>> + else
>>>> + {
>>>> + decoder->src--;
>>>> + return TRUE; /* base64 character or '='
>>>> expected */
>>>> + }
>>>> + break;
>>>> + default:
>>>> + DBUG_ASSERT(0);
>>>> + return TRUE; /* Wrong state, should not
>>>> happen */
>>>> + }
>>>>
>>>>
>>>> <cut>
>>>>
>>>> Add to the comment for base64_decode that the 'dst'
>>>> buffer has to be
>>>> at least 2 character longer than what is needed to
>>>> decode src_base.
>>>>
>>>>
>>>> I added this comment:
>>>>
>>>> * Note: 'dst' must have sufficient space to store the
>>>> decoded data.
>>>> * Use base64_needed_decoded_length() to calculate the
>>>> correct space
>>>> size.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> +int
>>>> +base64_decode(const char *src_base, size_t len,
>>>> + void *dst, const char **end_ptr, int
>>>> flags)
>>>> +{
>>>> + if (my_base64_decoder_getch(&____**decoder)
>>>> ||
>>>> + my_base64_decoder_getch(&____**decoder)
>>>> ||
>>>> + my_base64_decoder_getch(&____**decoder)
>>>> ||
>>>> + my_base64_decoder_getch(&____**decoder))
>>>>
>>>>
>>>> + break;
>>>>
>>>>
>>>> Generating the error handling with a switch for every
>>>> of the
>>>> above is wasteful. See previous comment.
>>>>
>>>>
>>>> Right. my_base64_decoder_getch() is not inline anymore.
>>>>
>>>>
>>>>
>>>> + /* Return error if there are more non-space
>>>> characters */
>>>> + decoder.state= 0;
>>>> + if (!my_base64_decoder_skip_____**
>>>> spaces(&decoder))
>>>>
>>>>
>>>> + decoder.error= 1;
>>>> +
>>>> if (end_ptr != NULL)
>>>>
>>>>
>>>> Note that base64_needed_decoded_length() used ceil()
>>>> when it's
>>>> not needed.
>>>>
>>>>
>>>> Removed.
>>>>
>>>>
>>>>
>>>>
>>>> <cut>
>>>>
>>>> === modified file 'sql/item_strfunc.cc'
>>>> @@ -451,6 +452,101 @@ void
>>>> Item_func_aes_decrypt::fix____**_length_a
>>>> set_persist_maybe_null(1);
>>>> }
>>>>
>>>> +
>>>> +void Item_func_to_base64::fix_____**
>>>> length_and_dec()
>>>>
>>>>
>>>> +{
>>>> + maybe_null= args[0]->maybe_null;
>>>> + collation.set(default_charset(**____),
>>>>
>>>> DERIVATION_COERCIBLE,
>>>>
>>>> MY_REPERTOIRE_ASCII);
>>>>
>>>>
>>>> Maybe better to cast both arguments to (ulonglong) as
>>>> the rest
>>>> of the code
>>>> is depending on this.
>>>>
>>>> Also, why is base64_encode_max_arg_length() int instead
>>>> of uint or
>>>> even better size_t?
>>>>
>>>>
>>>> This is because base64_encode() and base64_decode() are
>>>> used
>>>> in the replication code using "int" as return value.
>>>>
>>>> When exposing TO_BASE64() and FROM_BASE64() to the SQL
>>>> level,
>>>> I did not want to touch the replication code.
>>>>
>>>>
>>>>
>>>> + if (args[0]->max_length > (uint)
>>>> base64_encode_max_arg_length()**____)
>>>> + {
>>>> + maybe_null= 1;
>>>> + fix_char_length_ulonglong((___**_ulonglong)
>>>> base64_encode_max_arg_length()**____);
>>>> + }
>>>> + else
>>>> + {
>>>> + int length= base64_needed_encoded_length((**
>>>> ____int)
>>>>
>>>>
>>>> args[0]->max_length);
>>>> + DBUG_ASSERT(length > 0);
>>>>
>>>>
>>>> Don't think assert is needed as the function gurantees
>>>> it already.
>>>>
>>>> + fix_char_length_ulonglong((___**_ulonglong)
>>>>
>>>> length - 1);
>>>> + }
>>>> +}
>>>>
>>>>
>>>> <cut>
>>>>
>>>> +String *Item_func_from_base64::val___**
>>>> __str(String
>>>>
>>>> *str)
>>>>
>>>> +{
>>>> + String *res= args[0]->val_str_ascii(str);
>>>> + bool too_long= false;
>>>> + int length;
>>>> + const char *end_ptr;
>>>> +
>>>> + if (!res ||
>>>> + res->length() > (uint)
>>>> base64_decode_max_arg_length() ||
>>>> + (too_long=
>>>> + ((uint) (length=
>>>> base64_needed_decoded_length((**____int)
>>>> res->length())) >
>>>> +
>>>> current_thd->variables.max____**_allowed_packet)) ||
>>>>
>>>>
>>>> + tmp_value.alloc((uint) length) ||
>>>> + (length= base64_decode(res->ptr(), (int)
>>>> res->length(),
>>>> + (char *)
>>>> tmp_value.ptr(),
>>>> &end_ptr, 0)) < 0 ||
>>>> + end_ptr < res->ptr() + res->length())
>>>> + {
>>>>
>>>>
>>>> Shouldn't we get a readable error or warning if the
>>>> string
>>>> contained wrong
>>>> characters?
>>>> Now it looks like we will just return null.
>>>>
>>>> Something like 'Malformed base64 string. Error at
>>>> position %d" would
>>>> be nice.
>>>>
>>>>
>>>> Good idea. I have added a warning. Now it looks clear what
>>>> went wrong
>>>> if FROM_BASE64() returns NULL.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> + null_value= 1; // NULL input, too long input,
>>>> OOM, or
>>>> badly formed input
>>>> + if (too_long)
>>>> + {
>>>> + push_warning_printf(current___**__thd,
>>>> Sql_condition::WARN_LEVEL_____**WARN,
>>>> +
>>>> ER_WARN_ALLOWED_PACKET_____**OVERFLOWED,
>>>> +
>>>> ER(ER_WARN_ALLOWED_PACKET_____**OVERFLOWED),
>>>> func_name(),
>>>> +
>>>> current_thd->variables.max____**
>>>> _allowed_packet);
>>>>
>>>>
>>>> + }
>>>> + return 0;
>>>> + }
>>>> + tmp_value.length((uint) length);
>>>> + null_value= 0;
>>>> + return &tmp_value;
>>>> +}
>>>>
>>>>
>>>> <cut>
>>>>
>>>> <cut>
>>>>
>>>>
>>>> ______________________________**___________________
>>>> Mailing list: https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers>
>>>> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
>>>> >
>>>> Post to : maria-developers@lists.__launc**hpad.net<http://launchpad.net>
>>>> <mailto:maria-developers@**lists.launchpad.net<maria-developers@xxxxxxxxxxxxxxxxxxx>
>>>> >
>>>> <mailto:maria-developers@__lis**ts.launchpad.net<http://lists.launchpad.net>
>>>> <mailto:maria-developers@**lists.launchpad.net<maria-developers@xxxxxxxxxxxxxxxxxxx>
>>>> >>
>>>>
>>>> Unsubscribe : https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers>
>>>> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
>>>> >
>>>> More help : https://help.launchpad.net/__**ListHelp<https://help.launchpad.net/__ListHelp>
>>>>
>>>> <https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp>
>>>> >
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Roberto Spadim
>>>> SPAEmpresarial
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Roberto Spadim
>>>> SPAEmpresarial
>>>>
>>>
>>
>>
>> --
>> Roberto Spadim
>> SPAEmpresarial
>>
>
>
>
> --
> Roberto Spadim
> SPAEmpresarial
>
--
Roberto Spadim
SPAEmpresarial
Follow ups
References