maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #06304
Re: Review of base64.diff and german2.diff
Just one information for everybody it's a veeeeery verrrrry old bug at
mysql reported by me :)
http://bugs.mysql.com/bug.php?id=18861
Add base64_encode, base64_decode functions
6 Apr 2006 18:00
thanks mariadb team! =) 7 years old bug resolved =)
2013/9/23 Roberto Spadim <roberto@xxxxxxxxxxxxx>
> thanks alexander! :D i will read and test it as soon as possible :)
>
>
> 2013/9/23 Alexander Barkov <bar@xxxxxxxxxxx>
>
>> Hi Roberto,
>>
>> the patch has been pushed to the 10.0 tree:
>>
>> lp:~maria-captains/maria/10.0
>>
>>
>> See here for more details about the 10.0 tree.
>> https://launchpad.net/maria/**10.0 <https://launchpad.net/maria/10.0>
>>
>>
>> On 09/17/2013 07:42 PM, Roberto Spadim wrote:
>>
>>> nice =)
>>> after this patch commited, could report where i could get it? i will try
>>> to rewrite to base32 if possible :) and send at jira
>>> thanks!
>>>
>>>
>>> 2013/9/17 Alexander Barkov <bar@xxxxxxxxxxx <mailto:bar@xxxxxxxxxxx>>
>>>
>>>
>>> Hello Roberto,
>>>
>>>
>>>
>>> On 09/17/2013 06:38 PM, Roberto Spadim wrote:
>>>
>>> hi guys!
>>> one question, as a mariadb user...
>>> base64 will be exposed to sql layer? could i use TO_BASE64() and
>>> FROM_BASE64() at mysql client?
>>> there's a MDEV in JIRA about it, if you could attach it to this
>>> patch
>>> could be nice: MDEV-4387
>>> <https://mariadb.atlassian.__**net/browse/MDEV-4387
>>>
>>> <https://mariadb.atlassian.**net/browse/MDEV-4387<https://mariadb.atlassian.net/browse/MDEV-4387>
>>> >>
>>>
>>>
>>> and with time, implement base32 and base16: MDEV-4800
>>> <https://mariadb.atlassian.__**net/browse/MDEV-4800
>>>
>>> <https://mariadb.atlassian.**net/browse/MDEV-4800<https://mariadb.atlassian.net/browse/MDEV-4800>
>>> >>
>>>
>>>
>>> base64 is very usefull, and many php users use it to
>>> "send/receive"
>>> files from database
>>> base32 from what i know is usefull only for OTP and other
>>> password apps
>>> base16 is a hexadecimal function HEX() and UNHEX()
>>>
>>>
>>> Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level,
>>> so one case use for example things like this:
>>>
>>> INSERT INTO t1 VALUES (FROM_BASE64('base64-string'))**__;
>>>
>>>
>>> or
>>>
>>> SELECT TO_BASE64(column) FROM t1;
>>>
>>>
>>> There are no plans to implement base32.
>>>
>>>
>>> thanks guys
>>>
>>>
>>> 2013/9/17 Alexander Barkov <bar@xxxxxxxxxxx
>>> <mailto:bar@xxxxxxxxxxx> <mailto:bar@xxxxxxxxxxx
>>>
>>> <mailto:bar@xxxxxxxxxxx>>>
>>>
>>>
>>> Hi Monty,
>>>
>>> thanks for review.
>>>
>>> I have addressed most of your suggestions. See the new
>>> version attached,
>>> and the detailed comments inline:
>>>
>>>
>>>
>>> On 09/12/2013 04:32 PM, Michael Widenius wrote:
>>>
>>>
>>> Hi!
>>>
>>> Here is the review for the code that we should put into
>>> 10.0
>>>
>>> First the base64:
>>>
>>> === modified file 'mysql-test/t/func_str.test'
>>> --- mysql-test/t/func_str.test 2013-05-07 11:05:09
>>> +0000
>>> +++ mysql-test/t/func_str.test 2013-08-28 13:14:24
>>> +0000
>>> @@ -1555,3 +1555,118 @@ drop table t1,t2;
>>> --echo # End of 5.5 tests
>>> --echo #
>>>
>>> +
>>> +--echo #
>>> +--echo # Start of 5.6 tests
>>> +--echo #
>>>
>>>
>>> Shouldn't this be start of 10.0 tests ?
>>> (I know that this code is from MySQL 5.6, but still for
>>> us this
>>> is 10.0...)
>>>
>>>
>>> This code is (almost) copy-and-paste from MySQL-5.6.
>>> I think it's a good idea when looking inside a test file
>>> to be able to see which tests are coming from MySQL-5.6,
>>> and which tests are coming from MariaDB-10.0.
>>>
>>> I'd suggest to keep "Start of 5.6 tests" in this particular
>>> case,
>>> and also when merging the tests for the other merged
>>> MySQL-5.6 features.
>>>
>>>
>>>
>>>
>>> === modified file 'mysys/base64.c'
>>> --- mysys/base64.c 2011-06-30 15:46:53 +0000
>>> +++ mysys/base64.c 2013-03-09 06:22:59 +0000
>>> @@ -1,5 +1,4 @@
>>> -/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun
>>> Microsystems,
>>> Inc.
>>> - Use is subject to license terms.
>>> +/* Copyright (c) 2003, 2010, Oracle and/or its
>>> affiliates.
>>> All rights reserved.
>>>
>>>
>>> Removed 'all rights reserved'. You can't have that for
>>> GPL code.
>>>
>>> If there is any new code from us, please add a
>>> copyright message
>>> for the
>>> MariaDB foundation too!
>>>
>>>
>>> Removed 'all rights reserved' and added "MariaDB", as there
>>> are some
>>> our own changes.
>>>
>>>
>>>
>>>
>>> This program is free software; you can
>>> redistribute it
>>> and/or modify
>>> it under the terms of the GNU General Public
>>> License
>>> as published by
>>> @@ -25,6 +24,28 @@ static char base64_table[] =
>>> "ABCDEFGHIJ
>>>
>>> "abcdefghijklmnopqrstuvwxyz"
>>> "0123456789+/";
>>>
>>> +/**
>>> + * Maximum length base64_needed_encoded_length()
>>> + * can handle without overflow.
>>> + */
>>> +int
>>> +base64_encode_max_arg_length(**____)
>>>
>>>
>>> +{
>>> +#if (SIZEOF_INT == 8)
>>> + /*
>>> + 6827690988321067803 -> 9223372036854775805
>>> + 6827690988321067804 -> -9223372036854775807
>>> + */
>>>
>>>
>>> Please describe from where the following number comes
>>> from
>>> (so that anyone can recalculate and check the number if
>>> needed):
>>>
>>>
>>> I added more comments where these numbers come from.
>>>
>>>
>>>
>>> + return 0x5EC0D4C77B03531BLL
>>> +#else
>>> + /*
>>> + 1589695686 -> 2147483646 <tel:2147483646>
>>> <tel:2147483646 <tel:2147483646>>
>>>
>>> + 1589695687 -> -2147483645 <tel:2147483645>
>>> <tel:2147483645 <tel:2147483645>>
>>>
>>>
>>> + */
>>> + return 0x5EC0D4C6;
>>> +#endif
>>> +}
>>> +
>>>
>>> int
>>> base64_needed_encoded_length(_**___int
>>> length_of_data)
>>> @@ -39,6 +60,21 @@
>>> base64_needed_encoded_length(_**___int length_
>>>
>>>
>>> }
>>>
>>>
>>> +/**
>>> + * Maximum length base64_needed_decoded_length()
>>> + * can handle without overflow.
>>> + */
>>> +int
>>> +base64_decode_max_arg_length(**____)
>>>
>>>
>>> +{
>>>
>>>
>>> The following also need a comment.
>>>
>>>
>>> Also added comments.
>>>
>>>
>>>
>>> +#if (SIZEOF_INT == 8)
>>> + return 0x2AAAAAAAAAAAAAAALL;
>>> +#else
>>> + return 0x2AAAAAAA;
>>> +#endif
>>> +}
>>> +
>>> +
>>>
>>>
>>> <cut>
>>>
>>> +/**
>>> + * Get the next character from a base64 encoded
>>> stream.
>>> + * Skip spaces, then scan the next base64
>>> character or a
>>> pad character
>>> + * and collect bits into decoder->c.
>>> + *
>>> + * @param decoder Pointer to MY_BASE64_DECODER
>>> + * @return
>>> + * FALSE on success (a valid base64 encoding
>>> character found)
>>> + * TRUE on error (unexpected character or
>>> unexpected
>>> end-of-input found)
>>> + */
>>>
>>>
>>> Add an empty line here.
>>>
>>>
>>> Done.
>>>
>>>
>>>
>>> It would also be good to have a good description of the
>>> base64
>>> format
>>> in this file so that one can understand what the
>>> functions are
>>> supposed to do (or at least a link to the
>>> specification).
>>>
>>> Some thing I figured out by asking bar on IRC:
>>>
>>> - We never generate spaces
>>> - When decoding, we allow spaces anywhere.
>>> - The end of an 64 base decoded string is always
>>> padded with '='
>>> so that the final length is dividable with 4.
>>>
>>>
>>> I added a reference to
>>> http://en.wikipedia.org/wiki/_**___Base64<http://en.wikipedia.org/wiki/____Base64>
>>> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
>>> >
>>>
>>> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
>>> <http://en.wikipedia.org/wiki/**Base64<http://en.wikipedia.org/wiki/Base64>
>>> >>,
>>>
>>> as well all added these your comments in proper places of
>>> the code.
>>>
>>>
>>>
>>>
>>> +static inline my_bool
>>> +my_base64_decoder_getch(MY___**__BASE64_DECODER
>>>
>>> *decoder)
>>>
>>>
>>>
>>> Remove inline from the above; It's likely to make the
>>> code
>>> slower, not
>>> faster as this is a big function with lot of if's.
>>>
>>>
>>> Removed "inline".
>>>
>>>
>>>
>>> {
>>> - char b[3];
>>> - size_t i= 0;
>>> - char *dst_base= (char *)dst;
>>> - char const *src= src_base;
>>> - char *d= dst_base;
>>> - size_t j;
>>> + if (my_base64_decoder_skip_____**
>>> spaces(decoder))
>>>
>>>
>>> + return TRUE; /* End-of-input */
>>>
>>> - while (i < len)
>>> + if (!my_base64_add(decoder)) /* Valid base64
>>> character
>>> found */
>>> {
>>> - unsigned c= 0;
>>> - size_t mark= 0;
>>> + if (decoder->mark)
>>> + {
>>> + /* If we have scanned '=' already, then only
>>> '=' is
>>> valid */
>>> + DBUG_ASSERT(decoder->state == 3);
>>> + decoder->error= 1;
>>> + decoder->src--;
>>> + return TRUE; /* expected '=', but encoding
>>> character
>>> found */
>>> + }
>>> + decoder->state++;
>>> + return FALSE;
>>> + }
>>>
>>>
>>> If you want to have the funtion inline, then move the
>>> following to
>>> another not inline function; We don't need to generate
>>> code for
>>> this
>>> for every function call:
>>>
>>> +
>>> + /* Process error */
>>> + switch (decoder->state)
>>> + {
>>> + case 0:
>>> + case 1:
>>> + decoder->src--;
>>> + return TRUE; /* base64 character expected */
>>> + break;
>>> +
>>> + case 2:
>>> + case 3:
>>> + if (decoder->src[-1] == '=')
>>> + {
>>> + decoder->error= 0; /* Not an error - it's a
>>> pad
>>> character */
>>> + decoder->mark++;
>>> + }
>>> + else
>>> + {
>>> + decoder->src--;
>>> + return TRUE; /* base64 character or '='
>>> expected */
>>> + }
>>> + break;
>>> + default:
>>> + DBUG_ASSERT(0);
>>> + return TRUE; /* Wrong state, should not happen
>>> */
>>> + }
>>>
>>>
>>> <cut>
>>>
>>> Add to the comment for base64_decode that the 'dst'
>>> buffer has to be
>>> at least 2 character longer than what is needed to
>>> decode src_base.
>>>
>>>
>>> I added this comment:
>>>
>>> * Note: 'dst' must have sufficient space to store the
>>> decoded data.
>>> * Use base64_needed_decoded_length() to calculate the
>>> correct space
>>> size.
>>>
>>>
>>>
>>>
>>>
>>>
>>> +int
>>> +base64_decode(const char *src_base, size_t len,
>>> + void *dst, const char **end_ptr, int
>>> flags)
>>> +{
>>> + if (my_base64_decoder_getch(&____**decoder) ||
>>> + my_base64_decoder_getch(&____**decoder) ||
>>> + my_base64_decoder_getch(&____**decoder) ||
>>> + my_base64_decoder_getch(&____**decoder))
>>>
>>>
>>> + break;
>>>
>>>
>>> Generating the error handling with a switch for every
>>> of the
>>> above is wasteful. See previous comment.
>>>
>>>
>>> Right. my_base64_decoder_getch() is not inline anymore.
>>>
>>>
>>>
>>> + /* Return error if there are more non-space
>>> characters */
>>> + decoder.state= 0;
>>> + if (!my_base64_decoder_skip_____**
>>> spaces(&decoder))
>>>
>>>
>>> + decoder.error= 1;
>>> +
>>> if (end_ptr != NULL)
>>>
>>>
>>> Note that base64_needed_decoded_length() used ceil()
>>> when it's
>>> not needed.
>>>
>>>
>>> Removed.
>>>
>>>
>>>
>>>
>>> <cut>
>>>
>>> === modified file 'sql/item_strfunc.cc'
>>> @@ -451,6 +452,101 @@ void
>>> Item_func_aes_decrypt::fix____**_length_a
>>> set_persist_maybe_null(1);
>>> }
>>>
>>> +
>>> +void Item_func_to_base64::fix_____**
>>> length_and_dec()
>>>
>>>
>>> +{
>>> + maybe_null= args[0]->maybe_null;
>>> + collation.set(default_charset(**____),
>>>
>>> DERIVATION_COERCIBLE,
>>>
>>> MY_REPERTOIRE_ASCII);
>>>
>>>
>>> Maybe better to cast both arguments to (ulonglong) as
>>> the rest
>>> of the code
>>> is depending on this.
>>>
>>> Also, why is base64_encode_max_arg_length() int instead
>>> of uint or
>>> even better size_t?
>>>
>>>
>>> This is because base64_encode() and base64_decode() are used
>>> in the replication code using "int" as return value.
>>>
>>> When exposing TO_BASE64() and FROM_BASE64() to the SQL
>>> level,
>>> I did not want to touch the replication code.
>>>
>>>
>>>
>>> + if (args[0]->max_length > (uint)
>>> base64_encode_max_arg_length()**____)
>>> + {
>>> + maybe_null= 1;
>>> + fix_char_length_ulonglong((___**_ulonglong)
>>> base64_encode_max_arg_length()**____);
>>> + }
>>> + else
>>> + {
>>> + int length= base64_needed_encoded_length((**
>>> ____int)
>>>
>>>
>>> args[0]->max_length);
>>> + DBUG_ASSERT(length > 0);
>>>
>>>
>>> Don't think assert is needed as the function gurantees
>>> it already.
>>>
>>> + fix_char_length_ulonglong((___**_ulonglong)
>>>
>>> length - 1);
>>> + }
>>> +}
>>>
>>>
>>> <cut>
>>>
>>> +String *Item_func_from_base64::val___**
>>> __str(String
>>>
>>> *str)
>>>
>>> +{
>>> + String *res= args[0]->val_str_ascii(str);
>>> + bool too_long= false;
>>> + int length;
>>> + const char *end_ptr;
>>> +
>>> + if (!res ||
>>> + res->length() > (uint)
>>> base64_decode_max_arg_length() ||
>>> + (too_long=
>>> + ((uint) (length=
>>> base64_needed_decoded_length((**____int)
>>> res->length())) >
>>> +
>>> current_thd->variables.max____**_allowed_packet)) ||
>>>
>>>
>>> + tmp_value.alloc((uint) length) ||
>>> + (length= base64_decode(res->ptr(), (int)
>>> res->length(),
>>> + (char *)
>>> tmp_value.ptr(),
>>> &end_ptr, 0)) < 0 ||
>>> + end_ptr < res->ptr() + res->length())
>>> + {
>>>
>>>
>>> Shouldn't we get a readable error or warning if the
>>> string
>>> contained wrong
>>> characters?
>>> Now it looks like we will just return null.
>>>
>>> Something like 'Malformed base64 string. Error at
>>> position %d" would
>>> be nice.
>>>
>>>
>>> Good idea. I have added a warning. Now it looks clear what
>>> went wrong
>>> if FROM_BASE64() returns NULL.
>>>
>>>
>>>
>>>
>>>
>>> + null_value= 1; // NULL input, too long input,
>>> OOM, or
>>> badly formed input
>>> + if (too_long)
>>> + {
>>> + push_warning_printf(current___**__thd,
>>> Sql_condition::WARN_LEVEL_____**WARN,
>>> +
>>> ER_WARN_ALLOWED_PACKET_____**OVERFLOWED,
>>> +
>>> ER(ER_WARN_ALLOWED_PACKET_____**OVERFLOWED),
>>> func_name(),
>>> +
>>> current_thd->variables.max____**_allowed_packet);
>>>
>>>
>>> + }
>>> + return 0;
>>> + }
>>> + tmp_value.length((uint) length);
>>> + null_value= 0;
>>> + return &tmp_value;
>>> +}
>>>
>>>
>>> <cut>
>>>
>>> <cut>
>>>
>>>
>>> ______________________________**___________________
>>> Mailing list: https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers>
>>> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
>>> >
>>> Post to : maria-developers@lists.__launc**hpad.net<http://launchpad.net>
>>> <mailto:maria-developers@**lists.launchpad.net<maria-developers@xxxxxxxxxxxxxxxxxxx>
>>> >
>>> <mailto:maria-developers@__lis**ts.launchpad.net<http://lists.launchpad.net>
>>> <mailto:maria-developers@**lists.launchpad.net<maria-developers@xxxxxxxxxxxxxxxxxxx>
>>> >>
>>>
>>> Unsubscribe : https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers>
>>> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
>>> >
>>> More help : https://help.launchpad.net/__**ListHelp<https://help.launchpad.net/__ListHelp>
>>>
>>> <https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp>
>>> >
>>>
>>>
>>>
>>>
>>> --
>>> Roberto Spadim
>>> SPAEmpresarial
>>>
>>>
>>>
>>>
>>> --
>>> Roberto Spadim
>>> SPAEmpresarial
>>>
>>
>
>
> --
> Roberto Spadim
> SPAEmpresarial
>
--
Roberto Spadim
SPAEmpresarial
Follow ups
References