maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #06303
Re: Review of base64.diff and german2.diff
thanks alexander! :D i will read and test it as soon as possible :)
2013/9/23 Alexander Barkov <bar@xxxxxxxxxxx>
> Hi Roberto,
>
> the patch has been pushed to the 10.0 tree:
>
> lp:~maria-captains/maria/10.0
>
>
> See here for more details about the 10.0 tree.
> https://launchpad.net/maria/**10.0 <https://launchpad.net/maria/10.0>
>
>
> On 09/17/2013 07:42 PM, Roberto Spadim wrote:
>
>> nice =)
>> after this patch commited, could report where i could get it? i will try
>> to rewrite to base32 if possible :) and send at jira
>> thanks!
>>
>>
>> 2013/9/17 Alexander Barkov <bar@xxxxxxxxxxx <mailto:bar@xxxxxxxxxxx>>
>>
>>
>> Hello Roberto,
>>
>>
>>
>> On 09/17/2013 06:38 PM, Roberto Spadim wrote:
>>
>> hi guys!
>> one question, as a mariadb user...
>> base64 will be exposed to sql layer? could i use TO_BASE64() and
>> FROM_BASE64() at mysql client?
>> there's a MDEV in JIRA about it, if you could attach it to this
>> patch
>> could be nice: MDEV-4387
>> <https://mariadb.atlassian.__**net/browse/MDEV-4387
>>
>> <https://mariadb.atlassian.**net/browse/MDEV-4387<https://mariadb.atlassian.net/browse/MDEV-4387>
>> >>
>>
>>
>> and with time, implement base32 and base16: MDEV-4800
>> <https://mariadb.atlassian.__**net/browse/MDEV-4800
>>
>> <https://mariadb.atlassian.**net/browse/MDEV-4800<https://mariadb.atlassian.net/browse/MDEV-4800>
>> >>
>>
>>
>> base64 is very usefull, and many php users use it to
>> "send/receive"
>> files from database
>> base32 from what i know is usefull only for OTP and other
>> password apps
>> base16 is a hexadecimal function HEX() and UNHEX()
>>
>>
>> Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level,
>> so one case use for example things like this:
>>
>> INSERT INTO t1 VALUES (FROM_BASE64('base64-string'))**__;
>>
>>
>> or
>>
>> SELECT TO_BASE64(column) FROM t1;
>>
>>
>> There are no plans to implement base32.
>>
>>
>> thanks guys
>>
>>
>> 2013/9/17 Alexander Barkov <bar@xxxxxxxxxxx
>> <mailto:bar@xxxxxxxxxxx> <mailto:bar@xxxxxxxxxxx
>>
>> <mailto:bar@xxxxxxxxxxx>>>
>>
>>
>> Hi Monty,
>>
>> thanks for review.
>>
>> I have addressed most of your suggestions. See the new
>> version attached,
>> and the detailed comments inline:
>>
>>
>>
>> On 09/12/2013 04:32 PM, Michael Widenius wrote:
>>
>>
>> Hi!
>>
>> Here is the review for the code that we should put into
>> 10.0
>>
>> First the base64:
>>
>> === modified file 'mysql-test/t/func_str.test'
>> --- mysql-test/t/func_str.test 2013-05-07 11:05:09
>> +0000
>> +++ mysql-test/t/func_str.test 2013-08-28 13:14:24
>> +0000
>> @@ -1555,3 +1555,118 @@ drop table t1,t2;
>> --echo # End of 5.5 tests
>> --echo #
>>
>> +
>> +--echo #
>> +--echo # Start of 5.6 tests
>> +--echo #
>>
>>
>> Shouldn't this be start of 10.0 tests ?
>> (I know that this code is from MySQL 5.6, but still for
>> us this
>> is 10.0...)
>>
>>
>> This code is (almost) copy-and-paste from MySQL-5.6.
>> I think it's a good idea when looking inside a test file
>> to be able to see which tests are coming from MySQL-5.6,
>> and which tests are coming from MariaDB-10.0.
>>
>> I'd suggest to keep "Start of 5.6 tests" in this particular
>> case,
>> and also when merging the tests for the other merged
>> MySQL-5.6 features.
>>
>>
>>
>>
>> === modified file 'mysys/base64.c'
>> --- mysys/base64.c 2011-06-30 15:46:53 +0000
>> +++ mysys/base64.c 2013-03-09 06:22:59 +0000
>> @@ -1,5 +1,4 @@
>> -/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun
>> Microsystems,
>> Inc.
>> - Use is subject to license terms.
>> +/* Copyright (c) 2003, 2010, Oracle and/or its
>> affiliates.
>> All rights reserved.
>>
>>
>> Removed 'all rights reserved'. You can't have that for
>> GPL code.
>>
>> If there is any new code from us, please add a
>> copyright message
>> for the
>> MariaDB foundation too!
>>
>>
>> Removed 'all rights reserved' and added "MariaDB", as there
>> are some
>> our own changes.
>>
>>
>>
>>
>> This program is free software; you can
>> redistribute it
>> and/or modify
>> it under the terms of the GNU General Public
>> License
>> as published by
>> @@ -25,6 +24,28 @@ static char base64_table[] =
>> "ABCDEFGHIJ
>>
>> "abcdefghijklmnopqrstuvwxyz"
>> "0123456789+/";
>>
>> +/**
>> + * Maximum length base64_needed_encoded_length()
>> + * can handle without overflow.
>> + */
>> +int
>> +base64_encode_max_arg_length(**____)
>>
>>
>> +{
>> +#if (SIZEOF_INT == 8)
>> + /*
>> + 6827690988321067803 -> 9223372036854775805
>> + 6827690988321067804 -> -9223372036854775807
>> + */
>>
>>
>> Please describe from where the following number comes
>> from
>> (so that anyone can recalculate and check the number if
>> needed):
>>
>>
>> I added more comments where these numbers come from.
>>
>>
>>
>> + return 0x5EC0D4C77B03531BLL
>> +#else
>> + /*
>> + 1589695686 -> 2147483646 <tel:2147483646>
>> <tel:2147483646 <tel:2147483646>>
>>
>> + 1589695687 -> -2147483645 <tel:2147483645>
>> <tel:2147483645 <tel:2147483645>>
>>
>>
>> + */
>> + return 0x5EC0D4C6;
>> +#endif
>> +}
>> +
>>
>> int
>> base64_needed_encoded_length(_**___int
>> length_of_data)
>> @@ -39,6 +60,21 @@
>> base64_needed_encoded_length(_**___int length_
>>
>>
>> }
>>
>>
>> +/**
>> + * Maximum length base64_needed_decoded_length()
>> + * can handle without overflow.
>> + */
>> +int
>> +base64_decode_max_arg_length(**____)
>>
>>
>> +{
>>
>>
>> The following also need a comment.
>>
>>
>> Also added comments.
>>
>>
>>
>> +#if (SIZEOF_INT == 8)
>> + return 0x2AAAAAAAAAAAAAAALL;
>> +#else
>> + return 0x2AAAAAAA;
>> +#endif
>> +}
>> +
>> +
>>
>>
>> <cut>
>>
>> +/**
>> + * Get the next character from a base64 encoded
>> stream.
>> + * Skip spaces, then scan the next base64
>> character or a
>> pad character
>> + * and collect bits into decoder->c.
>> + *
>> + * @param decoder Pointer to MY_BASE64_DECODER
>> + * @return
>> + * FALSE on success (a valid base64 encoding
>> character found)
>> + * TRUE on error (unexpected character or
>> unexpected
>> end-of-input found)
>> + */
>>
>>
>> Add an empty line here.
>>
>>
>> Done.
>>
>>
>>
>> It would also be good to have a good description of the
>> base64
>> format
>> in this file so that one can understand what the
>> functions are
>> supposed to do (or at least a link to the specification).
>>
>> Some thing I figured out by asking bar on IRC:
>>
>> - We never generate spaces
>> - When decoding, we allow spaces anywhere.
>> - The end of an 64 base decoded string is always
>> padded with '='
>> so that the final length is dividable with 4.
>>
>>
>> I added a reference to
>> http://en.wikipedia.org/wiki/_**___Base64<http://en.wikipedia.org/wiki/____Base64>
>> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
>> >
>>
>> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
>> <http://en.wikipedia.org/wiki/**Base64<http://en.wikipedia.org/wiki/Base64>
>> >>,
>>
>> as well all added these your comments in proper places of
>> the code.
>>
>>
>>
>>
>> +static inline my_bool
>> +my_base64_decoder_getch(MY___**__BASE64_DECODER
>>
>> *decoder)
>>
>>
>>
>> Remove inline from the above; It's likely to make the
>> code
>> slower, not
>> faster as this is a big function with lot of if's.
>>
>>
>> Removed "inline".
>>
>>
>>
>> {
>> - char b[3];
>> - size_t i= 0;
>> - char *dst_base= (char *)dst;
>> - char const *src= src_base;
>> - char *d= dst_base;
>> - size_t j;
>> + if (my_base64_decoder_skip_____**spaces(decoder))
>>
>>
>> + return TRUE; /* End-of-input */
>>
>> - while (i < len)
>> + if (!my_base64_add(decoder)) /* Valid base64
>> character
>> found */
>> {
>> - unsigned c= 0;
>> - size_t mark= 0;
>> + if (decoder->mark)
>> + {
>> + /* If we have scanned '=' already, then only
>> '=' is
>> valid */
>> + DBUG_ASSERT(decoder->state == 3);
>> + decoder->error= 1;
>> + decoder->src--;
>> + return TRUE; /* expected '=', but encoding
>> character
>> found */
>> + }
>> + decoder->state++;
>> + return FALSE;
>> + }
>>
>>
>> If you want to have the funtion inline, then move the
>> following to
>> another not inline function; We don't need to generate
>> code for
>> this
>> for every function call:
>>
>> +
>> + /* Process error */
>> + switch (decoder->state)
>> + {
>> + case 0:
>> + case 1:
>> + decoder->src--;
>> + return TRUE; /* base64 character expected */
>> + break;
>> +
>> + case 2:
>> + case 3:
>> + if (decoder->src[-1] == '=')
>> + {
>> + decoder->error= 0; /* Not an error - it's a
>> pad
>> character */
>> + decoder->mark++;
>> + }
>> + else
>> + {
>> + decoder->src--;
>> + return TRUE; /* base64 character or '='
>> expected */
>> + }
>> + break;
>> + default:
>> + DBUG_ASSERT(0);
>> + return TRUE; /* Wrong state, should not happen
>> */
>> + }
>>
>>
>> <cut>
>>
>> Add to the comment for base64_decode that the 'dst'
>> buffer has to be
>> at least 2 character longer than what is needed to
>> decode src_base.
>>
>>
>> I added this comment:
>>
>> * Note: 'dst' must have sufficient space to store the
>> decoded data.
>> * Use base64_needed_decoded_length() to calculate the
>> correct space
>> size.
>>
>>
>>
>>
>>
>>
>> +int
>> +base64_decode(const char *src_base, size_t len,
>> + void *dst, const char **end_ptr, int
>> flags)
>> +{
>> + if (my_base64_decoder_getch(&____**decoder) ||
>> + my_base64_decoder_getch(&____**decoder) ||
>> + my_base64_decoder_getch(&____**decoder) ||
>> + my_base64_decoder_getch(&____**decoder))
>>
>>
>> + break;
>>
>>
>> Generating the error handling with a switch for every
>> of the
>> above is wasteful. See previous comment.
>>
>>
>> Right. my_base64_decoder_getch() is not inline anymore.
>>
>>
>>
>> + /* Return error if there are more non-space
>> characters */
>> + decoder.state= 0;
>> + if (!my_base64_decoder_skip_____**
>> spaces(&decoder))
>>
>>
>> + decoder.error= 1;
>> +
>> if (end_ptr != NULL)
>>
>>
>> Note that base64_needed_decoded_length() used ceil()
>> when it's
>> not needed.
>>
>>
>> Removed.
>>
>>
>>
>>
>> <cut>
>>
>> === modified file 'sql/item_strfunc.cc'
>> @@ -451,6 +452,101 @@ void
>> Item_func_aes_decrypt::fix____**_length_a
>> set_persist_maybe_null(1);
>> }
>>
>> +
>> +void Item_func_to_base64::fix_____**
>> length_and_dec()
>>
>>
>> +{
>> + maybe_null= args[0]->maybe_null;
>> + collation.set(default_charset(**____),
>>
>> DERIVATION_COERCIBLE,
>>
>> MY_REPERTOIRE_ASCII);
>>
>>
>> Maybe better to cast both arguments to (ulonglong) as
>> the rest
>> of the code
>> is depending on this.
>>
>> Also, why is base64_encode_max_arg_length() int instead
>> of uint or
>> even better size_t?
>>
>>
>> This is because base64_encode() and base64_decode() are used
>> in the replication code using "int" as return value.
>>
>> When exposing TO_BASE64() and FROM_BASE64() to the SQL level,
>> I did not want to touch the replication code.
>>
>>
>>
>> + if (args[0]->max_length > (uint)
>> base64_encode_max_arg_length()**____)
>> + {
>> + maybe_null= 1;
>> + fix_char_length_ulonglong((___**_ulonglong)
>> base64_encode_max_arg_length()**____);
>> + }
>> + else
>> + {
>> + int length= base64_needed_encoded_length((**
>> ____int)
>>
>>
>> args[0]->max_length);
>> + DBUG_ASSERT(length > 0);
>>
>>
>> Don't think assert is needed as the function gurantees
>> it already.
>>
>> + fix_char_length_ulonglong((___**_ulonglong)
>>
>> length - 1);
>> + }
>> +}
>>
>>
>> <cut>
>>
>> +String *Item_func_from_base64::val___**__str(String
>>
>> *str)
>>
>> +{
>> + String *res= args[0]->val_str_ascii(str);
>> + bool too_long= false;
>> + int length;
>> + const char *end_ptr;
>> +
>> + if (!res ||
>> + res->length() > (uint)
>> base64_decode_max_arg_length() ||
>> + (too_long=
>> + ((uint) (length=
>> base64_needed_decoded_length((**____int)
>> res->length())) >
>> +
>> current_thd->variables.max____**_allowed_packet)) ||
>>
>>
>> + tmp_value.alloc((uint) length) ||
>> + (length= base64_decode(res->ptr(), (int)
>> res->length(),
>> + (char *)
>> tmp_value.ptr(),
>> &end_ptr, 0)) < 0 ||
>> + end_ptr < res->ptr() + res->length())
>> + {
>>
>>
>> Shouldn't we get a readable error or warning if the
>> string
>> contained wrong
>> characters?
>> Now it looks like we will just return null.
>>
>> Something like 'Malformed base64 string. Error at
>> position %d" would
>> be nice.
>>
>>
>> Good idea. I have added a warning. Now it looks clear what
>> went wrong
>> if FROM_BASE64() returns NULL.
>>
>>
>>
>>
>>
>> + null_value= 1; // NULL input, too long input,
>> OOM, or
>> badly formed input
>> + if (too_long)
>> + {
>> + push_warning_printf(current___**__thd,
>> Sql_condition::WARN_LEVEL_____**WARN,
>> +
>> ER_WARN_ALLOWED_PACKET_____**OVERFLOWED,
>> +
>> ER(ER_WARN_ALLOWED_PACKET_____**OVERFLOWED),
>> func_name(),
>> +
>> current_thd->variables.max____**_allowed_packet);
>>
>>
>> + }
>> + return 0;
>> + }
>> + tmp_value.length((uint) length);
>> + null_value= 0;
>> + return &tmp_value;
>> +}
>>
>>
>> <cut>
>>
>> <cut>
>>
>>
>> ______________________________**___________________
>> Mailing list: https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers>
>> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
>> >
>> Post to : maria-developers@lists.__launc**hpad.net<http://launchpad.net>
>> <mailto:maria-developers@**lists.launchpad.net<maria-developers@xxxxxxxxxxxxxxxxxxx>
>> >
>> <mailto:maria-developers@__lis**ts.launchpad.net<http://lists.launchpad.net>
>> <mailto:maria-developers@**lists.launchpad.net<maria-developers@xxxxxxxxxxxxxxxxxxx>
>> >>
>>
>> Unsubscribe : https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers>
>> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
>> >
>> More help : https://help.launchpad.net/__**ListHelp<https://help.launchpad.net/__ListHelp>
>>
>> <https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp>
>> >
>>
>>
>>
>>
>> --
>> Roberto Spadim
>> SPAEmpresarial
>>
>>
>>
>>
>> --
>> Roberto Spadim
>> SPAEmpresarial
>>
>
--
Roberto Spadim
SPAEmpresarial
Follow ups
References