Thread Previous • Date Previous • Date Next • Thread Next |
On 09/17/2013 08:52 PM, Alexander Barkov wrote:
Hi Monty, On 09/17/2013 08:12 PM, Michael Widenius wrote:Hi!"Alexander" == Alexander Barkov <bar@xxxxxxxxxxx> writes:Alexander> Hi Monty, Alexander> thanks for review. Alexander> I have addressed most of your suggestions. See the new version attached, Alexander> and the detailed comments inline: Alexander> On 09/12/2013 04:32 PM, Michael Widenius wrote:Hi! Here is the review for the code that we should put into 10.0 First the base64:=== modified file 'mysql-test/t/func_str.test' --- mysql-test/t/func_str.test 2013-05-07 11:05:09 +0000 +++ mysql-test/t/func_str.test 2013-08-28 13:14:24 +0000 @@ -1555,3 +1555,118 @@ drop table t1,t2; --echo # End of 5.5 tests --echo # + +--echo # +--echo # Start of 5.6 tests +--echo #Shouldn't this be start of 10.0 tests ? (I know that this code is from MySQL 5.6, but still for us this is 10.0...)Alexander> This code is (almost) copy-and-paste from MySQL-5.6. Alexander> I think it's a good idea when looking inside a test file Alexander> to be able to see which tests are coming from MySQL-5.6, Alexander> and which tests are coming from MariaDB-10.0. Alexander> I'd suggest to keep "Start of 5.6 tests" in this particular case, Alexander> and also when merging the tests for the other merged MySQL-5.6 features. As long a this is an exact copy paste it's ok. <cut>It would also be good to have a good description of the base64 format in this file so that one can understand what the functions are supposed to do (or at least a link to the specification). Some thing I figured out by asking bar on IRC: - We never generate spaces - When decoding, we allow spaces anywhere. - The end of an 64 base decoded string is always padded with '=' so that the final length is dividable with 4.Alexander> I added a reference to http://en.wikipedia.org/wiki/Base64, Alexander> as well all added these your comments in proper places of the code. Thanks. That will make the code much easier to understand. I just read trough the definition and noticed that some versions doesn't use '=' padding. Should we allow not '=' padding in our decoder too? (I think it's best to always pad on encoding).http://dev.mysql.com/doc/refman/5.6/en/string-functions.html#function_to-base64 says: > Different base-64 encoding schemes exist. These are the encoding and > decoding rules used by TO_BASE64() and FROM_BASE64(): > ... > * Encoded output consists of groups of 4 printable characters. Each 3 > bytes of the input data are encoded using 4 characters. If the last > group is incomplete, it is padded with '=' characters to a length of > 4. So we always pad on encoding. So do most of the modern pieces of software. So does PHP: http://php.net/manual/en/function.base64-encode.php So does Linux "bas64" program: $ echo "123" |base64 MTIzCg== This is why we always require pad characters on decoding. I don't think we should accept not-padded values.
Just checked PostgreSQL. They also pad on encode, and do not accept non-properly padded values on decode: bar=> select decode('MQ==','base64'); decode -------- \x31 (1 row) bar=> select decode('MQ=','base64'); ERROR: invalid end sequence bar=> select decode('MQ','base64'); ERROR: invalid end sequence
<cut>=== modified file 'sql/item_strfunc.cc' @@ -451,6 +452,101 @@ void Item_func_aes_decrypt::fix_length_a set_persist_maybe_null(1); } + +void Item_func_to_base64::fix_length_and_dec() +{ + maybe_null= args[0]->maybe_null; + collation.set(default_charset(), DERIVATION_COERCIBLE, MY_REPERTOIRE_ASCII);Maybe better to cast both arguments to (ulonglong) as the rest of the code is depending on this. Also, why is base64_encode_max_arg_length() int instead of uint or even better size_t?Alexander> This is because base64_encode() and base64_decode() are used Alexander> in the replication code using "int" as return value. Alexander> When exposing TO_BASE64() and FROM_BASE64() to the SQL level, Alexander> I did not want to touch the replication code. ok. We should however at some case fix that. <cut>+String *Item_func_from_base64::val_str(String *str) +{ + String *res= args[0]->val_str_ascii(str); + bool too_long= false; + int length; + const char *end_ptr; + + if (!res || + res->length() > (uint) base64_decode_max_arg_length() || + (too_long= + ((uint) (length= base64_needed_decoded_length((int) res->length())) > + current_thd->variables.max_allowed_packet)) || + tmp_value.alloc((uint) length) || + (length= base64_decode(res->ptr(), (int) res->length(), + (char *) tmp_value.ptr(), &end_ptr, 0)) < 0 || + end_ptr < res->ptr() + res->length()) + {Shouldn't we get a readable error or warning if the string contained wrong characters? Now it looks like we will just return null. Something like 'Malformed base64 string. Error at position %d" would be nice.Alexander> Good idea. I have added a warning. Now it looks clear what went wrong Alexander> if FROM_BASE64() returns NULL. Thanks! Regards, Monty
Thread Previous • Date Previous • Date Next • Thread Next |