← Back to team overview

maria-developers team mailing list archive

Review 2 of compressed columns

 

Hi!

Here is a review for the patches marked with
fixup! MDEV-11371 - column compression

> commit f1a3cd40ecc4c144ac32d43ab5c1e88c8e4fae7e
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Mon Jul 31 16:07:39 2017 +0400
>
>     fixup! MDEV-11371 - column compression
>
>     In worst case we have to store data length + 1 for compressed data,
>     we should increase field length by 1 internally so that data is not
>     truncated.
>
> diff --git a/mysql-test/t/column_compression.test b/mysql-test/t/column_compression.test
> index 256e96c..d3f8481 100644
> --- a/mysql-test/t/column_compression.test
> +++ b/mysql-test/t/column_compression.test
> @@ -54,8 +54,8 @@ SHOW CREATE TABLE t1;
>  --cat_file $MYSQLD_DATADIR/test/t1.CSV
>  DROP TABLE t1;
>
> ---echo # Test fields with length equal to data length
> -CREATE TABLE t1(a VARCHAR(10) COMPRESSED);
> +--echo # Test fields that don't fit data
> +CREATE TABLE t1(a VARCHAR(9) COMPRESSED);
>  --error ER_DATA_TOO_LONG
>  INSERT INTO t1 VALUES(REPEAT('a', 10));
>  INSERT INTO t1 VALUES(REPEAT(' ', 10));

Please also insert a correct row, and not only edge cases.
(You may do it already, but I can't see that in the diff)

> @@ -70,3 +70,12 @@ INSERT INTO t1 VALUES(REPEAT(' ', 255));
>  SET column_compression_threshold=DEFAULT;
>  SELECT a, LENGTH(a) FROM t1;
>  DROP TABLE t1;
> +
> +--echo # Corner case: VARCHAR(255) COMPRESSED must have 2 bytes pack length
> +CREATE TABLE t1(a VARCHAR(255) COMPRESSED);
> +SHOW CREATE TABLE t1;
> +SET column_compression_threshold=300;
> +INSERT INTO t1 VALUES(REPEAT('a', 255));
> +SET column_compression_threshold=DEFAULT;
> +SELECT a, LENGTH(a) FROM t1;
> +DROP TABLE t1;

Please insert also a row without column_compression_threshold, just
to cover all cases in the test case.

> diff --git a/sql/field.cc b/sql/field.cc
> index 216fb95..e1dc193 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -7555,7 +7555,8 @@ void Field_varstring::sql_type(String &res) const
>    length= cs->cset->snprintf(cs,(char*) res.ptr(),
>                               res.alloced_length(), "%s(%d)",
>                                (has_charset() ? "varchar" : "varbinary"),
> -                             (int) field_length / charset()->mbmaxlen);
> +                             (int) field_length / charset()->mbmaxlen -
> +                             MY_TEST(compression_method()));
>    res.length(length);
>    if ((thd->variables.sql_mode & (MODE_MYSQL323 | MODE_MYSQL40)) &&
>        has_charset() && (charset()->state & MY_CS_BINSORT))
> @@ -10015,6 +10016,8 @@ void Column_definition::create_length_to_internal_length(void)
>    case MYSQL_TYPE_STRING:
>    case MYSQL_TYPE_VARCHAR:
>      length*= charset->mbmaxlen;
> +    if (sql_type == MYSQL_TYPE_VARCHAR && compression_method())
> +      length++;

Add a comment:

/*
  Compressed columns needs one extra byte to store the compression method.
  This byte is invisible to the end user, but not for the storage engine.
*/

>      key_length= length;
>      pack_length= calc_pack_length(sql_type, length);
>      break;

<cut>

> commit 274ab73a89573554d07a6de1deb846ad47eb1004
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Thu Jul 27 13:06:33 2017 +0400
>
>     fixup! MDEV-11371 - column compression
>
>     Fixed BLOB COMPRESSED -> BLOB COMPRESSED replication.
>
>     The problem was that Field_blob::unpack gets compressed data and calls
>     Field_blob_compressed::store(), which compressed it again. Fixed by calling
>     Field_blob::set_ptr() instead of ::store().
>
>     Note that now no data copy is performed and blob is pointing to data allocated
>     by replication routines. It used to work this way before 06fb8c2d10, but still
>     should be reviewed carefully to make sure no copy is actually needed.
>

patch ok.

> commit 93e00970100503c3f365063d9261119f096bee48
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Thu Jul 27 11:44:20 2017 +0400
>
>     Get rid of Field::do_save_field_metadata()
>
>     It doesn't serve any purpose, but generates extra virtual function call.
>

ok

> commit e1247d4991bb4ebd3f98788c7bfe31a34faf7b69
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Thu Jul 27 10:58:47 2017 +0400
>
>     fixup! MDEV-11371 - column compression
>
>     This is an addition to "Fixed compressed columns to work with partitioning".
>
>     Mroonga calls key_copy() on write/delete, when column is not in read_set.
>     Adjust read_set for the duration of val_str().
>
> diff --git a/sql/field.cc b/sql/field.cc
> index ec82984..ab96ee9 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -7659,8 +7659,12 @@ uint Field_varstring::get_key_image(uchar *buff, uint length,
>  {
>    String val;
>    uint local_char_length;
> +  my_bitmap_map *old_map;
>
> +  old_map= dbug_tmp_use_all_columns(table, table->read_set);
>    val_str(&val, &val);
> +  dbug_tmp_restore_column_map(table->read_set, old_map);
> +

Shouldn't this be in mronga code and not in MariaDB?

Looking at the original code, we didn't check for read_set before either
for get_key_image, so I assume this is ok.

>    local_char_length= val.charpos(length / field_charset->mbmaxlen);
>    if (local_char_length < val.length())
>      val.length(local_char_length);


> commit 94040156d8007663beb6785ca653d41b26758301
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Wed Jul 26 17:28:57 2017 +0400
>
>     fixup! MDEV-11371 - column compression
>
>     Get rid of hardcoded "8" (zlib compression method id) by moving towards
>     support for multiple compression methods.


> commit 6c8b1e563c7456eb0b8ca11a621333237de37ec4
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Tue Jul 25 13:27:07 2017 +0400
>
>     fixup! MDEV-11371 - column compression
>
>     Fixed compressed columns to work with partitioning.
>
>     Reviewer note: this patch reuses get_key_image()/set_key_image() of
>     Field_varstring to avoid code duplication. Cons: this adds virtual function
>     calls val_str() and store().
>

ok

> commit 9c85ad8d105e74baae0b6aba0db01b4d6aac7ba6
> Author: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> Date:   Thu Jul 20 16:57:46 2017 +0400
>
>     fixup! MDEV-11371 - column compression


ok.

The only thing left is now to fix replication to handle compressed
columns. We also have agreed to handle the case where the master and
slave have different COMPRESSED column attributes for VARCHAR and
BLOB.

Regards,
Monty