maria-developers team mailing list archive

Re: [MariaDB/server] Compressed binary log (#247)

 

GCSAdmin <notifications@xxxxxxxxxx> writes:

> We add new event types to support compressing the binlog as follows:
>      QUERY_COMPRESSED_EVENT,
>      WRITE_ROWS_COMPRESSED_EVENT_V1,
>      UPDATE_ROWS_COMPRESSED_EVENT_V1,
>      DELETE_ROWS_COMPRESSED_EVENT_V1,
>      WRITE_ROWS_COMPRESSED_EVENT,
>      UPDATE_ROWS_COMPRESSED_EVENT,
>      DELETE_ROWS_COMPRESSED_EVENT
>

Thanks for the patch!

Overall, I'm much impressed with the quality, I see a lot of attention to
detail. Below, I have a number of comments and suggestions, but they are all
minor for a patch of this complexity.

A number of the comments are style fixes or simple changes, and I found it
easier to just make the changes to the source for you to review. I hope that
is ok. I rebased the series without the intermediate merge, to simplify
review to a single patch. The rebase with my suggested changes on top is here:

  https://github.com/knielsen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

Please check the changes I made and let me know if you disagree with
anything or I misunderstood something. Changes are mostly:

 - A bunch of code style (indentation, lines <= 80 chars, clarify comments,
   and so on).

 - Change LOG_EVENT_IS_QUERY() etc. from macros to static inline functions.

 - check_event_type() updated (new code merged into 10.2 recently).

 - Minor .result file update, also from code merged into 10.2 recently.

I also have the following questions/suggestions:

1. I think the code should sanity-check the compressed event header, to
ensure that it does not access outside the event buffer on corrupt
events. E.g. if lenlen is 7 but there are fewer than 7 bytes available in the
event. I know that replication code in general is not robust against corrupt
events, but it seems best that new code avoids overflowing buffers in case
of corrupt event data.
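To make the point concrete, here is a minimal sketch of the kind of check I
mean (the helper name and exact placement are illustrative, not part of the
patch):

```cpp
#include <cstdint>

// Illustrative bounds check: before reading the "Record Original Length"
// field, verify the event buffer really contains the compressed header
// byte plus the lenlen bytes that follow it.
static bool compressed_header_fits(const unsigned char *buf,
                                   uint32_t event_len)
{
  if (event_len < 1)                 // need at least the header byte
    return false;
  uint32_t lenlen= buf[0] & 0x07;    // bytes of "Record Original Length"
  return event_len >= 1 + lenlen;    // header + length field must fit
}
```

Something like this, called before touching the length field, would make the
new code safe even when handed a truncated event.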

2. There are some places where the compressed event types are not added to
switch() or if() statements, mainly related to minimal binlog images, I
think: Rows_log_event::read_write_bitmaps_cmp(),
Rows_log_event::get_data_size(), Rows_log_event::do_apply_event(), 
Rows_log_event::write_data_body(). I *think* the reason for that is that the
SQL thread never sees the compressed events (they are uncompressed by the IO
thread), but I would like your confirmation that my understanding is
correct.

3. Did you think about testing that BINLOG statements with compressed events
can execute correctly? This happens with mysqlbinlog | mysql, when there are
(compressed) row-based events in the binlog. I'm wondering if this could
expose compressed events to code that normally runs in the SQL thread and
expects to have events uncompressed for it by the IO thread?

A few detailed comments on the patch below (most comments are done as
changes in the above-linked git branch).

Otherwise, after answers to the above 3 questions, I think the patch looks
good to go into 10.2.

 - Kristian.

-----------------------------------------------------------------------

> +#define BINLOG_COMPRESSED_HEADER_LEN 1
> +#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
> +/**
> +  Compressed Record
> +    Record Header: 1 Byte
> +                   0 Bit: Always 1, mean compressed;
> +                   1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib
> +                   4-7 Bit: Bytes of "Record Original Length"
> +    Record Original Length: 1-4 Bytes
> +    Compressed Buf:  

I did not understand this. Why "reversed"? It seems to be bits 0-2 that have
the bytes of "Record Original Length" and bit 7 that has the '1' bit meaning
compression enabled? Maybe this can be clarified.
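For reference, this is how I read the header byte from the code in
binlog_buf_uncompress() (it tests src[0] & 0x80 and src[0] & 0x07), sketched
as a small decoder; the struct and function names are mine, not from the
patch:

```cpp
#include <cstdint>

// Sketch of the header-byte layout as the patch's code actually reads it:
// bit 7 is the "compressed" flag, bits 0-2 give the byte count of the
// "Record Original Length" field.
struct Compressed_header
{
  bool compressed;   // bit 7: always 1 for compressed events
  uint32_t lenlen;   // bits 0-2: bytes of "Record Original Length"
};

static Compressed_header decode_compressed_header(unsigned char b)
{
  Compressed_header h;
  h.compressed= (b & 0x80) != 0;
  h.lenlen= b & 0x07;
  return h;
}
```

If this reading is right, the comment's bit numbering should be updated to
match.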


> +int query_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
> +						   const char *src, char* buf, ulong buf_size, bool* is_malloc,
> +               char **dst, ulong *newlen)
> +{
> +  ulong len = uint4korr(src + EVENT_LEN_OFFSET);
> +	const char *tmp = src;
> +
> +  DBUG_ASSERT((uchar)src[EVENT_TYPE_OFFSET] == QUERY_COMPRESSED_EVENT);
> +
> +	uint8 common_header_len= description_event->common_header_len;
> +	uint8 post_header_len= description_event->post_header_len[QUERY_COMPRESSED_EVENT-1];
> +
> +	tmp += common_header_len;
> +
> +	uint db_len = (uint)tmp[Q_DB_LEN_OFFSET];
> +	uint16 status_vars_len= uint2korr(tmp + Q_STATUS_VARS_LEN_OFFSET);
> +
> +	tmp += post_header_len + status_vars_len + db_len + 1;
> +
> +	uint32 un_len = binlog_get_uncompress_len(tmp);
> +	*newlen = (tmp - src) + un_len;

This is one place I would like to see a check on the data in the
event. Maybe just that 'len' is larger than 1+(tmp[0]&7). Or maybe simply
that len is at least 10, the minimal size for compressed events.
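As a sketch of what I mean, assuming 'tmp' points at the compressed record
inside the event (the helper name and the exact constant are illustrative):

```cpp
#include <cstdint>

// Illustrative version of the suggested check: the event length must
// cover everything before the compressed record, plus the header byte,
// plus the lenlen bytes of "Record Original Length". The >= 10 clause
// reflects the minimal size mentioned for compressed events.
static bool compressed_event_len_ok(uint32_t len, uint32_t record_offset,
                                    unsigned char header_byte)
{
  uint32_t lenlen= header_byte & 0x07;
  return len >= 10 && len >= record_offset + 1 + lenlen;
}
```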

> +int row_log_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
> +                             const char *src, char* buf, ulong buf_size, bool* is_malloc,
> +                             char **dst, ulong *newlen)
> +{

> +	uint32 un_len = binlog_get_uncompress_len(tmp);
> +	*newlen = (tmp - src) + un_len;

Another place where I'd like a check against looking outside the event
buffer.

> +int binlog_buf_uncompress(const char *src, char *dst, uint32 len, uint32 *newlen)
> +{
> +	if((src[0] & 0x80) == 0)
> +	{
> +		return 1;
> +	}
> +
> +	uint32 lenlen = src[0] & 0x07;
> +    uLongf buflen  = *newlen;
> +	if(uncompress((Bytef *)dst, &buflen, (const Bytef*)src + 1 + lenlen, len) != Z_OK)

Again, a check on the length is needed here.

