← Back to team overview

maria-developers team mailing list archive

Re: [MariaDB/server] Compressed binary log (#247)

 

Simon Mudd <simon.mudd@xxxxxxxxxxx> writes:

> I have not looked at the code in detail but a couple of questions
> related to the implementation come to mind:

The author is probably the best to answer, but I can tell what I know from
reviewing the patch:

> (1) Why not use a single compressed event type and encapsulate the whole original event inside ?

This would result in higher overhead on each event. There is a fixed header
to each event, which cannot be compressed (without significantly more
changes to the code, at least). Thus, a single event encapsulating another
full compressed event would require two headers, one uncompressed and one
compresed.

There might be other pros and cons to each method, this was just what
occured to me.

> Is it not possible to have a "generic compressed event”? Mark mentions
> allowing selection of the compression type and that seems useful and I
> would expect the the compressed event to have this information and the
> compressed content.

The current patch has reserved three bits for the compression method. Thus,
it would be possible in the future to add more compression types (up to a
maximum of 8, at least).

Actually, it would be good if the code would check the compression type and
fail if it is unknown. This will help get proper errors on old slaves if new
compression types are added in the future.

> (2) Senders would try to compress the event as requested. If the
> compressed event is not any smaller then do not bother compressing it,
> just send the original event.

This is not in the current patch, but it could be added easily enough. It
might not be needed though, the log_bin_compress_min_len kind of serves a
similar purpose.

> (3) Receivers on receiving a compressed event would decompress it and
> then process the extracted event in the same way it’s processed
> now. Most of the code should haven no idea the event had been
> compressed.

That's basically how the patch is written.

> Perhaps I am missing something but the need to compress events
> individually probably requires quite a bit more logic and will not be
> as extensible.

Perhaps. In any case, the patch at hand is based on individually compressed
events. Even with individual event types, the patch is nice and clean and
leaves much of the existing logic unchanged.

My guess is that this will primarily be useful for row-based binlogging,
where reasonable compression rates can be expected in many cases where
row-based binlogging has particularly high overhead compared to
statement-based. Per-event compression is in any case of limited use for
events that are small in size.

 - Kristian.


Follow ups

References