← Back to team overview

maria-developers team mailing list archive

Re: MDEV-6877 - binlog_row_image implementation advice


Vicențiu Ciorbaru <vicentiu@xxxxxxxxxxx> writes:

> I've gotten to a "sort of" stable state with binlog_row_image. It's not
> 100% complete as I did not add any tests for the new use cases.
> Also there are a couple of changes that I am not 100% sure if they are
> complete or correct.
> I would like some input on these changes:
> https://github.com/MariaDB/server/compare/10.1-MDEV-6877-binlog_row_image

I read through the changes (10 commits). Generally, it seems ok. I assume
you have taken the code from the MySQL tree (5.6 or 5.7?), and only changed
it as necessary to work with any local MariaDB changes?

> The unpack_current_row being a very iffy change for me. I understand why it
> is needed in the Update_rows_event::do_exec_row code, as the after image
> has other columns that get written as opposed to the before image. Thing is
> I've just made it in an attempt to fix a HA_ERR_SAME_KEY error that I kept
> getting on the slave and it seems to have worked.

I don't understand here.
You wrote "I've just made it" - why could you not just use the same code
that MySQL uses?
But I think you meant something else. Maybe this is caused by some changes
done in one of the trees but not the other, and that needs to be

One thing done in MariaDB is that most code now uses rpl_group_info rather
than Relay_log_info, due to parallel replication. MySQL may have done
something conceptually similar, but different implementation.

In general, it would seem best to follow MySQL/Oracle code as closely as
possible. We should not have changed the code for applying row events much
compared to MySQL. However, there may of course be other things that MySQL
did that we have not yet merged.

> Do you think the implementation that MySQL chose is the correct one? The

I suppose the MySQL one should be correct. In general, there does not seem
to be much going on in the patches (though I'm sure digging up exactly which
code to carry over has taken a lot of effort on your part). It seems most of
the required functionality is already in the code. Though it is hard for me
to see if anything might be missing.

Maybe there are some test cases from MySQL that can be taken over?
Or maybe they run the existing test suite with
--mysqld=--binlog-row-image={NOBLOB,MINIMAL,FULL}. This is something that we
would benefit a lot from doing also, I think...

> changes are mostly the same as the MySQL variant, except they have V2 ROW
> LOG EVENTS which seem to pass some extra info which we don't use (need?).

I think the extra info in V2 rows is only used by NDB, something we do not
have in our tree. And I think reading V2 events were ported by Serg to
MariaDB already (just ignoring the extra info). So I think generally the V2
can just be ignored, except to be able to read those events...

Some answers to specific comments you put on github:

> This code is not used and prevents compilation when changing the prototype
> of binlog_write_row. As discussed with Sergei Golubchik, this whole code
> is up for the chopping block after this feature is complete.

Yes. Binlog injector is for NDB, which we do not have in our tree.

> Same of old log events, but I don't know exactly where these are used. I
> could use a bit of help with this part.

I think log_event_old.{h,cc} is for backwards compatibility with really old
versions of row-based replication. It may even be for event format that was
never released (only alpha/beta). I think just updating the code without
further action should be ok. I am not aware of a way to test this part of
the code.

> I actually don't understand why we need this. This was copy pasted from
> MySQL and is on my TODO list to figure out why.

Well, THD::binlog_prepare_row_images() changes the table->read_set to point
to table->tmp_set, so I suppose that is why they restore it...

> This doesn't seem to make any sense for me. This is what MySQL does but
> does ha_update_row care?

Yeah, I don't understand either, but I suppose the comment "Temporary fix to
find out why it fails [/Matz]" is a clue that it is some debugging code,
possibly forgotten (Mats Kindahl is one of the main replication
developers). You could try asking him in a mail (mats.kindahl@xxxxxxxxxx).

> There are some things that need to be fixed within that patch, but I'm
> looking to know if it's going in the right direction or not, before
> investing any more time into it.

It seems ok to me.

Hope this helps,

 - Kristian.

Follow ups