← Back to team overview

maria-developers team mailing list archive

Re: MDEV-6877 - binlog_row_image implementation advice


Hi Kristian,

I've done more work on binlog row image and I think it is nearly ready to
be merged with 10.1. I still need to import all mysql's test cases, however
none of our rpl tests fail. There are 2 test cases: rpl.rpl_not_null_innodb
rpl.rpl_not_null_myisam that do have a different output when run with
minimal row image. This output however is normal, considering the slave and
master have different table specifications. It is a corner case that is not
treated in MySQL either.

I'd like a review from you for the whole patch, while I work on adding all
the relevant tests from MySQL. To me, it seems like a pretty big change
with lots of implications. I tried to cover all of them. Let me know if you
have any questions or concerns.



On Wed, 29 Apr 2015 at 11:37 Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>

> Vicențiu Ciorbaru <vicentiu@xxxxxxxxxxx> writes:
> > I've gotten to a "sort of" stable state with binlog_row_image. It's not
> > 100% complete as I did not add any tests for the new use cases.
> > Also there are a couple of changes that I am not 100% sure if they are
> > complete or correct.
> >
> > I would like some input on these changes:
> >
> https://github.com/MariaDB/server/compare/10.1-MDEV-6877-binlog_row_image
> I read through the changes (10 commits). Generally, it seems ok. I assume
> you have taken the code from the MySQL tree (5.6 or 5.7?), and only changed
> it as necessary to work with any local MariaDB changes?
> > The unpack_current_row being a very iffy change for me. I understand why
> it
> > is needed in the Update_rows_event::do_exec_row code, as the after image
> > has other columns that get written as opposed to the before image. Thing
> is
> > I've just made it in an attempt to fix a HA_ERR_SAME_KEY error that I
> kept
> > getting on the slave and it seems to have worked.
> I don't understand here.
> You wrote "I've just made it" - why could you not just use the same code
> that MySQL uses?
> But I think you meant something else. Maybe this is caused by some changes
> done in one of the trees but not the other, and that needs to be
> merged/adapted?
> One thing done in MariaDB is that most code now uses rpl_group_info rather
> than Relay_log_info, due to parallel replication. MySQL may have done
> something conceptually similar, but different implementation.
> In general, it would seem best to follow MySQL/Oracle code as closely as
> possible. We should not have changed the code for applying row events much
> compared to MySQL. However, there may of course be other things that MySQL
> did that we have not yet merged.
> > Do you think the implementation that MySQL chose is the correct one? The
> I suppose the MySQL one should be correct. In general, there does not seem
> to be much going on in the patches (though I'm sure digging up exactly
> which
> code to carry over has taken a lot of effort on your part). It seems most
> of
> the required functionality is already in the code. Though it is hard for me
> to see if anything might be missing.
> Maybe there are some test cases from MySQL that can be taken over?
> Or maybe they run the existing test suite with
> --mysqld=--binlog-row-image={NOBLOB,MINIMAL,FULL}. This is something that
> we
> would benefit a lot from doing also, I think...
> > changes are mostly the same as the MySQL variant, except they have V2 ROW
> > LOG EVENTS which seem to pass some extra info which we don't use (need?).
> I think the extra info in V2 rows is only used by NDB, something we do not
> have in our tree. And I think reading V2 events were ported by Serg to
> MariaDB already (just ignoring the extra info). So I think generally the V2
> can just be ignored, except to be able to read those events...
> Some answers to specific comments you put on github:
> > This code is not used and prevents compilation when changing the
> prototype
> > of binlog_write_row. As discussed with Sergei Golubchik, this whole code
> > is up for the chopping block after this feature is complete.
> Yes. Binlog injector is for NDB, which we do not have in our tree.
> > Same of old log events, but I don't know exactly where these are used. I
> > could use a bit of help with this part.
> I think log_event_old.{h,cc} is for backwards compatibility with really old
> versions of row-based replication. It may even be for event format that was
> never released (only alpha/beta). I think just updating the code without
> further action should be ok. I am not aware of a way to test this part of
> the code.
> > I actually don't understand why we need this. This was copy pasted from
> > MySQL and is on my TODO list to figure out why.
> Well, THD::binlog_prepare_row_images() changes the table->read_set to point
> to table->tmp_set, so I suppose that is why they restore it...
> > This doesn't seem to make any sense for me. This is what MySQL does but
> > does ha_update_row care?
> Yeah, I don't understand either, but I suppose the comment "Temporary fix
> to
> find out why it fails [/Matz]" is a clue that it is some debugging code,
> possibly forgotten (Mats Kindahl is one of the main replication
> developers). You could try asking him in a mail (mats.kindahl@xxxxxxxxxx).
> > There are some things that need to be fixed within that patch, but I'm
> > looking to know if it's going in the right direction or not, before
> > investing any more time into it.
> It seems ok to me.
> Hope this helps,
>  - Kristian.

Follow ups