maria-discuss team mailing list archive
-
maria-discuss team
-
Mailing list archive
-
Message #00242
Re: Per-db binlogging
Sergei,
Thank you very much for your reply! A later post by Henrik Ingo seems
to say that I should cc maria-discuss@xxxxxxxxxxxxxxxxxxx. I'll try it
and see if it bounces--I THINK I'm registered with launchpad.net :-)
Let me know if sending to both lists annoys anyone, and I will quit the cc :-)
See my responses below
On Wed, Sep 15, 2010 at 1:53 PM, Sergei Golubchik <serg@xxxxxxxxxxxx> wrote:
> Hi, Weldon!
>
> On Sep 13, Weldon Whipple wrote:
<snip/>
>> Is it plausible to identify all places where a binlog addition might
>> occur? In those places, if the singleton USER_BIN_LOG_MRG exists, ask
>> it if the current DB is being binlogged. If so, send the request to
>> the DB_BIN_LOG instance.
>
> By "addition" you mean when an event is written to binlog?
Yes! (exactly)--any event relating to that database that is written to
binlog. (Hmm ... I wonder if there might ever be more "global" events
that should be written ... Maybe we don't care about them??). Also,
since these databases are "tied" to a user on a box, we probably don't
care about renaming a DB ...
> See where mysql_bin_log.write is called.
Excellent information! (I was hoping it might be that easy ???). If I
succeed in making a DB_BINLOG class derive from MYSQL_BIN_LOG (with an
instance of DB_BINLOG for each database that is currently binlogging),
I could--in theory, at least--just call something like
db_binlog->write () at those places.
>
>> Questions I Have.
>> ----------------
>>
>> 1. I notice that binlog is in the list of engines and plugins
>> (displayed by SHOW ...). The bottom of log.cc has
>> "mysql_declare_plugin(binlog)". Do I need one for db_plugin?
>> (probably not?)
>
> depends.
> binlog needs to be declated a plugin, to be able to pretent being a
> storage engine, to be able to force two phase commit (2PC).
>
> 2PC is used to commit or rollback a transaction atomically in more than
> one engine. And MySQL only uses 2PC when a transaction spans more than
> one engine. But we want a transaction to be committed in the engine and
> written to a binlog - and it should be done atomically too. So, binlog
> pretends to be an engine, pretends to participate in a transaction,
> which forces MySQL to use 2PC which can guarantee that a transaction
> is either committed and written to a binlog - or not committed and not
> written. Reliably, even if MySQL crashes in the middle.
>
> So, if you don't use innodb, or don't use
> innodb_flush_log_at_trx_commit=1 or don't use sync_binlog=1
> you probably don't need to support 2PC as your setup is not crash proof
> anyway.
>
> If you care about innodb<->binlog consistency in the presence of
> crashes, you should support 2PC in your binlog, as explained above.
We DO have a small percentage of innodb databases. If 2PC commits
(rolls back, etc.) atomically to more than one engine, does that mean
TWO (only)? ... or could it happen for three (specifically, innodb,
MYSQL_BIN_LOG's binlog, and DB_BIN_LOG's binlog)? The two kinds of
binlog files would have the same format (except that one is for only
one database). Are they considered two storage engines/plugins?
(I notice that the "mysql_declare_plugin(binlog)" declaration is at
the bottom of log.cc. Most of the other storage engines' corresponding
declarations are in files in the storage subdirectories. If I were to
need a "mysql_declare_plugin(db_binlog)" declaration, could it also go
at the bottom of log.cc? ... Or (since a DB_BINLOG is AKO [a kind of]
MYSQL_BIN_LOG [in the OO sense]), are MYSQL_BIN_LOGs and DB_BINLOGs
both kinds of binlog plugins ...?
>> 2. I've diagrammed the binary log event taxonomy--the 24 classes that
>> inherit from Log_event (in log_event.cc/h). That was my main
>> motivation for wanting to inherit from MYSQL_BIN_LOG.
>
> Was that a question? :)
No. Just justification for my desire to simplify by having DB_BINLOG
extend the MYSQL_BIN_LOG class. (I realize that I MIGHT have to add
"virtual" in front of a method or two--possibly. Hopefully I would
need to make no--or minimal-- changes to MYSQL_BIN_LOG.)
>> 3. On the other hand, I wonder if inheriting from MYSQL_BIN_LOG might
>> be overkill for "master only" binlogging--without the
>> slave/replication part of it.
>
> No, I think inheriting from MYSQL_BIN_LOG is correct.
> At least it's much easier to build on existing binlogging than to
> implement everything from scratch.
Amen!!
>> 4. What version should I be working on? (We're currently running
>> 5.1.47.) What should I check out of bzr/launchpad? (Is that still
>> what we're doing with the changes in ownership?)
>
> What version you plan to use?
> If it'll be 5.1 - you have to work in 5.1, use lp:mysql-server/5.1
> branch:
>
> bzr branch lp:mysql-server/5.1
We TRY to keep as current as possible (at the company). At some point,
we'll need to move to the next major release--whatever it ends up
being. (Then there is also the unknown [to me] of MySQL vs. MariaDB
... :-)
We'll use bzr branch lp:mysql-server/5.1
>
>> 5. I see that THD is involved in binlogging. (Several THD methods are
>> defined in log.cc.) Are there any caveats here?
>
> There is a binlog cache (or buffer) in THD that stores binlog events for
> the current transaction. They all are written to disk together, when a
> transaction is committed. You need to decide what you want to do when
> the current database is changed in the middle of a transaction - what
> binlog the transaction should be written to.
By "the current database is changed", I imagine you refer to the "USE
'database';" command. That's a good question. I THINK we have decided
to ignore (for the present, at least), queries (and also
transactions?) that involve multiple databases. Because we are doing
per-DB binlogging, if multiple databases are involved, we (almost)
consider it an anomaly at our ISP ... (I'm thinking that we might need
to consider THD's binlog cache only for transactions that involve
innodb and no more than one database ...)
>> 6. What files should I use? A new h/cc pair? ... or just add to the
>> hierarchy in log.h/cc?
>
> As you like.
> New pair of files is cleaner. But it may require you to move some
> declarations from log.cc to a header file, or to make some static
> functions extern.
All my prototypes have just expanded what is in log.cc and log.h. (I
came VERY close to creating db_binlog.cc/h, however, which I would
have done in some of my past--non-MySQL--projects). However, noticing
most of the logging hierarchy in the single log.cc/h pair, I decided
the implementors must have done it for a reason, and have decided to
follow suit--for now, at least.
>
>> What I've Implemented So Far (about 13 Prototypes).
>> --------------------------------------------------
>>
>> (PRELIMINARY NOTE:)
>>
>> Our initial plan was to call it "Per-User Binlogging". The user here
>> is a cPanel [control panel] user. cPanel's database naming convention
>> is to have all of a user's database names start with the username,
>> followed by an underscore, followed by a distinctive database
>> name. Thus all the commands above start with
>>
>> <verb> USER ...
>>
>> Originally most of them could end either "DATABASE <db>" or "USER
>> <user>" (with user meaning cPanel user). Thus, in the original plan
>> the first command above was:
>>
>> START USER BINLOG FOR (USER <user> | DATABASE <db>)
>>
>> We have realized that per-database binlogging is all we really need.
>> We imagine that others in our situation might want something similar.
>> Since not everyone uses cPanel, we should forget per[cPanel]-user
>> binlogging, focusing on per-database binlogging.
>
> If every user is a different MySQL account - that it a different row in
> the mysql.user table - than per-user binlogging is much cleaner and
> better defined concept.
>
> A statement can affect more than one database (say, UPDATE db1.t1,
> db2.t2 SET ...), the current database may be changed in the middle of
> transaction, and so on. On the other hand, any statement or a
> transaction is always executed in the context of as single user - you
> cannot change user in the middle of transaction. (*)
>
> (*) changing security context only affect privileges, but not the
> destination binlog.
Actually (and this is because of the way cPanel does things--I have
been "bitten" by this many times!!), each database has a (different)
corresponding user in the mysql.user table (!!!).
So (for example), if a user signs up with our ISP with a user name of
elvispre
every one of his (perhaps hundreds/thousands of) databases will have
names something like:
elvispre_friends
elvispre_earlyyears
elvispre_hitsongs
.
.
.
And (here's the surprising part)--each one of those databases will
have a MATCHING ROW IN mysql.user!!
So, for us to migrate our customer elvispre to another physical
server, we DO (ultimately) need to binlog all of the databases/users
that match/are like 'elvispre\_%'; .
In some of my earlier prototypes, the
START/FLUSH-LOCK/SHOW/UNLOCK/STOP commands have allowed for "like"
clauses at the end. Maybe I need to go that route?
ALL of my prototypes have had binlogging PER DATABASE. I wonder how
difficult/feasible/desirable/useful it would be to create separate
binlog files for databases (or users) whose name matches a "like"
clause. ...
Probably more important: Would this feature be of general interest to
the MySQL user community at large? (... and, if so, should it be
database- or user- or database_pattern- or user_pattern-based).
(My initial idea is that per [single] database/user might be easiest
to define/implement??)
>
> Regards,
> Sergei
Thanks again for your reply!!
Weldon
Follow ups