← Back to team overview

maria-developers team mailing list archive

WL40: Current status & issues to discuss

 

Bo suggested me to publish my last WL#40 status report on the dev list.
Meaning to get a feedback (critical notes, tips, suggestions, etc) I
rewrote the report to make it more readable and included some thoughts
for discussion.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

WL40. Add a mysqlbinlog option to filter updates to certain tables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Selected option : "2.5 Extend Query Events With Tables Info".

  1. Extending Query Events With Tables Info
  2. Example of the Extended Query Event Binary Format
  3. Adding binlog-with-tables-info Option
  4. How Tables Info is Formed
     4.1. CREATE TABLE statement
     4.2. RENAME TABLE Statement
  5. TODO: Is Tables Info Enough?
  6. TODO: How Renames are to be filtered?
  7. Other TODO's

1. Extending Query Events With Tables Info
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reason: to do filtering based on "*-table" options we need to have a
list of all tables refered to by the query described by Query event.

The Query event binary format is extended with a tables info, i.e. the list of
tables involved in the query execution. More precisely, two following fields are
added to the Query event header:

  * Query length (2 bytes). (Reason: in usual format, Query event doesn't include
    this field because the query is considered to extent to the end of the event
    what is not the case for the extended format).

  * Tables info length (2 bytes). Length of tables info contained in the data
    part of the event.

and the following data is added to the Query event data part:

  * Tables info (tables_info_len bytes). List of 4-tuples
    (db_len, db, table_name_len, table_name) for each table involved in the
    execution of the query described by this event.
    db_len may be zero meaning that this is the default db.

Note. In the extended format the default db is written without trailing zero
which is redundant since its length is already known from the event header.

2. Example of the Extended Query Event Binary Format
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

************************
QUERY_EVENT [2]
************************
00000294 | 2F A1 00 4B | time_when  = 1258332463
00000298 | 02          | event_type = 2
00000299 | 64 00 00 00 | server_id  = 100
0000029D | 75 00 00 00 | event_len  = 117
000002A1 | 09 03 00 00 | log_pos    = 00000309
000002A5 | 10 00       | flags      = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
------------------------
000002A7 | 01 00 00 00 | proxy_id        = 1
000002AB | C9 00 00 00 | exec_time       = 201
000002AF | 04          | dbase_len       = 4
000002B0 | 00 00       | error_code      = 0
000002B2 | 1A 00       | status_vars_len = 26
000002B4 | 2B 00       | query_len       = 43   (*)
000002B6 | 08 00       | tables_info_len = 8    (*)
------------------------
Status Variables
------------------------
000002B8 | 00          | FLAGS2:
000002B9 | 00 40 00 00 | 0x00004000
000002BD | 01          | SQL_MODE:
000002BE | 00 00 00 00 | 0x0000000000000000
000002C2 | 00 00 00 00 | 
000002C6 | 06          | CATALOG_NZ:
000002C7 | 03          | catalog_len = 3
000002C8 | 73 74 64    | catalog = "std"
000002CB | 04          | CHARSET_CODE:
000002CC | 08 00       | charset_client    = 8
000002CE | 08 00       | collation_connect = 8
000002D0 | 08 00       | collation_server  = 8
------------------------
000002D2 | 74 65 73 74 | dbase = "test"         (**)
000002D6 | 52 45 4E 41 | query = "RENAME TABLE t1 TO tmp, t2 TO t1, tmp TO t2"
000002DA | 4D 45 20 54 |
000002DE | 41 42 4C 45 |
000002E2 | 20 74 31 20 |
000002E6 | 54 4F 20 74 |
000002EA | 6D 70 2C 20 |
000002EE | 74 32 20 54 |
000002F2 | 4F 20 74 31 |
000002F6 | 2C 20 74 6D |
000002FA | 70 20 54 4F |
000002FE | 20 74 32    | 
------------------------
Tables info                                     (*)
------------------------
00000301 | 00          | dbase_len = 0          (***)
00000302 | 02          | table_len = 2
00000303 | 74 31       | table_str = "t1"
00000305 | 00          | dbase_len = 0          (***)
00000306 | 02          | table_len = 2
00000307 | 74 32       | table_str = "t2"
************************

(*)   - Added to the new Query event binary format.
(**)  - Modified: in the new Query event binary format database name does
        not contain trailing zero.
(***) - Database length = 0 means that it is the default database.

3. Adding binlog-with-tables-info Option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
New --binlog-with-tables-info option is added. When set, all Query events are
to be written to the extended format (with tables info). Reasons for adding
the option:

  * Adding tables info leads binary log to grow in size what is superfluous
    if filtering within mysqlbinlog is not supposed to be used.

  * If the current implementation of the extended format will appear to have
    bugs switching off this option will guarantee that replication will work
    as before that.
    
The option is defined as GLOBAL and read-only. Reason:

  * The Query events format used in the binary log can be recognized only
    by its length contained in the Format description event at the beginning
    of the binary log. Because of that, two different Query event formats
    can not be mixed within one binary log.

4. How Tables Info is Formed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Forming tables info for some types of statement is not quite straightforward.
Below is two examples of those "not-straightforward" cases.

4.1. CREATE TABLE statement
~~~~~~~~~~~~~~~~~~~~~~~~~~~
  CREATE [TEMPORARY] TABLE [IF NOT EXISTS] t (...)
  CREATE [TEMPORARY] TABLE [IF NOT EXISTS] t LIKE t1
  CREATE [TEMPORARY] TABLE [IF NOT EXISTS] t SELECT (t1, ..., tn)

Implementation of forming tables info in this case is based on the following
rules of binlogging CREATE TABLE statements:

  SBR. The original CREATE TABLE statement is always binlogged as is.
       In this case
          tables_info = { t }, { t, t1 }, or { t, t1, ..., tn }
       respectively.

  RBR. In case of RBR we have the following specifics of binlogging CREATE
       TABLE statement:
       
       * temporary tables are not binlogged;
       * CREATE TABLE t LIKE t1 with temporary t1 is transformed to the
         ordinary CREATE TABLE (...);
       * CREATE TABLE SELECT is always transformed to the ordinary
         CREATE TABLE (...) (with possible subsequent INSERT statements).

       So for RBR tables info is formed as follows:

       IF t is temporary
         // the statement will not be binlogged
         tables_info = empty
       IF CREATE TABLE t (...)
         // the statement will be binlogged as is
         tables_ info = {t}
       IF CREATE TABLE t LIKE t1
         IF t1 is temporary 
             // the statement will be binlogged as CREATE TABLE t (...)
             tables_info = {t}
         ELSE
             // the statement will be binlogged as is
             tables_info = {t, t1}
       IF CREATE TABLE t SELECT (t1, ..., tn)
         // the statement will be binlogged as CREATE TABLE (...)
         tables_info = {t}

Note. In case of RBR, CREATE TABLE SELECT is binlogged as the following
sequence of events:
  BEGIN:        Query event
  CREATE TABLE: Query event
  INSERT's:     Write row events
  COMMIT        Query event
BEGIN and COMMIT in this case are supplied with the same tables_info = {t}
as the CREATE TABLE statement. Also INSERT's are preceded by the Table map
event refering to the same table t. So filtering of all this events
(including BEGIN and COMMIT) will be based on the same tables info and
either all of them will pass the filter or all of them will be filtered out.

4.2. RENAME TABLE Statement
~~~~~~~~~~~~~~~~~~~~~~~~~~~
  RENAME TABLE tl TO s1, ..., tn TO sn

For this type of statements tables info is formed, roughly speaking, of tables
standing left to "TO", i.e. t1, ..., tn. But including all t1, ..., tn is not
always correct. Consider renaming t1 <-> t2:

  RENAME TABLE t1 TO tmp, t2 TO t1, tmp TO t2

Setting tables_info = ( t1, t2, tmp } is not good: mysqlbinlog being executed
with do-tables = { t1, t2 } will filter this statement away because tmp does
not belong to do-tables. So before adding ti to the tables_info we have to
check that ti != sj for all j < i. See tables_info in the example of Section 2.

5. TODO: Is Tables Info Enough?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Assume we run mysqlbinlog with ignore_tables = { t1 } for the following
sequence of statements:

(1)   CREATE TABLE t1 (...);
      ...
(2)   CREATE TABLE t2 LIKE t1;
      ...
(3)   INSERT INTO t2 VALUES ...;
      ...
  
In this case:
  (1) will be filtered away - directly stated by the ignore rules;
  (2) will be filtered away - can not be executed because refers to t1
  (3) will be left          - though can not be executed

It is not good to have erroneous statements in mysqlbinlog output.
We can overcome this in two ways:

1. To explain to the user that in those cases, to avoid erroneous
statements, he/she should add t2 to the ignore_tables list as well.
 
2. To make the filtering bit clever. E.g., additionally to tables info,
we may extend the Query event format with a Statement type field
(BTW, see also WL#41 "Add a mysqlbinlog option to filter certain kinds
of statements"). In this case, when filtering, if we reject a "CREATE
TABLE t ..." statement because its LIKE or SELECT part refers to an
ignored table, we add table t to the ignore_tables list ourself.

6. TODO: How Renames are to be filtered?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider the RENAME TABLE statement. If we have do_tables = { t } and
t is renamed to s:

(1)   RENAME TABLE t TO s;
      <smth for s>

what the filter should do? Should we consider t and s to be the same
table with different names so that <smth for s> is not to be filtered
away (in this case during filtering process we should add s to the
do_tables list - like we do it in the case of CREATE TABLE)? Or should
we treat do- ignore-table rules as talking about *table names* so that
--do-table=t means "do only table with name t" - then we filter
<smth for s> away because it refers to the table with another name s?
Note that in the last case if we rename s back to t:

(2)   RENAME TABLE s TO t;
      <smth for t>
      
the RENAME statement will be filtered away (it renames a table which
does not belong to the do_tables list) while <smth for t> will pass
the filter and its subsequent execution will result with error (we
can resolve this by creating a rename_list during the filtering process
to trace all renamings specified by the processed events - similar
to what is described in Section 4.2).

Note. With the second treatment of renaming, we can do the following
(looks like attractive but not sure is usefull :)

    <smth for t>
    RENAME TABLE t TO s
    <smth for s that can be filtered away>
    RENAME TABLE s TO t
    <smth for t>
    
7. Other TODO's
~~~~~~~~~~~~~~~
* Continue with forming tables_info for different types of query.
* Filtering algorithm and *-table options for mysqlbinlog.
  Should it be "clever" or "stupid" (see Sections 5 & 6) ?
* Add Statement type field to the Query event format (?)
  Needed for "clever' filtering and for WL#41
* Other stuff (testing, documenting, etc)




Follow ups