← Back to team overview

maria-discuss team mailing list archive

Re: MariaDB Server 10.3 notes


Hi, Justin!

Very good questions, thanks!
Some answers below:

On Oct 12, Justin Swanhart wrote:
> > * InnoDB: InnoDB native partitioning - so MySQL 8 InnoDB? But Monty
> > says there's next to no changes in InnoDB 8...  Instant add column.
> > New InnoDB deadlock detection (8.0). New INFORMATION_SCHEMA table
> > (8.0). Dedicated tablespace for temporary tables (in 5.7 and merged,
> > check). Lock wait policy (contribution)
> Monty is *notorious* for low balling estimates.  His famous phrase is "it
> is trivial".  Everybody knows that if Monty says it is trivial, you can add
> 10x the work to get it done, or more.

Yes, I tend to agree with that. But Monty estimates are not that far off
when applied to *Monty*. They're often too low when applied to others.


> includes transportable tablespaces for partitioned tables, and will likely
> support native partitioning by the time 8 rolls.  Native partitioning also
> entails implementing the changes to the SE.  See:
> http://mysqlserverteam.com/innodb-native-partitioning-early-access/

InnoDB native partitioning is in 5.7.6:

Is it better than upper-layer partitioning? Why?

> > * more for window functions (user defined window functions, MDEV-10855)
> What other databases have user definable window functions?  Will the syntax
> be similar?  I can't really think of a reason to need custom window
> functions personally, as the existing set is very comprehensive.  What kind
> of custom window function do you think people would need to write?  In most
> implementations any aggregate functions can be used in a window frame
> context, thus any aggregate SQL functions should satisfy requirements for
> user defined window funcs.

Yes, that was a confusing description.
What it really means - we will have aggregate SQL functions (MDEV-7773)
and preferrably, it should be possible to make them window-aware.

Using an aggregate function as a window function, means applying it to
every possible position of a sliding frame, and it has O(N*n) complexity
(where N is total number of rows, n is the number of rows in the frame).
If the aggregate function can remove rows from a group, it will only be O(N).

> > * socket authentication
> Can you explain this.  Is there an MDEV?

That's using unix_socket authentication by default for the root user.
It improves security and maintainability, but is not really new - Debian
is already doing that for months. This change could be confusing and
break exising user scripts, so it needs to be done with care.

> > * X Protocol/Document store - only if people have time or money will
> > it be done
> If you want people to pay for it, perhaps you should implement it in
> MaxScale instead of the server :)

Interesting idea :)

But the MariaDB Meetup is a community event organized by the MariaDB
Foundation. This particular session was about MariaDB Server planning.
MariaDB Foundation has nothing to do with MaxScale, so we could not have
planned anything for it.

> >* Peter asks if there is any plans to support other languages like V8?
> PLEASE PLEASE PLEASE PLEASE IMPLEMENT WL-820.  External stored procs and
> table functions.  Been sitting there for the taking for a long time.
> Antony has tried to get you to get it into the server, but alas, it has
> never happened, and though there continues to be wide requests for this
> feature from the community, they fall on deaf ears.

Frankly speaking, I'd love it. But this feature never got enough
priority, not in MySQL times nor in MariaDB. Unfortunately.

Perhaps we'll be able to sneak it into 10.3, but no promises here.

> > * Compressed binary log (from Tencent)
> Compressing the binary log saves on space on the master, but it makes
> seeking into binary logs much more difficult, and searching backwards
> through them becomes much more difficult (which has implications for query
> 'rewind')

Tencent compresses individual events (think Compressed_query_log_event
type). So seeking works as before, events are not decompressed on
reading, they are sent compressed to slaves, etc. But the compression
ratio is worse, of course.

> > * Fix the XA transaction bug ( MySQL has fixed it already ) - MDEV-7974
> Please actually complete XA support, don't just half fix it like Oracle
> did.  Add full support for XA SUSPEND and XA RESUME and allow more than one
> thread to participate in a distributed transaction in the server.

Right. Still, MDEV-7974 is an important step, we cannot have proper XA
without it. XA SUSPEND and XA RESUME should be the next one, I agree.

> > * Indexes on expressions (this is part of virtual columns, will it not
> > go into 10.2? Check with Serg)
> Indexes on expressions requires parser support.
> create index expr_idx on some_table(a + b);
> select * from some_table where a+b > 30 and a+b <= 50 -- (uses range over
> expr_idx)

Right, but the parser support is the least of my worries. I don't want
to low ball estmates :) but the *parser* can be fixed in a few hours.
The most tricky part will be to fix the optimizer.

> > * Flashback DDL MDEV-10571 (flashback DML will come in 10.2). It only
> > works with row based replication. Talk about what to name it. There's
> > already a MySQL time machine on github. Many like the name Rewind (but
> > not Monty). Let's do a poll on the mailing list
> Okay, Flashback query is VERY complicated. A flashback query is a
> materialized view.  There are two ways generally to achieve flashback:
> a) materialize the query as is, and instead of rolling the query forward
> incrementally, you roll it backwards.  Flexviews achieves this by computing
> the query as it is now, then computing the delta from a prior point in time
> until the transaction in which that computation happened. Then the delta is
> played back against the query, but the MONUS of each operation is applied
> which changes insertions to deletions and vice versa.  This still presents
> a problem for OUTER JOIN.  No documented asynchronous refresh algorithms
> exist for outer join that I'm aware of, that would work with the concept of
> reading row history from serialized changes.  Flexviews uses the "rolling
> join propagation" algorithm, which only works with inner joins.

Interesting... I'll continue reading on that, thanks.
But see below

> b) provide a point-in-time snapshot of every table used in the query at the
> desired point in time, and run the query over those tables (this is a
> synchronous mechanism which supports OUTER JOIN).  The problem here is how
> to provide such a table.  The most straight-forward way is to copy the
> table, and then do the backwards replay for each (ideally in parallel) but
> this is obviously undesirable if the query is "select * from some_big_table
> join another_big_table on (...)" because you have to fully copy each table
> before you can undo changes.  Otherwise you need to have the SE display old
> versions, just like it does for MVCC, but it has to display versions from
> binary logs, not undo logs, and this requires a lot of SE and engine
> changes!

This is about correct. There are different applications for "flashback"
with different use cases and different implementation trade-offs.

One is, for example, to see historical sales numbers, for different
months or years. That is, basically, for some kind of data analytics.
Lots of SELECT queries, ad-hoc queries, at different historical time

Another one is "damn, I've made a typo in a WHERE clause and updated too
much". In this case one doesn't need joins or materialized views, one
needs to see the data before the erroneous statement. This is not used

For the second use case one can afford to copy tables and
backward-replay the binary log. For the first use case one would need a
completely different approach, I agree. May be something like what
Flexviews does.

> If you are going to have generic flashback, you might as well commit to
> incrementally refreshable materialized views.

Right, when we'll have materialized views, than we could think
about incrementally updating them using rolling join propagation.

> > * Additional GIS functions to stay compatible. Also it would be good
> > to have a standalone GIS library (Georg suggests; wlad isn't too happy
> > with the suggestion). Georg suggests that calculations should use the
> > reference systems (Unflatten the world - MariaDB Server GIS the world
> > looks flat)
> GIS functions should use gdal, just like postgresql does.  Then you can
> also add support for rasters.  Here is an example of the awesomeness of
> postgresql and postgis.  It uses SQL aggregate functions, table functions,
> CTE, GIS raster functions, sequences, etc:
> https://github.com/greenlion/osmvox/blob/master/postgresql/combined_schema.sql

May be. On the other hand, we're more precise that postgis, because our
implementation uses fixed-point math, not doubles.
And let's not forget that MySQL uses Boost::Geometry.

> > * Query rewriting - MDEV-5561
> I would like you to provide a SQL->DOM function call instead of just
> providing a DOM to plugins.  This function could be exposed as a regular
> item function as well so that anything in the server can parse SQL. I
> suggest you implement my SQL "shim" interface and just provide some way to
> get a easy to iterate over parsed data structure from the SQL, such as a
> nested JSON array of objects.  A nested array is generated by
> PHP-SQL-Parser and would provide a good template for such a JSON object.
> https://github.com/greenlion/PHP-SQL-Parser

We actually have an MDEV for that :)

> > * With MyRocks coming, should we drop TokuDB (and maybe even deprecate
> > in 10.2?) - bugs that MariaDB Corporation reports to Percona don't
> > seem to get fixed. Peter says that bugs are fixed for customers... and
> > there is ongoing development to make it better
> It is certainly disheartening that Percona isn't responsive to MariaDB
> bugs, but I'm sure you understand that it is hard for a competitor to fix
> bugs for another competitor.  MariaDB maintains a forked version of
> TokuDB.  It isn't fair to expect the upstream vendor to fix bugs that your
> customers are paying you to support, does it?  Perhaps you should pay a
> percentage of your support fees for TokuDB issues to Percona, or come to
> some other support agreement.  Perhaps YOU should make the bug fixes and
> submit them to Percona.

And we do, look for bug reports I've reported on TokuDB to Percona -
with patches :)

But, really, Percona *is* fixing TokuDB bugs, it's just that they did a
bit of refactoring after getting TokuDB and it took time for it to
stabilize. And MariaDB Server has a vanilla TokuDB with almost no
changes, we have no plans to fork it.

> > * ORDER BY LIMIT optimizer bugs (MDEV-8306)
> Oh god, this is a downward spiral every time somebody touches it.
> Fixing it correctly requires rewriting the parser, something that
> Oracle is undertaking.

This didn't have much to do with the parser, and we've refactored that
part of the parser in 10.2, so now this has nothing to do with the
parser, it's purely the optimizer issue.

> What about other new MySQL 8 features?  Are you getting rid of the .FRM
> nightmare?  Are you going to support SET PERSIST?  etc?

FRM was made purely optional in 10.0 - every storage engine decides for
itself whether it uses FRMs or not. May be InnoDB will use FRMs in 10.3,
may be it will not. MyISAM most probably will continue use them.

SET PERSIST - and this is my personal opinion - this needs to be thought
over *very* carefully. There have been quite a few security
vulnerabilities with my.cnf stored in the datadir, and that's why since
2005 the server no longer reads my.cnf in the datadir. But mysql_safe
still does - and there have been new security vulnerabilities *last
month*, caused by my.cnf in the datadir. And finally both MySQL (in
5.7?) and MariaDB (in 10.2) stopped reading my.cnf in the datadir for
real.  So, having my security@xxxxxxxxxxx hat on, I look at SET PERSIST
with a lot of suspicion, because it's nothing else than another attempt
of storing server configuration information in the datadir and have it
writable by the server itself.

Chief Architect MariaDB
and security@xxxxxxxxxxx