maria-discuss team mailing list archive

Thread
Date

Re: MariaDB Server 10.3 notes

To: Peter Laursen <peter_laursen@xxxxxxxxxx>
From: Andrew Hutchings <andrew@xxxxxxxxxxxxxxx>
Date: Thu, 13 Oct 2016 17:53:44 +0100
Cc: Maria Discuss <maria-discuss@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CAFa9YRBTna2Cb=nFG-mC52MFaDeToB3pK7Pj-xAStsorUOotFg@mail.gmail.com>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0

Well, nothing should be using Python 2.x any more, period. But that isdiscussion for another time :)

I get where you are coming from though. I can see both sides of theargument. Personally I prefer to have that logic closer to theapplication layer anyway. Hence my suggestion of doing it in a proxy.


Kind Regards
Andrew

On 13/10/16 16:38, Peter Laursen wrote:

Personally I think (as told a few times before when this discussion
arrived in this mailing list) that such feature should not be an
*external language* as it should not IMO use what is installed
independentlyof MysqL/MariaDB on the system. This lanugage intepreters
should ship with the server  - possible as an optional plugin.  This in
order to ensure portability.  If 2 systems use different Python versions
for instance issues are likely when porting from one to another.


-- Peter

On Thu, Oct 13, 2016 at 5:26 PM, Andrew Hutchings
<andrew@xxxxxxxxxxxxxxx <mailto:andrew@xxxxxxxxxxxxxxx>> wrote:

    Whilst I agree it would be a nice feature and something I've heard
    for years, the implementation would likely be more complex than it
    seems.

    For starters you have to consider the possibility of a procedure
    crashing and taking the whole daemon with it. You would likely need
    to fork a small worker process pool and have some kind of shared
    memory or socket communications for safety. In addition the
    implementation would have to be extremely careful not to add any
    potential security hole due to a zero-day in PHP or some bad input
    filtering for example.

    I'm not saying it is impossible, but it will likely be a lot of work
    to get right and the APIs would need to be carefully thought out.

    So the question becomes: is it worth spending time developing this
    over another feature? Or is it something that could be better
    implemented safely in another layer, such as in a database proxy?

    Kind Regards
    Andrew

    On 13/10/16 07:04, Federico Razzoli wrote:

        Hi all,

        So basically everyone would love, love, love to have external
        languages for stored procedures, but no one is working on it...
        so bad. Please consider something:
        1- Some features could be implemented as stored procedures, it's
        much easier. This has been done in the past (Flexviews,
        Securich...) but SQL is too limited.
        2- I am sure that a lot of people would implement procedures
        libraries if they could use something like JavaScript or PHP. If
        we could use Python, stuff like NumPy and SciPy could be used.
        3- SQL limitations could be lifted (namespaces for global
        objects, arrays, argv, cursors based on a prepared statement...)
        but will you ever do it? Probably not. But if we have external
        languages, who cares about procedural SQL limitations.

        I believe that next releases features are selected based on
        their cost (not only that of course). But please consider points
        2 and 3, and try to estimate the cost of features that the
        community can develop for you, plus the cost of features that
        won't be really needed anymore if external languages are
        available (arrays in SQL).

        Cheers
        Federico




        --------------------------------------------
        Mer 12/10/16, Sergei Golubchik <serg@xxxxxxxxxxx
        <mailto:serg@xxxxxxxxxxx>> ha scritto:

         Oggetto: Re: [Maria-discuss] MariaDB Server 10.3 notes
         A: "Justin Swanhart" <greenlion@xxxxxxxxx
        <mailto:greenlion@xxxxxxxxx>>
         Cc: "Maria Discuss" <maria-discuss@xxxxxxxxxxxxxxxxxxx
        <mailto:maria-discuss@xxxxxxxxxxxxxxxxxxx>>
         Data: Mercoledì 12 ottobre 2016, 14:15

         Hi, Justin!

         Very good questions,
         thanks!
         Some answers below:

         On Oct 12, Justin Swanhart
         wrote:
         >
         > > *
         InnoDB: InnoDB native partitioning - so MySQL 8 InnoDB? But
         Monty
         > > says there's next to no
         changes in InnoDB 8...  Instant add column.
         > > New InnoDB deadlock detection (8.0).
         New INFORMATION_SCHEMA table
         > >
         (8.0). Dedicated tablespace for temporary tables (in 5.7 and
         merged,
         > > check). Lock wait policy
         (contribution)
         >
         >
         Monty is *notorious* for low balling estimates.  His famous
         phrase is "it
         > is trivial".
         Everybody knows that if Monty says it is trivial, you can
         add
         > 10x the work to get it done, or
         more.

         Yes, I tend to agree
         with that. But Monty estimates are not that far off
         when applied to *Monty*. They're often too
         low when applied to others.

         https://www.google.de/search?q=10x+developer
        <https://www.google.de/search?q=10x+developer>

         > includes transportable
         tablespaces for partitioned tables, and will likely
         > support native partitioning by the time 8
         rolls.  Native partitioning also
         >
         entails implementing the changes to the SE.  See:
         >
        http://mysqlserverteam.com/innodb-native-partitioning-early-access/
        <http://mysqlserverteam.com/innodb-native-partitioning-early-access/>

         InnoDB native partitioning is
         in 5.7.6:
         https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-6.html
        <https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-6.html>

         Is it better than upper-layer
         partitioning? Why?

         >
         > * more for window functions (user defined window
         functions, MDEV-10855)
         > What other
         databases have user definable window functions?  Will the
         syntax
         > be similar?  I can't really
         think of a reason to need custom window
         >
         functions personally, as the existing set is very
         comprehensive.  What kind
         > of custom
         window function do you think people would need to write?
         In most
         > implementations any aggregate
         functions can be used in a window frame
         >
         context, thus any aggregate SQL functions should satisfy
         requirements for
         > user defined window
         funcs.

         Yes, that was a
         confusing description.
         What it really means
         - we will have aggregate SQL functions (MDEV-7773)
         and preferrably, it should be possible to make
         them window-aware.

         Using an
         aggregate function as a window function, means applying it
         to
         every possible position of a sliding
         frame, and it has O(N*n) complexity
         (where N

         is total number of rows, n is the number of rows in the
         frame).
         If the aggregate function can remove
         rows from a group, it will only be O(N).

         > > * socket authentication
         > Can you explain this.  Is there an
         MDEV?

         That's using
         unix_socket authentication by default for the root user.
         It improves security and maintainability, but
         is not really new - Debian
         is already doing
         that for months. This change could be confusing and
         break exising user scripts, so it needs to be
         done with care.

         > > *
         X Protocol/Document store - only if people have time or
         money will
         > > it be done
         > If you want people to pay for it, perhaps
         you should implement it in
         > MaxScale
         instead of the server :)

         Interesting idea :)

         But the MariaDB Meetup is a community event
         organized by the MariaDB
         Foundation. This
         particular session was about MariaDB Server planning.
         MariaDB Foundation has nothing to do with
         MaxScale, so we could not have
         planned
         anything for it.

         > >*
         Peter asks if there is any plans to support other languages
         like V8?
         > PLEASE PLEASE PLEASE PLEASE
         IMPLEMENT WL-820.  External stored procs and
         > table functions.  Been sitting there for
         the taking for a long time.
         > Antony has
         tried to get you to get it into the server, but alas, it
         has
         > never happened, and though there
         continues to be wide requests for this
         >
         feature from the community, they fall on deaf ears.

         Frankly speaking, I'd love
         it. But this feature never got enough
         priority, not in MySQL times nor in MariaDB.
         Unfortunately.

         Perhaps
         we'll be able to sneak it into 10.3, but no promises
         here.

         > > *
         Compressed binary log (from Tencent)
         >
         Compressing the binary log saves on space on the master, but
         it makes
         > seeking into binary logs much
         more difficult, and searching backwards
         >
         through them becomes much more difficult (which has
         implications for query
         >
         'rewind')

         Tencent
         compresses individual events (think
         Compressed_query_log_event
         type). So seeking
         works as before, events are not decompressed on
         reading, they are sent compressed to slaves,
         etc. But the compression
         ratio is worse, of
         course.

         > > * Fix the
         XA transaction bug ( MySQL has fixed it already ) -
         MDEV-7974
         > Please actually complete XA
         support, don't just half fix it like Oracle
         > did.  Add full support for XA SUSPEND and
         XA RESUME and allow more than one
         >
         thread to participate in a distributed transaction in the
         server.

         Right. Still,
         MDEV-7974 is an important step, we cannot have proper XA
         without it. XA SUSPEND and XA RESUME should be
         the next one, I agree.

         >
         > * Indexes on expressions (this is part of virtual
         columns, will it not
         > > go into 10.2?
         Check with Serg)
         > Indexes on expressions
         requires parser support.
         > create index
         expr_idx on some_table(a + b);
         > select *
         from some_table where a+b > 30 and a+b <= 50 -- (uses
         range over
         > expr_idx)

         Right, but the parser support
         is the least of my worries. I don't want
         to low ball estmates :) but the *parser* can be
         fixed in a few hours.
         The most tricky part
         will be to fix the optimizer.

         > > * Flashback DDL MDEV-10571 (flashback
         DML will come in 10.2). It only
         > >
         works with row based replication. Talk about what to name
         it. There's
         > > already a MySQL
         time machine on github. Many like the name Rewind (but
         > > not Monty). Let's do a poll on
         the mailing list
         > Okay, Flashback query
         is VERY complicated. A flashback query is a
         > materialized view.  There are two ways
         generally to achieve flashback:
         > a)
         materialize the query as is, and instead of rolling the
         query forward
         > incrementally, you roll
         it backwards.  Flexviews achieves this by computing
         > the query as it is now, then computing the
         delta from a prior point in time
         > until
         the transaction in which that computation happened. Then the
         delta is
         > played back against the query,
         but the MONUS of each operation is applied
         > which changes insertions to deletions and
         vice versa.  This still presents
         > a
         problem for OUTER JOIN.  No documented asynchronous refresh
         algorithms
         > exist for outer join that
         I'm aware of, that would work with the concept of
         > reading row history from serialized
         changes.  Flexviews uses the "rolling
         > join propagation" algorithm, which
         only works with inner joins.

         Interesting... I'll continue reading on
         that, thanks.
         But see below

         > b) provide a
         point-in-time snapshot of every table used in the query at
         the
         > desired point in time, and run the
         query over those tables (this is a
         >
         synchronous mechanism which supports OUTER JOIN).  The
         problem here is how
         > to provide such a
         table.  The most straight-forward way is to copy the
         > table, and then do the backwards replay
         for each (ideally in parallel) but
         > this
         is obviously undesirable if the query is "select * from
         some_big_table
         > join another_big_table
         on (...)" because you have to fully copy each table
         > before you can undo changes.  Otherwise
         you need to have the SE display old
         >
         versions, just like it does for MVCC, but it has to display
         versions from
         > binary logs, not undo
         logs, and this requires a lot of SE and engine
         > changes!

         This is about correct. There are different
         applications for "flashback"
         with
         different use cases and different implementation
         trade-offs.

         One is, for
         example, to see historical sales numbers, for different
         months or years. That is, basically, for some
         kind of data analytics.
         Lots of SELECT
         queries, ad-hoc queries, at different historical time
         points.

         Another one is "damn, I've made a typo
         in a WHERE clause and updated too
         much". In this case one doesn't need
         joins or materialized views, one
         needs to
         see the data before the erroneous statement. This is not
         used
         often.

         For the second use case one can afford to copy
         tables and
         backward-replay the binary log.
         For the first use case one would need a
         completely different approach, I agree. May be
         something like what
         Flexviews does.

         > If you are going to have
         generic flashback, you might as well commit to
         > incrementally refreshable materialized
         views.

         Right, when
         we'll have materialized views, than we could think
         about incrementally updating them using rolling
         join propagation.

         > >
         * Additional GIS functions to stay compatible. Also it would
         be good
         > > to have a standalone GIS
         library (Georg suggests; wlad isn't too happy
         > > with the suggestion). Georg suggests
         that calculations should use the
         > >
         reference systems (Unflatten the world - MariaDB Server GIS
         the world
         > > looks flat)
         >
         > GIS functions should
         use gdal, just like postgresql does.  Then you can
         > also add support for rasters.  Here is an
         example of the awesomeness of
         >
         postgresql and postgis.  It uses SQL aggregate functions,
         table functions,
         > CTE, GIS raster
         functions, sequences, etc:
         >
        https://github.com/greenlion/osmvox/blob/master/postgresql/combined_schema.sql
        <https://github.com/greenlion/osmvox/blob/master/postgresql/combined_schema.sql>

         May be. On the other hand,
         we're more precise that postgis, because our
         implementation uses fixed-point math, not
         doubles.
         And let's not forget that MySQL
         uses Boost::Geometry.

         >
         > * Query rewriting - MDEV-5561
         > I
         would like you to provide a SQL->DOM function call
         instead of just
         > providing a DOM to
         plugins.  This function could be exposed as a regular
         > item function as well so that anything in
         the server can parse SQL. I
         > suggest you
         implement my SQL "shim" interface and just provide
         some way to
         > get a easy to iterate over
         parsed data structure from the SQL, such as a
         > nested JSON array of objects.  A nested
         array is generated by
         > PHP-SQL-Parser
         and would provide a good template for such a JSON object.
         > https://github.com/greenlion/PHP-SQL-Parser
        <https://github.com/greenlion/PHP-SQL-Parser>

         We actually have an MDEV for
         that :)

         > > * With
         MyRocks coming, should we drop TokuDB (and maybe even
         deprecate
         > > in 10.2?) - bugs that
         MariaDB Corporation reports to Percona don't
         > > seem to get fixed. Peter says that
         bugs are fixed for customers... and
         >
         > there is ongoing development to make it better
         > It is certainly disheartening that Percona
         isn't responsive to MariaDB
         > bugs,
         but I'm sure you understand that it is hard for a
         competitor to fix
         > bugs for another
         competitor.  MariaDB maintains a forked version of
         > TokuDB.  It isn't fair to expect the
         upstream vendor to fix bugs that your
         >
         customers are paying you to support, does it?  Perhaps you
         should pay a
         > percentage of your support
         fees for TokuDB issues to Percona, or come to
         > some other support agreement.  Perhaps
         YOU should make the bug fixes and
         >
         submit them to Percona.

         And
         we do, look for bug reports I've reported on TokuDB to
         Percona -
         with patches :)

         But, really, Percona *is*
         fixing TokuDB bugs, it's just that they did a
         bit of refactoring after getting TokuDB and it
         took time for it to
         stabilize. And MariaDB
         Server has a vanilla TokuDB with almost no
         changes, we have no plans to fork it.

         > > * ORDER BY LIMIT
         optimizer bugs (MDEV-8306)
         > Oh god, this
         is a downward spiral every time somebody touches it.
         > Fixing it correctly requires rewriting the
         parser, something that
         > Oracle is
         undertaking.

         This
         didn't have much to do with the parser, and we've
         refactored that
         part of the parser in 10.2,
         so now this has nothing to do with the
         parser, it's purely the optimizer issue.

         > What about other new
         MySQL 8 features?  Are you getting rid of the .FRM
         > nightmare?  Are you going to support SET
         PERSIST?  etc?

         FRM was
         made purely optional in 10.0 - every storage engine decides
         for
         itself whether it uses FRMs or not. May
         be InnoDB will use FRMs in 10.3,
         may be it
         will not. MyISAM most probably will continue use them.

         SET PERSIST - and this is my
         personal opinion - this needs to be thought
         over *very* carefully. There have been quite a
         few security
         vulnerabilities with my.cnf
         stored in the datadir, and that's why since
         2005 the server no longer reads my.cnf in the
         datadir. But mysql_safe
         still does - and
         there have been new security vulnerabilities *last
         month*, caused by my.cnf in the datadir. And
         finally both MySQL (in
         5.7?) and MariaDB (in
         10.2) stopped reading my.cnf in the datadir for
         real.  So, having my security@xxxxxxxxxxx
        <mailto:security@xxxxxxxxxxx>
         hat on, I look at SET PERSIST
         with a lot of
         suspicion, because it's nothing else than another
         attempt
         of storing server configuration
         information in the datadir and have it
         writable by the server itself.

         Regards,
         Sergei
         Chief Architect
         MariaDB
         and security@xxxxxxxxxxx <mailto:security@xxxxxxxxxxx>

         _______________________________________________
         Mailing list: https://launchpad.net/~maria-discuss
        <https://launchpad.net/~maria-discuss>
         Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
        <mailto:maria-discuss@xxxxxxxxxxxxxxxxxxx>
         Unsubscribe : https://launchpad.net/~maria-discuss
        <https://launchpad.net/~maria-discuss>
         More help   : https://help.launchpad.net/ListHelp
        <https://help.launchpad.net/ListHelp>


        _______________________________________________
        Mailing list: https://launchpad.net/~maria-discuss
        <https://launchpad.net/~maria-discuss>
        Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
        <mailto:maria-discuss@xxxxxxxxxxxxxxxxxxx>
        Unsubscribe : https://launchpad.net/~maria-discuss
        <https://launchpad.net/~maria-discuss>
        More help   : https://help.launchpad.net/ListHelp
        <https://help.launchpad.net/ListHelp>


    --
    Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/


    _______________________________________________
    Mailing list: https://launchpad.net/~maria-discuss
    <https://launchpad.net/~maria-discuss>
    Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
    <mailto:maria-discuss@xxxxxxxxxxxxxxxxxxx>
    Unsubscribe : https://launchpad.net/~maria-discuss
    <https://launchpad.net/~maria-discuss>
    More help   : https://help.launchpad.net/ListHelp
    <https://help.launchpad.net/ListHelp>


--
Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/

References

Re: MariaDB Server 10.3 notes
From: Federico Razzoli, 2016-10-13
Re: MariaDB Server 10.3 notes
From: Andrew Hutchings, 2016-10-13
Re: MariaDB Server 10.3 notes
From: Peter Laursen, 2016-10-13