maria-developers team mailing list archive
Message #07184
Re: mdev6027 RLIKE: "." no longer matching new line (default_regex_flags)
Hi Jan, Sergei,
On 04/23/2014 11:25 AM, Sergei Golubchik wrote:
Hi, Alexander!
On Apr 22, Sergei Golubchik wrote:
On Apr 17, Alexander Barkov wrote:
Hello Serg,
Please review a patch implementing a new system variable
default_regex_flags, to address the remaining incompatibilities
between PCRE and the old regex library.
Ah, something else.
Please, make sure this new variable is documented.
Yeah, I just finished writing a description for Jan :)
Regards,
Sergei
Jan, can you please update the manual?
A new system variable default_regex_flags was added,
to set the default behaviour of the PCRE regex engine.
Scope: global, session.
Affected functions and operators: RLIKE, REGEXP_SUBSTR, REGEXP_REPLACE.
Possible values: any combination of zero or more of the following
options, comma separated:
DOTALL
DUPNAMES
EXTENDED
EXTRA
MULTILINE
UNGREEDY
Default value: empty (all options are off).
Example:
SET default_regex_flags='';
SET default_regex_flags='DOTALL';
SET default_regex_flags='DOTALL,DUPNAMES,EXTENDED,EXTRA,MULTILINE,UNGREEDY';
The meaning of the values:
Value      Pattern equivalent  Meaning
---------  ------------------  -------
DOTALL     (?s)                . matches anything including NL
DUPNAMES   (?J)                Allow duplicate names for subpatterns
EXTENDED   (?x)                Ignore white space and # comments
EXTRA      (?X)                Extra features (e.g. error on unknown
                               escape character)
MULTILINE  (?m)                ^ and $ match newlines within data
UNGREEDY   (?U)                Invert greediness of quantifiers
See here for the list of the equivalent PCRE options:
https://mariadb.com/kb/en/pcre-regular-expressions/#option-setting
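The "Pattern equivalent" column of the table above can be read as a mapping from flag names to inline option letters. Here is a minimal Python sketch of that mapping. The helper name flags_to_inline_prefix is hypothetical and not part of MariaDB; it only illustrates the table and makes no claim about how the server applies the flags internally.

```python
# Mapping copied from the "Pattern equivalent" column of the table above.
FLAG_TO_LETTER = {
    "DOTALL": "s",
    "DUPNAMES": "J",
    "EXTENDED": "x",
    "EXTRA": "X",
    "MULTILINE": "m",
    "UNGREEDY": "U",
}


def flags_to_inline_prefix(value):
    """Return the inline option prefix for a default_regex_flags value.

    Hypothetical helper for illustration only; it is not a MariaDB API.
    """
    # Split the comma-separated value and drop empty items, so '' gives [].
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in FLAG_TO_LETTER]
    if unknown:
        raise ValueError("unknown flag(s): " + ", ".join(unknown))
    # dict.fromkeys removes duplicates while keeping the original order.
    letters = "".join(FLAG_TO_LETTER[name] for name in dict.fromkeys(names))
    return "(?" + letters + ")" if letters else ""
```

For example, flags_to_inline_prefix('DOTALL,MULTILINE') gives '(?sm)', so under that setting the pattern '^b$' would correspond to writing '(?sm)^b$' by hand.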
Examples:
# The default behaviour (multiline match is off)
mysql> SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     0 |
+-----------------------+
# Enabling the multiline option using the PCRE option syntax:
mysql> SELECT 'a\nb\nc' RLIKE '(?m)^b$';
+---------------------------+
| 'a\nb\nc' RLIKE '(?m)^b$' |
+---------------------------+
|                         1 |
+---------------------------+
# Enabling the multiline option using default_regex_flags:
mysql> SET default_regex_flags='MULTILINE';
mysql> SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     1 |
+-----------------------+
The goal of the new variable is to simplify writing PCRE patterns,
and to provide a way to configure the default behaviour of the PCRE
engine so that it is more compatible with the old regex engine used in
MariaDB-5.5 and MySQL.
Note that, unlike the old regex engine, PCRE does not match a
newline character with dot (.) by default. Those who need better
compatibility with the old regex engine might consider adding this
setting to /etc/my.cnf:
[mysqld]
default-regex-flags=DOTALL
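The effect of the DOTALL and MULTILINE options can also be seen outside the server. The sketch below uses Python's re module, which is a different engine from PCRE but handles the inline (?s) and (?m) options the same way for these simple patterns. It is only an analogy and does not test MariaDB itself.

```python
import re

text = "a\nb\nc"

# Without DOTALL, "." does not match a newline character.
dot_default = re.search(r"a.b", text) is not None
# With the inline (?s) option, "." matches a newline as well.
dot_dotall = re.search(r"(?s)a.b", text) is not None

# Without MULTILINE, ^ and $ anchor only at the ends of the whole string.
anchors_default = re.search(r"^b$", text) is not None
# With the inline (?m) option, ^ and $ also match at line boundaries.
anchors_multiline = re.search(r"(?m)^b$", text) is not None

print(dot_default, dot_dotall, anchors_default, anchors_multiline)
```

The two MULTILINE results correspond to the 0 and 1 returned by the RLIKE examples above.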
Thanks.