← Back to team overview

maria-developers team mailing list archive

Re: Why local modifications to PCRE library?

 



On 11/16/2013 09:11 PM, Pavel Ivanov wrote:
On Sat, Nov 16, 2013 at 2:39 AM, Alexander Barkov <bar@xxxxxxxxxxx> wrote:
I see you've included the full source of PCRE library into MariaDB
tree and even made some local modifications which are not included
into PCRE upstream. Could you please explain why did you do that? Why
not allow to build with system standard PCRE library?

I understand that you could have needed some modifications and thus
could want to include sources from PCRE trunk to allow users to build
with modifications when their system-wide library is too old. But as I
don't see your modifications on PCRE trunk that means you probably had
something different in mind, and decided to incur the pain of
constantly updating the included sources and merging security fixes.
So why did you decide to do that?


There is a bug in the PCRE library. It crashes in the function
compile_regex() because of the stack overrun in case when the pattern
consists of a deep enough parenthesizes level like this:

SELECT 'x' RLIKE CONCAT(REPEAT('(',300), 'x', REPEAT(')',300));

I reported the problem to Philip Hazel (the author and the main maintainer
of PCRE). Philip already made a fix in the PCRE sources
to address this issue.

The Philip's fix is different comparing to our version, this is
probably why you did not recognize it in the PCRE's trunk.

Our fix makes compile_regex() watch the available stack size through
a callback function and stop the recursion when the execution is near
to run out of stack. It works very well with all possible values of
the thread_stack MariaDB system variable, which is quite small by
default (256 Kb).

The Philip's version of the fix limits recursion depth to 250,
which should be enough for any reasonable regular expression,
and should *hopefully* not run out of stack.
This fix will be included into the next release PCRE-8.34,
which is planned around Christmas time.

After the fixed version of PCRE is released we'll check if
it solves the problem on all supported platforms with the
minimum possible value of the MariaDB system variable "thread_stack".
In case it does, we'll add an option to compile MariaDB against
the system installed PCRE library instead of the bundled version.
Otherwise, we'll report to Philip again asking again to consider
including our fix.

That makes sense. And I saw the commit in PCRE trunk
http://vcs.pcre.org/viewvc?view=revision&revision=1389 and thought it
is related. But I wonder why you don't take that change now, apply it
to the local PCRE sources and test if it works for MariaDB without
waiting for 8.34 release? If it works you can include sources with
that change to avoid anybody's confusion, if it doesn't work Philip
will be able to fix that before 8.34 release. Did you consider doing
this?

Sorry, I did not. Being busy with something else now.
The bundled PCRE version works fine, and we don't have any
hurry to upgrade to a newer version.

I'm not sure why bundling PCRE with some our fixes should
be confusing. We bundle many libraries with our own fixes.
Before the 10.0.5 release we've bundled the Henry Spenser's regex library with our own modifications for many many years.
No one was ever confused about that :)
So why do you think it's confusing?



But besides the fix for deep parenthesis there's another difference.
pcreposix.h in MariaDB has macros REG_ATOI and REG_ITOA. They are used
in MariaDB and they are not in PCRE 8.33 and not in PCRE trunk. Why
these macros were included here and not e.g. in my_global.h or some
more narrow-scoped header?

This particular change can be removed. It's not really needed anymore.
I'll cleanup this at the same time with upgrading to PCRE-8.34 and
adding support for an external PCRE library.
Just created a task for this:

https://mariadb.atlassian.net/browse/MDEV-5304



Pavel



Follow ups

References