maria-developers team mailing list archive

Re: GSoC:Regex Project


Hi Tamás,

On 04/17/2013 09:53 PM, Tamás Kövesdan wrote:

I'm interested in this project. I have read the other thread about
this. I've checked the available regex implementations and I've
written a draft application. Is there any potential mentor who would
review my application and help to improve it?

Thanks for your interest in this project!

Please feel free to send your application to me.

I'll try to write a detailed list of requirements for this
task soon. Here is a brief list:

1. Ideally, after replacing the regex library,
the regular expression functions should be able to:

a. Work with all MariaDB character sets:
- 8-bit
- Unicode: utf8, utf16, utf16le, utf32
- Asian multi-byte: sjis, cp932, ujis, eucjpms, gbk, gb2312, euckr

b. Follow the comparison rules defined by the current MariaDB
collation, i.e. take into account things like case and accent
sensitivity:

SELECT 'o' RLIKE '<o with diaeresis>' COLLATE utf8_unicode_ci   -> TRUE
SELECT 'o' RLIKE '<o with diaeresis>' COLLATE utf8_hungarian_ci -> FALSE

c. Support modern regex features like look-aheads/look-behinds,
non-greedy modifiers, maybe even recursion, etc.

2. The library should be distributed under some permissive license
(e.g. LGPL, BSD, MIT). GPL is not desirable.
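
To make items 1b and 1c a bit more concrete, here is a rough sketch in
Python (not MariaDB code). It uses Python's standard "re" module only as a
stand-in for whatever library ends up being chosen. The helper names
strip_accents and rlike are hypothetical, and the accent folding is a crude
approximation of an accent-insensitive collation, not how MariaDB
collations actually work. Note that Python's "re" has no recursion support,
so that feature is not shown.

```python
import re
import unicodedata


def strip_accents(text):
    """Drop combining marks after NFD decomposition.

    Assumption: this is only a crude stand-in for an accent-insensitive
    collation; real collations use per-language weight tables.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rlike(subject, pattern, accent_sensitive):
    """Hypothetical RLIKE-like helper: case-insensitive regex search,
    with optional accent folding applied to both operands."""
    if not accent_sensitive:
        subject = strip_accents(subject)
        pattern = strip_accents(pattern)
    return re.search(pattern, subject, re.IGNORECASE) is not None


# Item 1b: the same comparison gives different answers depending on
# whether the (simulated) collation distinguishes accents.
O_DIAERESIS = "\u00f6"  # o with diaeresis
accent_insensitive_match = rlike("o", O_DIAERESIS, accent_sensitive=False)
accent_sensitive_match = rlike("o", O_DIAERESIS, accent_sensitive=True)

# Item 1c: look-ahead matches "foo" only when "bar" follows it.
look_ahead = re.findall(r"foo(?=bar)", "foobar foobaz")

# Item 1c: look-behind matches digits only when "$" precedes them.
look_behind = re.findall(r"(?<=\$)\d+", "cost $42, id 7")

# Item 1c: the non-greedy ".+?" stops at the first ">", while the
# greedy ".+" runs to the last one.
non_greedy = re.findall(r"<.+?>", "<a><b>")
greedy = re.findall(r"<.+>", "<a><b>")
```

The accent_sensitive flag only approximates the utf8_unicode_ci versus
utf8_hungarian_ci difference shown above: a real implementation would
have to consult the collation's own comparison rules rather than strip
combining marks.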


Thanks in advance.

Best wishes,
Tamás Kövesdán
