maria-developers team mailing list archive

Re: [Gsoc] Regex enhancements Project

 

Hi Kentoku,

Oniguruma is looking great. But I can't find out whether the following features
are implemented in Oniguruma or not (even the Wikipedia page doesn't
have that information). Any idea where I can find this?

* look-aheads/look-behinds,
* non-greedy modifiers
* recursion
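
For reference, a minimal sketch of what two of these features look like in practice. It uses Python's standard `re` module purely for illustration and is not Oniguruma; the sample string and variable names are made up. Python's `re` has no recursion support, so that feature is not shown.

```python
import re

# Hypothetical sample text, chosen only to exercise the patterns below.
text = "100 USD, 200 EUR, <b>bold</b> and <b>more</b>"

# Look-ahead: match digits only when they are followed by " USD".
lookahead = re.findall(r"\d+(?= USD)", text)

# Look-behind: match a word only when it is preceded by "<b>".
lookbehind = re.findall(r"(?<=<b>)\w+", text)

# Greedy quantifier: ".*" runs to the last "</b>" in the string.
greedy = re.findall(r"<b>.*</b>", text)

# Non-greedy modifier: ".*?" stops at the first "</b>" it can.
non_greedy = re.findall(r"<b>.*?</b>", text)

print(lookahead)
print(lookbehind)
print(greedy)
print(non_greedy)
```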

thanks.



On Sun, Apr 21, 2013 at 12:00 PM, Sudheera Palihakkara
<catchsudheera@xxxxxxxxx> wrote:
> Hi Kentoku,
>
> Thank you, I will certainly study Oniguruma! :)
>
>
> On Sat, Apr 20, 2013 at 11:26 PM, kentoku <kentokushiba@xxxxxxxxx> wrote:
>>
>> Hi Sudheera and Sergei,
>>
>> > In case I missed some libraries, I hope you will point them out so I can
>> > study them too. Considering the requirements, I didn't see Asian
>> > multi-byte support implemented anywhere. What would we do about that?
>>
>> Do you know "oniguruma"?
>> http://www.geocities.jp/kosako3/oniguruma/
>> http://en.wikipedia.org/wiki/Oniguruma
>>
>> Oniguruma is a regular expression library that supports multi-byte
>> character sets such as big5, euc-kr and shift_jis. Oniguruma is used by
>> "mregexp", a regex UDF for MySQL with multi-byte support. So I
>> think you can easily see how to use it from there.
>>
>> Thanks,
>> Kentoku
>>
>>
>>
>>
>> 2013/4/20 Sudheera Palihakkara <catchsudheera@xxxxxxxxx>
>>>
>>> Hello Sir,
>>>
>>> I've been working on this project for the past couple of days. I found
>>> that there are a few good regex libraries suitable for this task. Considering
>>> the requirements, I think PCRE, ICU regex and RGX would do the job. ICU
>>> regex doesn't have recursion, although it has well-documented, easy-to-understand
>>> code. Currently I think PCRE is the best option we have.
>>>
>>> In case I missed some libraries, I hope you will point them out so I can
>>> study them too. Considering the requirements, I didn't see Asian
>>> multi-byte support implemented anywhere. What would we do about that?
>>>
>>> In the google-melange page, under the application template, there is a
>>> field called "Project description". What should I include there? I mean, do
>>> you expect a full description of the project, including figures, or just a
>>> brief one like those on the project ideas page?
>>>
>>> Thank you.
>>>
>>>
>>> On Fri, Apr 19, 2013 at 3:46 PM, Sergei Golubchik <serg@xxxxxxxxxxxx>
>>> wrote:
>>>>
>>>> Hi, Sudheera!
>>>>
>>>> On Apr 19, Sudheera Palihakkara wrote:
>>>> > Hi,
>>>> > I went through other threads on this topic. In one thread you
>>>> > mentioned to
>>>> > choose a suitable regex library.
>>>> >
>>>> > *( Preliminary research - only about choosing a regex library to use in
>>>> > MariaDB. You should be able to explain why we should use this library
>>>> > and
>>>> > not some other one.)
>>>> >
>>>> > *
>>>> > What do you mean by "choosing"? Don't we have to enhance the existing
>>>> > regex
>>>> > library? Or choose from existing, already implemented libraries which
>>>> > are
>>>> > free to use? Sorry if it's a stupid question, but I'm confused. :O
>>>>
>>>> Enhancing our old regex library to support all modern features and
>>>> multiple charsets is complex and bug-prone work.
>>>>
>>>> I don't see why we should bother doing it, when there are plenty of
>>>> regex libraries available.
>>>>
>>>> There's PHP's mb_regex, there's PCRE, and many others too. We'd better
>>>> just pick the one that works best for MariaDB, and use it in place of
>>>> Henry Spencer's library.
>>>>
>>>> Regards,
>>>> Sergei
>>>>
>>>> P.S. Please, don't reply to me only, use reply-to-all, so that your
>>>> mails appear on the mailing list.
>>>
>>>
>>>
>>>
>>> --
>>> Sudheera Palihakkara.
>>> Undergraduate
>>> Department of Computer Science and Engineering,
>>> Faculty of Engineering,
>>> University of Moratuwa,
>>> Sri Lanka.
>>>
>>>
>>
>
>
>
> --
> Sudheera Palihakkara.
> Undergraduate
> Department of Computer Science and Engineering,
> Faculty of Engineering,
> University of Moratuwa,
> Sri Lanka.



-- 
Sudheera Palihakkara.
Undergraduate
Department of Computer Science and Engineering,
Faculty of Engineering,
University of Moratuwa,
Sri Lanka.

