maria-developers team mailing list archive

Re: [Gsoc] Regex enhancements Project

 

Hi Sudheera and Sergei,

Sudheera,
How much do you want to prepare for this project before GSoC starts?

Sergei,
How can I and other people help with this project once GSoC has started?
Is it the same as now?

Thanks,
Kentoku


2013/4/21 Sudheera Palihakkara <catchsudheera@xxxxxxxxx>

> Hi Kentoku,
>
> Oniguruma is looking great. But I can't find whether the following features
> are implemented in Oniguruma or not (even the Wikipedia page doesn't
> have that information). Any idea where I can find that?
>
> * look-aheads/look-behinds,
> * non-greedy modifiers
> * recursion
>
> thanks.
>
>
>
> On Sun, Apr 21, 2013 at 12:00 PM, Sudheera Palihakkara
> <catchsudheera@xxxxxxxxx> wrote:
> > Hi Kentoku,
> >
> > Thank you, I will certainly study Oniguruma! :)
> >
> >
> > On Sat, Apr 20, 2013 at 11:26 PM, kentoku <kentokushiba@xxxxxxxxx> wrote:
> >>
> >> Hi Sudheera and Sergei,
> >>
> >> > In case I missed some libraries, I guess you will enlighten me to
> >> > study them too. Considering the requirements, I didn't see Asian
> >> > multi-byte support implemented anywhere; what would we do about that?
> >>
> >> Do you know "oniguruma"?
> >> http://www.geocities.jp/kosako3/oniguruma/
> >> http://en.wikipedia.org/wiki/Oniguruma
> >>
> >> Oniguruma is a regular expression library that supports multi-byte
> >> character sets like big5, euc-kr and shift_jis. Oniguruma is used by
> >> "mregexp", a regex UDF for MySQL with multi-byte support. So I think
> >> you can easily understand how to use it.
> >>
> >> Thanks,
> >> Kentoku
> >>
> >>
> >>
> >>
> >> 2013/4/20 Sudheera Palihakkara <catchsudheera@xxxxxxxxx>
> >>>
> >>> Hello Sir,
> >>>
> >>> I've been working on this project for the past couple of days. I found
> >>> that there are a few good regex libraries suitable for this task.
> >>> Considering the requirements, I think PCRE, ICU regex and RGX would do
> >>> the job. ICU regex doesn't have recursion, but it has well-documented,
> >>> easy-to-understand code. Currently I think PCRE is the best option we
> >>> have.
> >>>
> >>> In case I missed some libraries, I guess you will enlighten me to
> >>> study them too. Considering the requirements, I didn't see Asian
> >>> multi-byte support implemented anywhere; what would we do about that?
> >>>
> >>> In the google-melange page, under the application template, there is a
> >>> field called "Project description". What should I include there? I mean,
> >>> do you expect a full description of the project including figures, or
> >>> just a brief one like those on the project ideas page?
> >>>
> >>> Thank you.
> >>>
> >>>
> >>> On Fri, Apr 19, 2013 at 3:46 PM, Sergei Golubchik <serg@xxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi, Sudheera!
> >>>>
> >>>> On Apr 19, Sudheera Palihakkara wrote:
> >>>> > Hi,
> >>>> > I went through other threads on this topic. In one thread you
> >>>> > mentioned choosing a suitable regex library.
> >>>> >
> >>>> > *(Preliminary research - only about choosing a regex library to use
> >>>> > in MariaDB. You should be able to explain why we should use this
> >>>> > library and not some other one.)*
> >>>> >
> >>>> > What do you mean by "choosing"? Don't we have to enhance the existing
> >>>> > regex library? Or choose from existing, already implemented libraries
> >>>> > which are free to use? Sorry if it's a stupid question, but I'm
> >>>> > confused. :O
> >>>>
> >>>> Enhancing our old regex library to support all modern features and
> >>>> multiple charsets is complex and bug-prone work.
> >>>>
> >>>> I don't see why we should bother doing it, when there are plenty of
> >>>> regex libraries available.
> >>>>
> >>>> There's PHP's mb_regex, there's PCRE, and many others too. We'd better
> >>>> just pick one that works better for MariaDB, and use it in place of
> >>>> Henry Spencer's library.
> >>>>
> >>>> Regards,
> >>>> Sergei
> >>>>
> >>>> P.S. Please, don't reply to me only, use reply-to-all, so that your
> >>>> mails appear on the mailing list.
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Sudheera Palihakkara.
> >>> Undergraduate
> >>> Department of Computer Science and Engineering,
> >>> Faculty of Engineering,
> >>> University of Moratuwa,
> >>> Sri Lanka.
> >>>
> >>
> >
>
>
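
Note on the features asked about in the quoted message (look-aheads/look-behinds, non-greedy modifiers, recursion): the following is a minimal sketch of what the first two mean, written with Python's standard `re` module purely for illustration. It is not Oniguruma or PCRE code, and the sample strings are made up. Python's `re` does not support recursive patterns, so recursion is left out here.

```python
import re

# Look-ahead: match "foo" only when it is followed by "bar".
# The "bar" itself is not part of the match.
lookahead = re.findall(r"foo(?=bar)", "foobar foobaz")

# Look-behind: match digits only when they are preceded by "$".
lookbehind = re.findall(r"(?<=\$)\d+", "cost $42, id 7")

# Greedy vs. non-greedy: ".*" takes as much as it can,
# ".*?" takes as little as it can.
greedy = re.findall(r"<.*>", "<a><b>")
non_greedy = re.findall(r"<.*?>", "<a><b>")

print(lookahead)   # ['foo']
print(lookbehind)  # ['42']
print(greedy)      # ['<a><b>']
print(non_greedy)  # ['<a>', '<b>']
```

Whether a given library offers these constructs, and with which syntax, has to be checked against that library's own pattern documentation.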
