credativ team mailing list archive

[Merge] lp:~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex into lp:openupgrade-addons

 

Holger Brunn (Therp) has proposed merging lp:~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex into lp:openupgrade-addons.

Requested reviews:
  Sandy Carter (http://www.savoirfairelinux.com) (sandy-carter)

For more details, see:
https://code.launchpad.net/~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex/+merge/238872

This branch stops using regexes in cases where backtracking can become so expensive that the migration appears to deadlock, e.g. with

re.compile("'''''(([^']|([^']'{0,4}[^']))+)'''''").sub("<b><i>\1</i></b>", "'''''test'''' hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world")

Note that this occurs with all regexes of this kind whenever the wiki syntax is malformed.
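
The cost comes from the nested alternation: inside the repeated group, a non-quote character can be consumed either alone or as part of a two-character unit, so when the closing marker is missing the engine retries every possible split of the remaining text. A rough way to observe the growth is sketched below. It assumes the pattern quoted above; `time_malformed` is a hypothetical helper, and the tail lengths are arbitrary small values chosen so the script finishes quickly.

```python
import re
from timeit import default_timer

# Pattern as quoted in the description above.
re_b_i = re.compile(r"'''''(([^']|([^']'{0,4}[^']))+)'''''")


def time_malformed(tail_length):
    """Time one substitution on input whose closing marker is one quote short.

    Returns (elapsed_seconds, text_was_left_unchanged).
    """
    text = "'''''test'''' " + "x" * tail_length
    start = default_timer()
    result = re_b_i.sub(r"<b><i>\1</i></b>", text)
    return default_timer() - start, result == text


# Keep the tails short: each extra character multiplies the work,
# and the input from the description is far longer than these.
for n in (8, 12, 16, 20):
    elapsed, unchanged = time_malformed(n)
    print("tail=%2d  unchanged=%s  %.4fs" % (n, unchanged, elapsed))
```

The regex never matches these inputs, so the text comes back unchanged; only the time needed to reach that conclusion grows.
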
-- 
https://code.launchpad.net/~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex/+merge/238872
Your team OpenUpgrade Committers is subscribed to branch lp:openupgrade-addons.
=== modified file 'document_page/migrations/7.0.1.0.1/post-migration.py'
--- document_page/migrations/7.0.1.0.1/post-migration.py	2014-09-15 07:43:23 +0000
+++ document_page/migrations/7.0.1.0.1/post-migration.py	2014-10-20 12:48:46 +0000
@@ -111,9 +111,6 @@
 re_ol_li = re.compile("^(\*\*+|#+):? ")
 re_ul_ol_li = re.compile("^(\*+|#+):? ")
 re_youtube = re.compile("^(https?://)?(www\.)?youtube.com/(watch\?(.*)v=|embed/)([^&]+)")
-re_b_i = re.compile("'''''(([^']|([^']('{1,4})?[^']))+)'''''")
-re_b = re.compile("'''(([^']|([^'](''?)?[^']))+)'''")
-re_i = re.compile("''(([^']|([^']'?[^']))+)''")
 
 
 class Wiky:
@@ -316,9 +313,36 @@
                     break
 
         # Bold, Italics, Emphasis
-        wikitext = re_b_i.sub("<b><i>\1</i></b>", wikitext)
-        wikitext = re_b.sub("<b>\1</b>", wikitext)
-        wikitext = re_i.sub("<i>\1</i>", wikitext)
-
-        return wikitext
+        head = ''
+        tail = wikitext
+        while tail:
+            quote_pos = tail.find("'")
+            if quote_pos > -1:
+                head += tail[:quote_pos]
+                tail = tail[quote_pos:]
+                pos = 0
+                while pos < len(tail) and tail[pos] == "'":
+                    pos += 1
+                if pos == 2 or pos == 3 or pos == 5:
+                    endquote_pos = tail.find("'"*pos, pos)
+                    if endquote_pos > -1:
+                        text = tail[pos:endquote_pos]
+                        if pos == 2:
+                            text = '<i>%s</i>' % text
+                        elif pos == 3:
+                            text = '<b>%s</b>' % text
+                        elif pos == 5:
+                            text = '<b><i>%s</i></b>' % text
+                        head += text
+                        tail = tail[endquote_pos + pos:]
+                    else:
+                        head += tail[:pos]
+                        tail = tail[pos:]
+                else:
+                    head += tail[:pos]
+                    tail = tail[pos:]
+            else:
+                head += tail
+                tail = ''
+        return head
 # vim:expandtab:smartindent:tabstop=4:softtabstop=4:shiftwidth=4:
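
To try the new loop outside the migration script, it can be sketched as a standalone function. This is a condensed rendering of the added lines, not the code as merged: the name `convert_quotes` and the `TAGS` table are hypothetical, and the script itself inlines the logic at the end of an existing method.

```python
# Hypothetical lookup table: quote-run length -> HTML template.
TAGS = {
    2: "<i>%s</i>",
    3: "<b>%s</b>",
    5: "<b><i>%s</i></b>",
}


def convert_quotes(wikitext):
    """Replace ''..'', '''..''' and '''''..''''' markers with HTML tags.

    Scans left to right with str.find, so no regex backtracking is
    involved. Unmatched or unsupported quote runs are copied verbatim.
    """
    head = ""
    tail = wikitext
    while tail:
        quote_pos = tail.find("'")
        if quote_pos == -1:
            # No quotes left: copy the rest and stop.
            head += tail
            break
        head += tail[:quote_pos]
        tail = tail[quote_pos:]
        # Measure the length of the run of quotes at the start of tail.
        pos = 0
        while pos < len(tail) and tail[pos] == "'":
            pos += 1
        # Only runs of 2, 3 or 5 quotes open a marker.
        end = tail.find("'" * pos, pos) if pos in TAGS else -1
        if end > -1:
            head += TAGS[pos] % tail[pos:end]
            tail = tail[end + pos:]
        else:
            # Malformed or unsupported run: keep the quotes as they are.
            head += tail[:pos]
            tail = tail[pos:]
    return head


print(convert_quotes("x ''a'' y"))
print(convert_quotes("'''''test'''' hello world"))
```

The second call uses a shortened form of the malformed input from the description; the closing run is one quote short, so the text is returned unchanged.
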