widelands-dev team mailing list archive

[Merge] lp:~widelands-dev/widelands-website/convert_wikiwords into lp:widelands-website

 

kaputtnik has proposed merging lp:~widelands-dev/widelands-website/convert_wikiwords into lp:widelands-website.

Requested reviews:
  Widelands Developers (widelands-dev)
Related bugs:
  Bug #1595294 in Widelands Website: "Get rid of wikiwordification"
  https://bugs.launchpad.net/widelands-website/+bug/1595294

For more details, see:
https://code.launchpad.net/~widelands-dev/widelands-website/convert_wikiwords/+merge/318286

This is not a real merge proposal, because the script only has to be run once. It is posted here for review only.

This script converts old WikiWords into the new syntax for internal wiki links. See https://code.launchpad.net/~widelands-dev/widelands-website/get_rid_of_wikiwords/+merge/318283

In all articles, the script makes the following changes:

!SirVer -> SirVer
BarbariansPage -> [[ BarbariansPage ]]
[Main Page](/wiki/MainPage) -> [[ MainPage | Main Page ]]
[Main Page](../MainPage) -> [[ MainPage | Main Page ]]
[Main Page](../Main Page) -> [[ Main Page | Main Page ]]
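
The link conversions above can be sketched in isolation as follows. This is a minimal illustration, not the script itself: SIMPLE_LINK_RE is a simplified stand-in for markdown's LINK_RE, the names to_wiki_link and convert_links are hypothetical, and links that point neither to "wiki/" nor to "../" are simply left alone here (the script instead uses a list of site sections to decide).

```python
import re

# Simplified stand-in for markdown's LINK_RE (an assumption of this sketch):
# matches [text](target) but not image syntax ![text](target).
SIMPLE_LINK_RE = re.compile(
    r'(?<!!)\[(?P<text>[^\]]+)\]\((?P<target>[^)]+)\)')


def to_wiki_link(match):
    """Turn [Link text](/wiki/WikiWord) into [[ WikiWord | Link text ]]."""
    target = match.group('target')
    # External links stay untouched unless they point to our own wiki
    if target.startswith('http') and 'widelands.org/wiki/' not in target:
        return match.group(0)
    if 'wiki/' in target:
        name = target.rpartition('wiki/')[2]
    elif '../' in target:
        name = target.rpartition('../')[2]
    else:
        # Simplification: anything else is left unchanged in this sketch
        return match.group(0)
    return '[[ ' + name + ' | ' + match.group('text') + ' ]]'


def convert_links(text):
    return SIMPLE_LINK_RE.sub(to_wiki_link, text)
```

Calling convert_links('[Main Page](/wiki/MainPage)') yields '[[ MainPage | Main Page ]]', matching the list above.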

The script checks whether each WikiWord it finds is the title of an existing article, so other WikiWords such as MingW or SirVer (WikiWords without a preceding exclamation mark) are not changed.

It also creates a new changeset for every article it changes. Creating a changeset notifies the article's observers, so every observer will receive an email about the change. This means the script will take some time to run through all articles and send the emails.
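
The existence check can be tried without a database. The sketch below reuses the WIKIWORD_RE from the script, but replaces the Article query with a plain set of titles; EXISTING_TITLES and the function names are hypothetical and only serve the illustration.

```python
import re

# Same pattern as in convert_wikiwords.py
WIKIWORD_RE = re.compile(
    r"(?P<presign>!?|\/?)(?P<wikiword>\b[A-Z][a-z]+[A-Z]\w+\b)")

# Hypothetical stand-in for the titles of existing wiki articles
EXISTING_TITLES = {'BarbariansPage', 'MainPage'}


def replace_wikiword(match):
    presign = match.group('presign')
    word = match.group('wikiword')
    if presign == '!':
        # !SirVer -> SirVer
        return word
    if presign == '/':
        # Part of a link path such as /wiki/GameHelp; leave it for the
        # link replacement step
        return match.group(0)
    if word in EXISTING_TITLES:
        return '[[ ' + word + ' ]]'
    # Not an article title (e.g. SourceForge): keep as is
    return match.group(0)


def convert_wikiwords(text):
    return WIKIWORD_RE.sub(replace_wikiword, text)
```

With these titles, 'See BarbariansPage' becomes 'See [[ BarbariansPage ]]', while 'SourceForge' is returned unchanged.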

This script should only be run ONCE!! Run it within the activated virtual environment:

> ./manage.py shell
> execfile('path/to/convert_wikiwords.py')
-- 
Your team Widelands Developers is requested to review the proposed merge of lp:~widelands-dev/widelands-website/convert_wikiwords into lp:widelands-website.
=== added file 'convert_wikiwords.py'
--- convert_wikiwords.py	1970-01-01 00:00:00 +0000
+++ convert_wikiwords.py	2017-02-24 21:11:39 +0000
@@ -0,0 +1,144 @@
+from __future__ import print_function
+from wiki.models import Article
+from django.contrib.auth.models import User
+from markdown.inlinepatterns import LINK_RE
+import re
+import StringIO
+import sys
+
+# Set default encoding for print()
+reload(sys)
+sys.setdefaultencoding('utf8')
+
+# Following re catches:
+# WikiWord
+# !SirVer
+# and a slash before WikiWord as used in links:
+# [Foo Bar](wiki/WikiWord)
+WIKIWORD_RE = re.compile(
+    r"(?P<presign>!?|\/?)(?P<wikiword>\b[A-Z][a-z]+[A-Z]\w+\b)")
+
+# The following re catches text in markdown link syntax;
+# it does NOT catch the syntax for images
+LINKS_RE = re.compile(LINK_RE)
+
+# LINKS_RE also catches links which point to sections other than the wiki.
+# If a link contains a section from this list, it will not be changed.
+DONT_CHANGE_INTERN_LINKS = ['forum', 'maps', 'profile', 'news', 'poll', 'docs',
+                            'webchat', 'wlmedia', 'messages', 'accounts', 'notification']
+
+# Shortcut
+ALL_ARTICLES = Article.objects.all().order_by('title')
+
+# Set the value to None to print to stdout instead of into a file
+print_to_file = StringIO.StringIO()
+
+
+def is_wiki_link(path):
+    # Is this a link to wiki or not?
+    for section in DONT_CHANGE_INTERN_LINKS:
+        if section in path:
+            return False
+    return True
+
+
+def extract_wikilink(path):
+    # Remove preceding characters
+    # leave trailing characters so GameHelp/#header will stay
+    if 'wiki' in path:
+        return path.rpartition('wiki/')[2]
+    if '../' in path:
+        return path.rpartition('../')[2]
+    return path
+
+
+def replace_links(matchobj):
+    # Turn markdown syntax [Link text](/wiki/WikiWord) into
+    # [[ WikiWord | Link text ]]
+    link_path = matchobj.group(8)
+    if link_path.startswith('http'):
+        # Found an external link
+        if 'widelands.org/wiki/' in link_path:
+            # Test for a link to our own wiki, which uses full URI
+            # yes, there are some of them :-)
+            pass
+        else:
+            # This is really a link outside widelands.org
+            return matchobj.group(0)
+
+    if is_wiki_link(link_path):
+        print('Changed link: ', link_path, '->',
+              extract_wikilink(link_path), file=print_to_file)
+        return '[[ ' + extract_wikilink(link_path) + ' | ' + matchobj.group(1) + ' ]]'
+    else:
+        return matchobj.group(0)
+
+
+def replace_wikiwords(matchobj):
+    if matchobj.group('presign') == '!':
+        # Cut the presign so that !SirVer is returned as SirVer
+        print("Changed !WikiWord: ", matchobj.group(0), '->', matchobj.group(
+            'wikiword'), file=print_to_file)
+        return matchobj.group('wikiword')
+
+    if matchobj.group('presign') == '/':
+        # This WikiWord must be inside a link, return it with presign so that
+        # [Game Help](/wiki/GameHelp) stays intact. It will be changed in the
+        # replacements of links.
+        return matchobj.group(0)
+
+    # Check if article exists -> preventing changes of e.g.
+    # 'MingW', 'SourceForge' and similar
+    if ALL_ARTICLES.filter(title=matchobj.group('wikiword')).exists():
+        # Return the link in form of new syntax
+        print('Changed Wikiword: ', matchobj.group(0), '->', matchobj.group(
+            'wikiword'), file=print_to_file)
+        return '[[ ' + matchobj.group('wikiword') + ' ]]'
+
+    return matchobj.group(0)
+
+
+def save_with_changeset(article, new_content):
+    # Old contents of the article for the new changeset:
+    old_content = article.content
+    old_title = article.title
+    old_markup = article.markup
+
+    article.content = new_content
+    article.save()
+
+    # Create a changeset:
+    comment = 'Autochange: Convert wikilinks to new syntax'
+    editor_ip = '192.168.255.255'  # Same bogus IP as in wl_utils.get_real_ip()
+    # Create a user called 'wl_bot' if it does not already exist.
+    # The user created here has no password set. Login with this username
+    # is not possible; a password could be set via the admin page
+    editor, created = User.objects.get_or_create(username='wl_bot')
+    changeset = article.new_revision(
+        old_content, old_title, old_markup,
+        comment, editor_ip, editor)
+
+
+def main():
+    articles = ALL_ARTICLES
+    # articles = Article.objects.filter(title='Wiki Todo')  # For testing
+    for article in articles:
+        print('\nSearching in article: ', article)
+        print('\nSearching in article: ', article, file=print_to_file)
+        text = article.content
+        new_text = WIKIWORD_RE.sub(replace_wikiwords, text)
+        new_text = LINKS_RE.sub(replace_links, new_text)
+        if new_text != text:
+            print('New Text of article:', article, new_text, file=print_to_file)
+            print('\n========================', file=print_to_file)
+            # Apply the changes
+            # Comment for testing
+            save_with_changeset(article, new_text)
+
+    if print_to_file:
+        with open('convert_result.txt', 'w') as f:
+            f.write(print_to_file.getvalue())
+        print_to_file.close()
+
+if __name__ == '__main__':
+    main()

