
launchpad-reviewers team mailing list archive

[Merge] lp:~ceejatec/sfbugs2launchpad/user-mapping into lp:sfbugs2launchpad

 

You have been requested to review the proposed merge of lp:~ceejatec/sfbugs2launchpad/user-mapping into lp:sfbugs2launchpad.

For more details, see:
https://code.launchpad.net/~ceejatec/sfbugs2launchpad/user-mapping/+merge/78076

Added several features and upgrades (see commit message) along with full process documentation.

=== added file 'README.txt'
--- README.txt	1970-01-01 00:00:00 +0000
+++ README.txt	2011-10-04 10:43:34 +0000
@@ -0,0 +1,116 @@
+INSTRUCTIONS
+------------
+
+To import your bugs from a Sourceforge project to Launchpad:
+
+1. Create a dump of your Sourceforge project by going to
+   Project Admin > Features, and click on "XML Export" in the
+   "Options" column for the "Backups" feature.
+
+2. Create a sf-user.xml mapping file (optional; see below) to
+   map Sourceforge users to Launchpad accounts.
+
+3. Run:
+
+   convert_sf_bugs.py [-u sf-user.xml] project-dump.xml
+
+   (If you want all Sourceforge "Feature Requests" to be listed as
+   "Wishlist" Importance on Launchpad, specify
+   --wishlist-feature-requests.)
+
+   This will create a file named "output.xml".
+
+4. Verify that output.xml matches the Relax-NG schema for Launchpad
+   bug imports, located in ng/bug-export.rnc. You can use the "rnv"
+   tool to do this:
+
+     rnv ng/bug-export.rnc output.xml
+
+   If this reports any errors, your import will not work.
+
+   Be sure to visit
+
+      https://help.launchpad.net/Bugs/ImportFormat
+
+   to check if there is a newer version of this schema.
+
+5. Host this file somewhere on the internet so Launchpad engineering
+   can download it.
+
+6. Go to
+
+      https://answers.launchpad.net/launchpad/
+
+   and click "Ask a question" in the upper-right corner.
+
+7. Fill out the Question form, requesting an import into your project.
+   Specify the project name on Launchpad (must already be created)
+   as well as the URL where you hosted output.xml in step 5.
+
+8. Soon (usually within 48 hours) Launchpad Engineering will perform
+   an import to the Launchpad staging site:
+
+      https://staging.launchpad.net/
+
+   Go to your project page there and investigate the imported
+   bugs. Note that the staging site is run on very limited hardware,
+   so you will encounter various timeout errors and so forth. Usually
+   reloading the page will help. Also note that the database for the
+   staging site is updated from the production Launchpad every week or
+   so, so your project entries will possibly look a little out of date
+   on the staging site. Also, your imported bugs on staging will
+   disappear in no more than a week.
+
+9. Assuming all is well, post a message on your Question requesting an
+   import to the production Launchpad!
+
+
+
+SOURCEFORGE->LAUNCHPAD USER MAPPING FILE
+----------------------------------------
+
+The import process run by Launchpad associates people (assignees,
+reporters, and commenters) with Launchpad accounts by the email address
+specified in the imported XML file. By default, the email address in
+there will be "username@xxxxxxxxxxxxxxxxxxxxx", which is almost
+certainly not the correct email address for any Launchpad user. So by
+default, all assignees, reporters, and commenters will be associated
+with empty Launchpad accounts.
+
+To avoid this, you can create a simple XML file which maps Sourceforge
+user names to valid email addresses for Launchpad accounts. (Obviously
+you can only do this for people who have already created a Launchpad
+account, so encourage all your team members to do so first and tell
+you their registered email address.)
+
+The XML file is described by the Relax-NG schema located in
+ng/sf-user-map.rnc, but a brief example should be sufficient:
+
+<?xml version="1.0" encoding="UTF-8"?>
+<projectmembers>
+  <projectmember>
+    <full_name>Homer J. Simpson</full_name>
+    <sf_user_name>donut_lover</sf_user_name>
+    <email>doh@xxxxxxxxxxx</email>
+  </projectmember>
+  <projectmember>
+    <full_name>Waylon Smithers</full_name>
+    <sf_user_name>loveburns</sf_user_name>
+    <email>yessir@xxxxxxxxxxx</email>
+  </projectmember>
+</projectmembers>
+
+Just remember that the sf_user_name element declares the Sourceforge
+user name, but the email element must declare the email address that
+is registered for that user on Launchpad.
+
+Pass this filename to the -u (or --user-mapping) argument to
+convert_sf_bugs.py.
+
+Any Sourceforge users that are mentioned in your project dump that are
+not listed in your user-mapping file will be provided with a default
+email address as described above. That will create "phantom" accounts
+on Launchpad. After the import is done, if those users later create a
+Launchpad account, they may merge those phantom accounts with their
+own by visiting the Launchpad home page of that account and clicking
+on the "Are you Marge Simpson?" link.

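Review note on the user-mapping section of README.txt: for a larger team it
may be easier to generate sf-user.xml than to write it by hand. The following
is a minimal sketch, not part of this branch. The function name
build_user_map and the example.com address are assumptions made for
illustration; the element names come from the README example above.

```python
import xml.etree.ElementTree as ET


def build_user_map(members):
    """Build a <projectmembers> tree from (full_name, sf_user_name, email) tuples.

    build_user_map is a hypothetical helper, not a function of
    convert_sf_bugs.py.
    """
    root = ET.Element("projectmembers")
    for full_name, sf_user_name, email in members:
        member = ET.SubElement(root, "projectmember")
        ET.SubElement(member, "full_name").text = full_name
        ET.SubElement(member, "sf_user_name").text = sf_user_name
        # Must be the address registered on Launchpad, not the Sourceforge one
        ET.SubElement(member, "email").text = email
    return root


# Placeholder data; replace with your team's real details
members = [("Homer J. Simpson", "donut_lover", "homer@example.com")]
xml_text = ET.tostring(build_user_map(members), encoding="unicode")
print(xml_text)
```

Writing the tree with ET.ElementTree(root).write("sf-user.xml",
encoding="UTF-8", xml_declaration=True) would produce a file that can be
passed to -u.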
=== modified file 'convert_sf_bugs.py'
--- convert_sf_bugs.py	2010-04-13 15:09:22 +0000
+++ convert_sf_bugs.py	2011-10-04 10:43:34 +0000
@@ -14,6 +14,7 @@
 import urllib2, socket
 import base64
 import time
+import re
 
 """
 This script parses a projects XML data as exported by sourceforge
@@ -22,18 +23,45 @@
 """
 
 STATUS_MAP = {
-    "Closed": "FIXRELEASED",
     "Deleted": "WONTFIX",
-    "Open": "NEW",
     "Pending": "INCOMPLETE",
+    "Open": {
+        "Accepted": "CONFIRMED",
+        "Duplicate": "INVALID",
+        "Fixed": "FIXCOMMITTED",
+        "Invalid": "INVALID",
+        "Later": "INCOMPLETE",
+        "None": "NEW",
+        "Out of Date": "INCOMPLETE",
+        "Postponed": "INCOMPLETE",
+        "Rejected": "WONTFIX",
+        "Remind": "INCOMPLETE",
+        "Wont Fix": "WONTFIX",
+        "Works For Me": "INVALID"
+    },
+    "Closed": {
+        "Accepted": "FIXRELEASED",
+        "Duplicate": "INVALID",
+        "Fixed": "FIXRELEASED",
+        "Invalid": "INVALID",
+        "Later": "WONTFIX",
+        "None": "UNKNOWN",
+        "Out of Date": "WONTFIX",
+        "Postponed": "WONTFIX",
+        "Rejected": "WONTFIX",
+        "Remind": "WONTFIX",
+        "Wont Fix": "WONTFIX",
+        "Works For Me": "INVALID",
+    },
 }
 
+
 IMPORTANCE_MAP = {
     "9": "CRITICAL",
     "8": "HIGH",
     "7": "HIGH",
     "6": "MEDIUM",
-    "5": "UNDECIDED",
+    "5": "MEDIUM",
     "4": "MEDIUM",
     "3": "LOW",
     "2": "LOW",
@@ -85,6 +113,41 @@
             d[n.find(idname).text] = n.find(valuename).text
         self[name] = d
 
+def parse_users(sf_user_nodes, usermap):
+    """
+    Parse all sf.net users and return a dictionary. Keys are sf user
+    names.  Values are dictionaries with "email", "sf_user_name", and
+    "full_name".
+    """
+    users = {}
+
+    # First populate from usermap (if provided)
+    count = 0
+    if (usermap is not None):
+        for user in usermap:
+            user_dict = NodeDict()
+            user_dict.from_node("email", user)
+            user_dict.from_node("full_name", user)
+            user_dict.from_node("sf_user_name", user)
+
+            users[user_dict.sf_user_name] = user_dict
+            count += 1
+        print "Read %d users from user-map file" % count
+
+    # Now extend with any SF users that aren't in the usermap
+    count = 0
+    for sf_user in sf_user_nodes:
+        sf_user_dict = NodeDict()
+        sf_user_dict.from_node("sf_user_name", sf_user, "user_name")
+        if (sf_user_dict.sf_user_name not in users):
+            sf_user_dict.from_node("email", sf_user)
+            sf_user_dict.from_node("full_name", sf_user, "public_name")
+            users[sf_user_dict.sf_user_name] = sf_user_dict
+            count += 1
+    print "Read %d users from Sourceforge project dump" % count
+
+    return users
+
 def parse_tracker_items(item_nodes, tracker):
     parsed_items = []
 
@@ -118,6 +181,10 @@
             "group", item, "group_id", tracker.groups.get)
         item_dict.try_from_node(
             "resolution", item, "resolution_id", tracker.resolutions.get)
+        # Ensure all tracker items have a resolution, even if the tracker
+        # doesn't have the "Resolution" field
+        if ("resolution" not in item_dict):
+            item_dict["resolution"] = "None"
 
         # Parse follow ups
         item_dict["followups"] = []
@@ -183,9 +250,22 @@
         tracker.parse_lookup_dict(
             tracker_node, "statuses", "status", "id", "name")
 
+        # "100" is the default (unset) value for category_id and group_id
         tracker.categories["100"] = "None"
         tracker.groups["100"] = "None"
 
+        # Transform all category names and group names into tag-friendly form
+        def _tagify(adict):
+            for category in adict:
+                catname = adict[category]
+                # Special case
+                catname = catname.replace('C++', 'cplusplus')
+                catname = re.sub(r'[^A-Za-z0-9.-]+', '-', catname.lower())
+                catname = catname.strip('-')
+                adict[category] = catname
+        _tagify(tracker.categories)
+        _tagify(tracker.groups)
+
         issues.extend(
             parse_tracker_items(tracker_node.find("tracker_items"), tracker))
     return issues
@@ -224,14 +304,29 @@
     return date.isoformat() + 'Z'
 
 
-def create_launchpad_issue_tree(issues, options):
+def create_launchpad_issue_tree(issues, options, users):
     root = ET.Element("launchpad-bugs")
     root.attrib['xmlns'] = "https://launchpad.net/xmlns/2006/bugs"
 
-    def _sfuser(elem, name):
-        name = (name if name is not None else "noone").strip()
-        elem.text = name
-        elem.attrib["email"] = "%s@xxxxxxxxxxxx" % elem.text
+    def _mapsfuser(elem, sfusername):
+        """
+        Populates the given element as a Launchpad "person", with the
+        full name as the text along with "email" and (optionally)
+        "name" attributes, where "name" is the Launchpad username. If
+        the sfusername is found in the "users" map, the values will
+        come from there; otherwise we'll make something up.
+        """
+        if (sfusername in users):
+            userdata = users[sfusername]
+            elem.text = userdata.full_name
+            elem.attrib["email"] = userdata.email
+            if ("lp_user_name" in userdata):
+                elem.attrib["name"] = userdata.lp_user_name
+        else:
+            # We have no source of information about this Sourceforge username
+            name = (sfusername if sfusername is not None else "noone").strip()
+            elem.text = name
+            elem.attrib["email"] = "%s@xxxxxxxxxxxx" % elem.text
 
     for idx,issue in enumerate(issues):
         print "Handling issue %i/%i (%s) ..." % (idx+1, len(issues), issue.id)
@@ -248,9 +343,15 @@
         ET.SubElement(bug, "title").text = issue.summary
         ET.SubElement(bug, "description").text = issue.details
 
-        _sfuser(ET.SubElement(bug, "reporter"), issue.submitter)
+        _mapsfuser(ET.SubElement(bug, "reporter"), issue.submitter)
+        _mapsfuser(ET.SubElement(bug, "assignee"), issue.assignee)
 
-        ET.SubElement(bug, "status").text = STATUS_MAP[issue.status]
+        c = ET.SubElement(bug, "status")
+        status = STATUS_MAP[issue.status]
+        try:
+            c.text = status[issue.resolution]
+        except TypeError:
+            c.text = status
 
         if (options.wishlist_feature_requests and
             issue.node_type == NodeTypes.FEATURE_REQUEST):
@@ -261,24 +362,33 @@
                 IMPORTANCE_MAP[issue.priority])
 
         # TODO: Not handled: milestone
-        # TODO: Not handled: assignee
-        # TODO: Not handled: urls
         # TODO: Not handled: cves
         # TODO: Not handled: bugwatches
-        # TODO: Not handled: assignee
-
-        # REALLYTODO: Not handled: tags
+
+        # Add a URL pointing to the original Sourceforge bug
+        c = ET.SubElement(bug, "urls")
+        c = ET.SubElement(c, "url")
+        c.text = "Sourceforge bug #%s" % issue.id
+        c.attrib["href"] = issue.url
+
+        # Add tags based on the group and category
+        c = ET.SubElement(bug, "tags")
+        def _addtag(name):
+            if (name and name != "none"):
+                ET.SubElement(c, "tag").text = name
+        _addtag(issue.category)
+        _addtag(issue.group)
 
         # Add the obligatory first comment
         c = ET.SubElement(bug, "comment")
-        _sfuser(ET.SubElement(c, "sender"), issue.submitter)
+        _mapsfuser(ET.SubElement(c, "sender"), issue.submitter)
         ET.SubElement(c, "date").text = format_date(issue.submit_date)
         ET.SubElement(c, "title").text = issue.summary
         ET.SubElement(c, "text").text = issue.details
 
         for follow in issue.followups:
             c = ET.SubElement(bug, "comment")
-            _sfuser(ET.SubElement(c, "sender"), follow.submitter)
+            _mapsfuser(ET.SubElement(c, "sender"), follow.submitter)
             ET.SubElement(c, "date").text = format_date(follow.date)
             ET.SubElement(c, "title").text = "RE: " + issue.summary
             ET.SubElement(c, "text").text = follow.details
@@ -286,12 +396,12 @@
         # Add attached files
         for aidx, attach in enumerate(issue.attachments):
             c = ET.SubElement(bug, "comment")
-            _sfuser(ET.SubElement(c, "sender"), attach.submitter)
+            _mapsfuser(ET.SubElement(c, "sender"), attach.submitter)
             ET.SubElement(c, "date").text = format_date(attach.date)
             ET.SubElement(c, "title").text = "RE: " + issue.summary
             ET.SubElement(c, "text").text = "The file %s was added: %s" % (
                 attach.filename, attach.description)
-            print "  Downloading Attachement %i/%i..." % (
+            print "  Downloading Attachment %i/%i..." % (
                 aidx+1,len(issue.attachments)),
             data = download_url(attach.url + issue.id)
             print("%i bytes" % len(data))
@@ -308,8 +418,7 @@
             ET.SubElement(at, "mimetype").text = attach.filetype
             ET.SubElement(at, "contents").text = data
 
-
-    # indent(root)
+    indent(root)
     et = ET.ElementTree(root)
 
     return et
@@ -317,21 +426,37 @@
 
 def main():
     option_parser = optparse.OptionParser(
-        "%prog <sf.net xml project export file>")
+        "%prog [options] <sf.net xml project export file>")
 
     option_parser.add_option(
         '--wishlist-feature-requests', action='store_true', default=False,
         help='give imported feature requests the priority "WISHLIST"')
 
+    option_parser.add_option(
+        '--user-mapping', '-u', action='store', type='string',
+        dest='user_map_file',
+        help="provide mapping from sf.net usernames to Launchpad user info")
+
     options, args = option_parser.parse_args()
     if len(args) < 1:
-        option_parser.print_usage()
+        option_parser.print_help()
         return -1
 
-    issues = parse_issues(ET.parse(args[0]).getroot().find("trackers"))
+    # Read in entire sf.net project file
+    sfdump = ET.parse(args[0]).getroot()
+
+    # Read in user-mapping file, if specified
+    usermap = None
+    if (options.user_map_file):
+        usermap = ET.parse(options.user_map_file).getroot()
+
+    # Create user lookup table from project dump and mapping file
+    users = parse_users(sfdump.find("projectsummary/projectmembers"), usermap)
+
+    issues = parse_issues(sfdump.find("trackers"))
     print "Parsed %i issues..." % len(issues)
 
-    et = create_launchpad_issue_tree(issues, options)
+    et = create_launchpad_issue_tree(issues, options, users)
 
     et.write("output.xml", encoding="utf-8")
 

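Review note on the new STATUS_MAP: entries are now either a plain Launchpad
status or a nested table keyed by the Sourceforge resolution, and the branch
tells the two apart by catching TypeError. The sketch below shows the same
lookup with an explicit type check instead. It is an illustration under
assumptions: launchpad_status is a hypothetical name, and the table is an
abbreviated copy of the one in the diff.

```python
# Abbreviated copy of STATUS_MAP from the diff, for illustration only
STATUS_MAP = {
    "Deleted": "WONTFIX",
    "Pending": "INCOMPLETE",
    "Open": {"None": "NEW", "Accepted": "CONFIRMED", "Fixed": "FIXCOMMITTED"},
    "Closed": {"None": "UNKNOWN", "Fixed": "FIXRELEASED"},
}


def launchpad_status(sf_status, sf_resolution, status_map=STATUS_MAP):
    """Map a Sourceforge status/resolution pair to a Launchpad status.

    launchpad_status is a hypothetical helper, not part of the branch.
    """
    entry = status_map[sf_status]
    if isinstance(entry, dict):
        # Status depends on the resolution ("Open" and "Closed")
        return entry[sf_resolution]
    # Status is independent of the resolution ("Deleted" and "Pending")
    return entry


print(launchpad_status("Open", "Fixed"))
print(launchpad_status("Pending", "Fixed"))
```

In both variants a resolution missing from the nested table raises KeyError,
so a tracker with custom resolution names would need extra entries.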
=== added directory 'ng'
=== added file 'ng/bug-export.rnc'
--- ng/bug-export.rnc	1970-01-01 00:00:00 +0000
+++ ng/bug-export.rnc	2011-10-04 10:43:34 +0000
@@ -0,0 +1,97 @@
+default namespace = "https://launchpad.net/xmlns/2006/bugs"
+
+start = lpbugs
+
+# Data types
+
+boolean = "True" | "False"
+lpname = xsd:string { pattern = "[a-z0-9][a-z0-9\+\.\-]*" }
+cvename = xsd:string { pattern = "(19|20)[0-9][0-9]-[0-9][0-9][0-9][0-9]" }
+
+# XXX: jamesh 2006-04-11 bug=105401:
+# These status and importance values need to be kept in sync with the
+# rest of Launchpad.  However, there are not yet any tests for this.
+#     https://bugs.launchpad.net/bugs/105401
+status = (
+  "NEW"          |
+  "INCOMPLETE"   |
+  "INVALID"      |
+  "WONTFIX"      |
+  "CONFIRMED"    |
+  "TRIAGED"      |
+  "INPROGRESS"   |
+  "FIXCOMMITTED" |
+  "FIXRELEASED"  |
+  "UNKNOWN")
+importance = (
+  "UNKNOWN"   |
+  "CRITICAL"  |
+  "HIGH"      |
+  "MEDIUM"    |
+  "LOW"       |
+  "WISHLIST"  |
+  "UNDECIDED")
+
+# Content model for a person element.  The element content is the
+# person's name.  For successful bug import, an email address must be
+# provided.
+person = (
+  attribute name { lpname }?,
+  attribute email { text }?,
+  text)
+
+lpbugs = element launchpad-bugs { bug* }
+
+bug = element bug {
+  attribute id { xsd:integer } &
+  element private { boolean }? &
+  element security_related { boolean }? &
+  element duplicateof { xsd:integer }? &
+  element datecreated { xsd:dateTime } &
+  element nickname { lpname }? &
+  # The following will likely be renamed summary in a future version.
+  element title { text } &
+  element description { text } &
+  element reporter { person } &
+  element status { status } &
+  element importance { importance } &
+  element milestone { lpname }? &
+  element assignee { person }? &
+  element urls {
+    element url { attribute href { xsd:anyURI }, text }*
+  }? &
+  element cves {
+    element cve { cvename }*
+  }? &
+  element tags {
+    element tag { lpname }*
+  }? &
+  element bugwatches {
+    element bugwatch { attribute href { xsd:anyURI } }*
+  }? &
+  element subscriptions {
+    element subscriber { person }*
+  }? &
+  comment+
+}
+
+# A bug has one or more comments.  The first comment duplicates the
+# reporter, datecreated, title, description of the bug.
+comment = element comment {
+  element sender { person } &
+  element date { xsd:dateTime } &
+  element title { text }? &
+  element text { text } &
+  attachment*
+}
+
+# A bug attachment.  Attachments are associated with a bug comment.
+attachment = element attachment {
+  attribute href { xsd:anyURI }? &
+  element type { "PATCH" | "UNSPECIFIED" }? &
+  element filename { text }? &
+  # The following will likely be renamed summary in a future version.
+  element title { text }? &
+  element mimetype { text }? &
+  element contents { xsd:base64Binary }
+}

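Review note on tags: the schema above requires every tag to match lpname,
which must start with a lowercase letter or digit. The sketch below repeats
the _tagify transformation from convert_sf_bugs.py as a standalone function
and checks its output against that pattern. The names tagify, is_valid_tag
and LPNAME are assumptions made for this illustration.

```python
import re

# The lpname pattern from ng/bug-export.rnc; XSD patterns match the whole
# string, hence fullmatch() below
LPNAME = re.compile(r"[a-z0-9][a-z0-9+.\-]*")


def tagify(name):
    """Standalone copy of the per-name steps in _tagify (hypothetical name)."""
    name = name.replace("C++", "cplusplus")  # special case, as in the diff
    name = re.sub(r"[^A-Za-z0-9.-]+", "-", name.lower())
    return name.strip("-")


def is_valid_tag(tag):
    """True if the tag satisfies the schema's lpname pattern."""
    return LPNAME.fullmatch(tag) is not None


for original in ["C++ Bindings", "v1.0 (beta)", "???", ".NET"]:
    tag = tagify(original)
    print("%r -> %r valid=%s" % (original, tag, is_valid_tag(tag)))
```

Names that reduce to an empty string or that begin with "." would fail
schema validation, so it may be worth checking the category and group names
in a project dump before running the import.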
=== added file 'ng/sf-user-map.rnc'
--- ng/sf-user-map.rnc	1970-01-01 00:00:00 +0000
+++ ng/sf-user-map.rnc	2011-10-04 10:43:34 +0000
@@ -0,0 +1,11 @@
+default namespace = ""
+
+start = projectmembers
+
+projectmembers = element projectmembers { projectmember* }
+
+projectmember = element projectmember {
+  element full_name { text } &
+  element sf_user_name { text } &
+  element email { text }
+}
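
Review note on ng/sf-user-map.rnc: where rnv is not installed, a rough
structural check of a mapping file can be done in Python. This is a sketch
under assumptions and not a replacement for validating against the schema;
check_user_map is a hypothetical name, and the check ignores attributes and
stray text.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("full_name", "sf_user_name", "email")


def check_user_map(root):
    """Return a list of problems; an empty list means the structure looks right.

    check_user_map is a hypothetical helper, not part of the branch.
    """
    if root.tag != "projectmembers":
        return ["root element is <%s>, expected <projectmembers>" % root.tag]
    problems = []
    for idx, member in enumerate(root, 1):
        if member.tag != "projectmember":
            problems.append("child %d is <%s>, expected <projectmember>"
                            % (idx, member.tag))
            continue
        tags = [child.tag for child in member]
        # Each required element must appear exactly once, in any order
        for name in REQUIRED:
            if tags.count(name) != 1:
                problems.append("projectmember %d needs exactly one <%s>"
                                % (idx, name))
        for tag in sorted(set(tags) - set(REQUIRED)):
            problems.append("projectmember %d has unexpected <%s>" % (idx, tag))
    return problems


sample = ET.fromstring(
    "<projectmembers><projectmember>"
    "<full_name>Homer J. Simpson</full_name>"
    "<sf_user_name>donut_lover</sf_user_name>"
    "</projectmember></projectmembers>")
print(check_user_map(sample))
```

The sample deliberately omits the email element, so the check reports one
problem for the first projectmember.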