launchpad-reviewers team mailing list archive

Thread
Date

[Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad

To: mp+243751@xxxxxxxxxxxxxxxxxx,William Grant <me@xxxxxxxxxxxxxxxxxx>
From: Bayard 'kit' Randel <kit.randel@xxxxxxxxxxxxx>
Date: Fri, 05 Dec 2014 02:11:25 -0000
Reply-to: mp+243751@xxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Bayard 'kit' Randel has proposed merging lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad.

Commit message:
Check for inline comments associated with patch headers, hunk headers (hunk context lines) and hunk lines, and discard irrelevant bits in branch merge proposal mail.

Requested reviews:
William Grant (wgrant): code
Related bugs:
Bug #1334577 in Launchpad itself: "mails reporting inline diff comments are too verbose"
https://bugs.launchpad.net/launchpad/+bug/1334577

For more details, see:
https://code.launchpad.net/~blr/launchpad/bug-1334577-verbose-diff/+merge/243751

Summary
Emails notifications of inline comments on a merge proposal are currently sent with the entirety of the diff, which is needlessly verbose and can make it difficult to find the relevant code. This branch provides a re-implementation of the code responsible for this and discards patches and diff hunks that do not have associated inline comments.

Proposed fix
Check for comments associated with patch headers, hunk headers (hunk context lines) and hunk lines, and discard irrelevant bits.

Implementation details
Currently the diff text is represented as a simple list. In order to have a structural representation of the diff, parse the diff text with bzrlib.patches. Ideally bzrlib.patches would provide line numbers, but as it does not wgrant and blr decided a less intrusive approach would be better than submitting a MP for bzrlib, even if it makes the mailer code more verbose.

LOC Rationale
More lines of code here to reduce lines of code elsewhere (email!). I think we're breaking even.

Tests
./bin/test -c -m lp.code.mail.tests.test_codereviewcomment

Demo and Q/A
Running the tests locally is the best option, unless you have an MTA configured in your dev environment. I'm hoping we can test this with genuine emails on staging?

Lint
Lint free
--
Your team Launchpad code reviewers is subscribed to branch lp:launchpad.

=== modified file 'lib/lp/code/mail/codereviewcomment.py'
--- lib/lp/code/mail/codereviewcomment.py	2014-05-21 08:52:53 +0000
+++ lib/lp/code/mail/codereviewcomment.py	2014-12-05 02:10:28 +0000
@@ -10,6 +10,7 @@
     'CodeReviewCommentMailer',
     ]
 
+from bzrlib import patches
 
 from zope.component import getUtility
 from zope.security.proxy import removeSecurityProxy
@@ -169,18 +170,90 @@
                 content, content_type=content_type, filename=filename)
 
 
+def comment_in_hunk(hunk, comments, line_count):
+    """ Check if comment exists in hunk lines. """
+
+    # check comment in context line
+    comment = comments.get(str(line_count))
+    if comment is not None:
+        return True
+
+    # check comment in hunk lines
+    for line in hunk.lines:
+        line_count = line_count + 1
+        comment = comments.get(str(line_count))
+        if comment is not None:
+            return True
+    return False
+
+
+def format_comment(comment):
+    """ Returns a list of correctly formatted comment(s). """
+    comment_lines = []
+    if comment is not None:
+        comment_lines.append('')
+        for comment in comment.splitlines():
+            comment_lines.append(u'%s' % comment.decode('utf-8', 'replace'))
+        comment_lines.append('')
+    return comment_lines
+
+
+def format_patch_header(patch):
+    """ Returns a list of correctly formmated patch headers. """
+    patch_header_lines = []
+    for p in patch.get_header().splitlines():
+        patch_header_lines.append('> {0}'.format(p))
+    return patch_header_lines
+
+
 def build_inline_comments_section(comments, diff_text):
-    """Return a formatted text section with contextualized comments."""
+    """ Return a formatted text section with contextualized comments.
+
+    Hunks without comments are skipped to limit verbosity.
+    Comments can be rendered after patch headers, hunk context lines,
+    and hunk lines.
+    """
+    diff_lines = diff_text.splitlines(True)
+    # allow_dirty() will preserve text not conforming to unified diff
+    diff_patches = patches.parse_patches(diff_lines, allow_dirty=True)
     result_lines = []
-    diff_lines = diff_text.splitlines()
-    for num, line in enumerate(diff_lines, 1):
-        result_lines.append(u'> {0}'.format(line.decode('utf-8', 'replace')))
-        comment = comments.get(str(num))
+    line_count = 0
+
+    for patch in diff_patches:
+        header_set = False
+        line_count = line_count + 2  # inc patch headers
+        comment = comments.get(str(line_count))
+
         if comment is not None:
-            result_lines.append('')
-            result_lines.extend(comment.splitlines())
-            result_lines.append('')
-
-    result_text = u'\n'.join(result_lines)
-
+            # comment in patch header, add header and comment
+            result_lines.extend(format_patch_header(patch))
+            result_lines.extend(format_comment(comment))
+            header_set = True
+
+        for hunk in patch.hunks:
+            line_count = line_count + 1  # inc hunk context line
+
+            if comment_in_hunk(hunk, comments, line_count):
+                if not header_set:
+                    result_lines.extend(format_patch_header(patch))
+                    header_set = True
+
+                # add context line (hunk header)
+                result_lines.append(u'> %s' % hunk.get_header().rstrip('\n'))
+
+                # comment for context line (hunk header)
+                comment = comments.get(str(line_count))
+                if comment is not None:
+                    result_lines.extend(format_comment(comment))
+
+                for line in hunk.lines:
+                    line_count = line_count + 1  # inc hunk lines
+                    result_lines.append(u'> %s' % str(
+                        line).rstrip('\n').decode('utf-8', 'replace'))
+                    comment = comments.get(str(line_count))
+                    result_lines.extend(format_comment(comment))
+            else:
+                line_count = line_count + len(hunk.lines)  # inc hunk lines
+
+    result_text = '\n'.join(result_lines).encode('utf-8')
     return '\n\nDiff comments:\n\n%s\n\n' % result_text

=== modified file 'lib/lp/code/mail/tests/test_codereviewcomment.py'
--- lib/lp/code/mail/tests/test_codereviewcomment.py	2014-08-19 04:56:39 +0000
+++ lib/lp/code/mail/tests/test_codereviewcomment.py	2014-12-05 02:10:28 +0000
@@ -239,7 +239,7 @@
         See `build_inline_comments_section` tests for formatting details.
         """
         comment = self.makeCommentWithInlineComments(
-            inline_comments={'2': u'Is this from Planet Earth\xa9 ?'})
+            inline_comments={'2': 'Is this from Pl\u0060net Earth ?'})
         mailer = CodeReviewCommentMailer.forCreation(comment)
         commenter = comment.branch_merge_proposal.registrant
         ctrl = mailer.generateEmail(
@@ -249,16 +249,15 @@
             '',
             'Diff comments:',
             '',
-            "> === zbqvsvrq svyr 'yvo/yc/pbqr/vagresnprf/qvss.cl'",
-            '> --- yvo/yc/pbqr/vagresnprf/qvss.cl      '
-            '2009-10-01 13:25:12 +0000',
-            '',
-            u'Is this from Planet Earth\xa9 ?',
-            '',
-            '> +++ yvo/yc/pbqr/vagresnprf/qvss.cl      '
-            '2010-02-02 15:48:56 +0000'
+            ('> --- yvo/yc/pbqr/vagresnprf/qvss.cl      '
+             '2009-10-01 13:25:12 +0000'),
+            ('> +++ yvo/yc/pbqr/vagresnprf/qvss.cl      '
+             '2010-02-02 15:48:56 +0000'),
+            '',
+            'Is this from Pl\u0060net Earth ?',
+            '',
         ]
-        self.assertEqual(expected_lines, ctrl.body.splitlines()[1:10])
+        self.assertEqual(expected_lines, ctrl.body.splitlines()[1:9])
 
     def makeComment(self, email_message):
         message = getUtility(IMessageSet).fromEmail(email_message.as_string())
@@ -389,6 +388,9 @@
         "@@ -0,0 +1,2 @@\n"
         "+a\n"
         "+b\n"
+        "@@ -1,2 +0,0 @@\n"
+        "-x\n"
+        "-y\n"
         "--- foo\t2009-08-26 15:53:23.000000000 -0400\n"
         "+++ foo\t2009-08-26 15:56:43.000000000 -0400\n"
         "@@ -1,3 +1,4 @@\n"
@@ -408,44 +410,64 @@
         # The inline comments section starts with a 4-lines header
         # (empty lines and title) and ends with an empty line.
         section = self.getSection({}).splitlines()
-        header = section[:5]
+        header = section[:4]
         self.assertEqual(
             ['',
              '',
              'Diff comments:',
-             '',
-             '> --- bar\t2009-08-26 15:53:34.000000000 -0400'],
-            header)
-        footer = section[-2:]
+             ''], header)
+        footer = section[-1:]
         self.assertEqual(
-            ['> +e',
-             ''],
+            [''],
             footer)
 
     def test_single_line_comment(self):
         # The inline comments are correctly contextualized in the diff.
         # and prefixed with '>>> '
-        comments = {'2': u'\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae'}
-        self.assertEqual(
-            ['> +++ bar\t1969-12-31 19:00:00.000000000 -0500',
-             '',
-             u'\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae',
-             '',
-             '> @@ -1,3 +0,0 @@',
-             u'> -\xe5'],
-            self.getSection(comments).splitlines()[5:11])
+        comments = {'2': '\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae'}
+        self.assertEqual(
+            map(unicode, ['> +++ bar\t1969-12-31 19:00:00.000000000 -0500',
+             '',
+             '\u03b4\u03bf\u03ba\u03b9\u03bc\u03ae',
+             '']),
+            self.getSection(comments).splitlines()[5:9])
+
+    def test_commentless_hunks_ignored(self):
+        comments = {'14': 'A comment', '19': 'Another comment'}
+        self.assertEqual(
+            map(unicode, ['> --- baz\t1969-12-31 19:00:00.000000000 -0500',
+             '> +++ baz\t2009-08-26 15:53:57.000000000 -0400',
+             '> @@ -1,2 +0,0 @@',
+             '> -x',
+             '> -y',
+             '',
+             'A comment',
+             '',
+             '> --- foo\t2009-08-26 15:53:23.000000000 -0400',
+             '> +++ foo\t2009-08-26 15:56:43.000000000 -0400',
+             '> @@ -1,3 +1,4 @@',
+             '>  a',
+             '> -b',
+             '',
+             'Another comment',
+             '',
+             '>  c',
+             '> +d',
+             '> +e']),
+            self.getSection(comments).splitlines()[4:23])
 
     def test_multi_line_comment(self):
         # Inline comments with multiple lines are rendered appropriately.
         comments = {'2': 'Foo\nBar'}
         self.assertEqual(
-            ['> +++ bar\t1969-12-31 19:00:00.000000000 -0500',
+            map(unicode, ['> --- bar\t2009-08-26 15:53:34.000000000 -0400',
+             '> +++ bar\t1969-12-31 19:00:00.000000000 -0500',
              '',
              'Foo',
              'Bar',
              '',
-             '> @@ -1,3 +0,0 @@'],
-            self.getSection(comments).splitlines()[5:11])
+             '']),
+            self.getSection(comments).splitlines()[4:])
 
     def test_multiple_comments(self):
         # Multiple inline comments are redered appropriately.

Follow ups

[Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad
From: noreply, 2015-07-01
Re: [Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad
From: Colin Watson, 2015-06-29
Re: [Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad
From: Bayard 'kit' Randel, 2015-06-29
Re: [Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad
From: Colin Watson, 2015-06-29
Re: [Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad
From: Colin Watson, 2015-06-23
Re: [Merge] lp:~blr/launchpad/bug-1334577-verbose-diff into lp:launchpad
From: William Grant, 2014-12-05