← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~abentley/launchpad/no-dedup into lp:launchpad

 

Aaron Bentley has proposed merging lp:~abentley/launchpad/no-dedup into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #595166 in Launchpad itself: "IntegrityError raised filing a bug using the email interface"
  https://bugs.launchpad.net/launchpad/+bug/595166

For more details, see:
https://code.launchpad.net/~abentley/launchpad/no-dedup/+merge/63044

= Summary =
Fix bug #595166: IntegrityError raised filing a bug using the email interface

== Proposed fix ==
Stop detecting duplicates in MessageSet.fromEmail().

== Pre-implementation notes ==
Discussed with lifeless.

== Implementation details ==
Messages were only considered duplicates if their raw text matched.  This was a corner case, because two headers almost always change:
1. The Received header added by our MTA varies by time
2. The X-Launchpad-Original-To header added by our MTA varies by addressee

In these corner cases, fromEmail could return an existing message, and this would cause this oops.  So the simplest solution is to turn off the corner case.

== Tests ==
bin/test test_message -t fromEmail_always_creates -vv

== Demo and Q/A ==
Due to the race condition involved in trying to reproduce the Received header, this is effectively untestable.

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/services/messages/model/message.py
  lib/lp/services/messages/tests/test_message.py
-- 
https://code.launchpad.net/~abentley/launchpad/no-dedup/+merge/63044
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~abentley/launchpad/no-dedup into lp:launchpad.
=== modified file 'lib/lp/services/messages/model/message.py'
--- lib/lp/services/messages/model/message.py	2011-05-12 21:33:10 +0000
+++ lib/lp/services/messages/model/message.py	2011-05-31 21:06:47 +0000
@@ -146,7 +146,7 @@
         """See IMessage."""
         if self.title.lower().startswith('re: '):
             return self.title
-        return 'Re: '+self.title
+        return 'Re: ' + self.title
 
     @property
     def title(self):
@@ -201,7 +201,7 @@
         []
     """
     for name in ['In-Reply-To', 'References']:
-        if parsed_message.has_key(name):
+        if name in parsed_message:
             return parsed_message.get(name).split()
 
     return []
@@ -302,30 +302,9 @@
             raise InvalidEmailMessage('Msg %s size %d exceeds limit %d' % (
                 rfc822msgid, len(email_message), MAX_EMAIL_SIZE))
 
-        # Handle duplicate Message-Id
-        # XXX kiko 2005-08-03: shouldn't we be using DuplicateMessageId here?
-        try:
-            existing_msgs = self.get(rfc822msgid=rfc822msgid)
-        except LookupError:
-            pass
-        else:
-            # we are now allowing multiple msgs in the db with the same
-            # rfc822 msg-id to allow for variations in headers and,
-            # potentially, content. so we scan through the results to try
-            # and find one that matches,
-            for existing in existing_msgs:
-                existing_raw = existing.raw.read()
-                if email_message == existing_raw:
-                    return existing
-                # ok, this is an interesting situation. we have a new
-                # message with the same rfc822 msg-id as an existing message
-                # in the database, but the message headers and/or content
-                # are different. For the moment, we have chosen to allow
-                # this, but in future we may want to flag it in some way
-
         # Stuff a copy of the raw email into the Librarian, if it isn't
         # already in there.
-        file_alias_set = getUtility(ILibraryFileAliasSet) # Reused later
+        file_alias_set = getUtility(ILibraryFileAliasSet)  # Reused later
         if filealias is None:
             # We generate a filename to avoid people guessing the URL.
             # We don't want URLs to private bug messages to be guessable
@@ -536,6 +515,7 @@
     def threadMessages(klass, messages):
         """See `IMessageSet`."""
         result, roots = klass._parentToChild(messages)
+
         def get_children(node):
             children = []
             for child in result[node]:

=== modified file 'lib/lp/services/messages/tests/test_message.py'
--- lib/lp/services/messages/tests/test_message.py	2011-05-13 17:08:32 +0000
+++ lib/lp/services/messages/tests/test_message.py	2011-05-31 21:06:47 +0000
@@ -4,7 +4,6 @@
 __metaclass__ = type
 
 from cStringIO import StringIO
-from doctest import DocTestSuite
 from email.Message import Message
 from email.MIMEMultipart import MIMEMultipart
 from email.MIMEText import MIMEText
@@ -60,7 +59,7 @@
         expected = {
             message1: [message2, message3],
             message2: [message4],
-            message3: [], message4:[]}
+            message3: [], message4: []}
         result, roots = MessageSet._parentToChild(messages)
         self.assertEqual(expected, result)
         self.assertEqual([message1], roots)
@@ -141,6 +140,15 @@
         transaction.commit()
         self.assertEqual('This is the diff, honest.', diff.blob.read())
 
+    def test_fromEmail_always_creates(self):
+        """Even when messages are identical, fromEmail creates a new one."""
+        email = self.factory.makeEmailMessage()
+        orig_message = MessageSet().fromEmail(email.as_string())
+        # update librarian
+        transaction.commit()
+        dupe_message = MessageSet().fromEmail(email.as_string())
+        self.assertNotEqual(orig_message.id, dupe_message.id)
+
 
 class TestMessageJob(TestCaseWithFactory):
     """Tests for MessageJob."""