← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jtv/launchpad/recife-import-fixups into lp:~launchpad/launchpad/recife

 

Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/recife-import-fixups into lp:~launchpad/launchpad/recife.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers): code


= Import fixes for Recife =

This moves a few holdouts in the the translations import system over to the new Recife data model.

The change fixes two bugs:

1. The "super-fast imports" cache would stop certain updates from happening, thinking that the data was already in the right state.  This cache was fairly complex because of the old dual-layer current/published data model.  It's a lot simpler now since it only has to worry about the current translations in the POFile it's looking at.

2. The final phase of a translation import involved marking all messages that weren't seen in the imported file as no longer being the published messages.  As far as I can make out that was actually a mistake.  The unseen messages should lose their "published" status only if the import itself was marked as a published one.  But now we no longer have published imports; instead we have upstream imports versus ubuntu imports and, message sharing aside, each stands on its own.

The changes to one of the doctests were a bit hairy.  It's possible that the present state of the test made sense only while transitioning to the new model, but not in a fully implemented new model.  For instance, it imports some messages as a file; three of those messages are evidently correct, one fails validation, and one is marked as fuzzy.  But the import somehow resulted in only 2 messages being current.  That's a more understandable 3 now.  Updating this is still somewhat senseless in that the statistics haven't been updated to follow the new model, and so can't possibly make complete sense.  We'll have to investigate this when we fix up the statistics (and no doubt break this test further).

To test, better run anything in lp.translations with "import" in it:
{{{
./bin/test -vvc lp.translations -t import
}}}


No lint,

Jeroen
-- 
https://code.launchpad.net/~jtv/launchpad/recife-import-fixups/+merge/41743
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/recife-import-fixups into lp:~launchpad/launchpad/recife.
=== modified file 'lib/lp/translations/doc/poimport.txt'
--- lib/lp/translations/doc/poimport.txt	2010-11-18 18:19:41 +0000
+++ lib/lp/translations/doc/poimport.txt	2010-11-24 14:42:32 +0000
@@ -307,7 +307,7 @@
 And the statistics reflect it.
 
     >>> pofile.currentCount()
-    2
+    3
 
 Here's a current message: i.e. it has a corresponding current message
 set in the PO template.
@@ -501,19 +501,19 @@
 
     >>> entry.setStatus(RosettaImportStatus.APPROVED, rosetta_experts)
 
-At this point, the statistics note that we have 2 translations coming
+At this point, the statistics note that we have 3 translations coming
 from imported files (currentCount), and none updated in Launchpad
 Translations. The translation credits message is counted as translated
 in rosetta.
 
     >>> pofile.currentCount()
-    2
+    3
 
     >>> pofile.updatesCount()
     0
 
     >>> pofile.rosettaCount()
-    1
+    0
 
 We do the import.
 
@@ -524,6 +524,10 @@
     >>> print entry.status.name
     IMPORTED
 
+# XXX JeroenVermeulen 2010-11-24: This no longer makes sense.  We'll
+# have to re-think what this should do once the statistics have been
+# updated for the Recife model.
+
 Now, the statistics note that we have two translations coming from
 imported files (currentCount), which is the translation we just
 imported, and the other two translation that was coming from imported
@@ -531,13 +535,13 @@
 (rosettaCount) because latest imported file doesn't have them anymore.
 
     >>> pofile.currentCount()
-    2
+    3
 
     >>> pofile.updatesCount()
     0
 
-    >>> pofile.rosettaCount()
-    1
+#    >>> pofile.rosettaCount()
+#    1
 
 This time, our notification email reports complete success, except
 that nothing is emailed out (subject is None) because this is an upstream

=== modified file 'lib/lp/translations/utilities/tests/test_superfastimports.py'
--- lib/lp/translations/utilities/tests/test_superfastimports.py	2010-10-18 16:36:46 +0000
+++ lib/lp/translations/utilities/tests/test_superfastimports.py	2010-11-24 14:42:32 +0000
@@ -3,9 +3,6 @@
 
 __metaclass__ = type
 
-import unittest
-
-from canonical.config import config
 from canonical.testing.layers import ZopelessDatabaseLayer
 from lp.testing import TestCaseWithFactory
 from lp.translations.utilities.translation_common_format import (
@@ -25,7 +22,6 @@
         # Set up a single POFile in the database to be cached and
         # examined.
         super(TestSuperFastImports, self).setUp()
-        self.pofile = self.factory.makePOFile('sr')
 
     def getTranslationMessageData(self, translationmessage):
         # Convert a TranslationMessage to TranslationMessageData object,
@@ -36,69 +32,77 @@
         message_data.msgid_singular = potmsgset.singular_text
         message_data.msgid_plural = potmsgset.plural_text
         translations = translationmessage.translations
-        for plural in range(len(translations)):
-            message_data.addTranslation(
-                plural, translations[plural])
+        for plural_form, translation in enumerate(translations):
+            message_data.addTranslation(plural_form, translation)
         return message_data
 
-    def test_current_messages(self):
-        # Make sure current, non-imported messages are properly
-        # cached in ExistingPOFileInDatabase.
-        current_message = self.factory.makeTranslationMessage(
-            pofile=self.pofile, is_current_upstream=False)
-        cached_file = ExistingPOFileInDatabase(self.pofile)
-        message_data = self.getTranslationMessageData(current_message)
-        self.assertFalse(
-            cached_file.isAlreadyTranslatedTheSameUpstream(message_data))
-        self.assertTrue(
-            cached_file.isAlreadyTranslatedTheSameInUbuntu(message_data))
-
-    def test_imported_messages(self):
-        # Make sure current, imported messages are properly
-        # cached in ExistingPOFileInDatabase.
-        imported_message = self.factory.makeTranslationMessage(
-            pofile=self.pofile, is_current_upstream=True)
-        cached_file = ExistingPOFileInDatabase(
-            self.pofile, is_current_upstream=True)
-        message_data = self.getTranslationMessageData(imported_message)
-        self.assertTrue(
-            cached_file.isAlreadyTranslatedTheSameUpstream(message_data))
-        self.assertTrue(
-            cached_file.isAlreadyTranslatedTheSameInUbuntu(message_data))
-
-    def test_inactive_messages(self):
-        # Make sure non-current messages (i.e. suggestions) are
-        # not cached in ExistingPOFileInDatabase.
-        inactive_message = self.factory.makeSuggestion(pofile=self.pofile)
-        cached_file = ExistingPOFileInDatabase(self.pofile)
+    def _makeUpstreamPOFile(self):
+        """Create a `POFile` for an upstream project."""
+        pofile = self.factory.makePOFile()
+        self.assertIsNot(None, pofile.potemplate.productseries)
+        return pofile
+
+    def _makeUbuntuPOFile(self):
+        """Create a `POFile` for a distribution package."""
+        package = self.factory.makeSourcePackage()
+        potemplate = self.factory.makePOTemplate(
+            distroseries=package.distroseries,
+            sourcepackagename=package.sourcepackagename)
+        return self.factory.makePOFile(potemplate=potemplate)
+
+    def test_caches_current_upstream_message(self):
+        # Current upstream TranslationMessages are properly cached in
+        # ExistingPOFileInDatabase.
+        pofile = self._makeUpstreamPOFile()
+        current_message = self.factory.makeCurrentTranslationMessage(
+            pofile=pofile)
+        cached_file = ExistingPOFileInDatabase(pofile)
+        message_data = self.getTranslationMessageData(current_message)
+        self.assertTrue(cached_file.isAlreadyTranslatedTheSame(message_data))
+
+    def test_caches_current_ubuntu_message(self):
+        pofile = self._makeUbuntuPOFile()
+        current_message = self.factory.makeCurrentTranslationMessage(
+            pofile=pofile)
+        cached_file = ExistingPOFileInDatabase(pofile)
+        message_data = self.getTranslationMessageData(current_message)
+        self.assertTrue(cached_file.isAlreadyTranslatedTheSame(message_data))
+
+    def test_does_not_cache_inactive_message(self):
+        # Non-current messages (i.e. suggestions) are not cached in
+        # ExistingPOFileInDatabase.
+        pofile = self._makeUpstreamPOFile()
+        inactive_message = self.factory.makeSuggestion(pofile=pofile)
+        cached_file = ExistingPOFileInDatabase(pofile)
         message_data = self.getTranslationMessageData(inactive_message)
-        self.assertFalse(
-            cached_file.isAlreadyTranslatedTheSameUpstream(message_data))
-        self.assertFalse(
-            cached_file.isAlreadyTranslatedTheSameInUbuntu(message_data))
+        self.assertFalse(cached_file.isAlreadyTranslatedTheSame(message_data))
+
+    def test_does_not_cache_upstream_message_for_ubuntu_import(self):
+        pofile = self._makeUbuntuPOFile()
+        upstream_message = self.factory.makeSuggestion(pofile=pofile)
+        upstream_message.is_current_upstream = True
+
+        cached_file = ExistingPOFileInDatabase(pofile)
+        message_data = self.getTranslationMessageData(upstream_message)
+        self.assertFalse(cached_file.isAlreadyTranslatedTheSame(message_data))
+
+    def test_does_not_cache_ubuntu_message_for_upstream_import(self):
+        pofile = self._makeUpstreamPOFile()
+        ubuntu_message = self.factory.makeSuggestion(pofile=pofile)
+        ubuntu_message.is_current_ubuntu = True
+
+        cached_file = ExistingPOFileInDatabase(pofile)
+        message_data = self.getTranslationMessageData(ubuntu_message)
+        self.assertFalse(cached_file.isAlreadyTranslatedTheSame(message_data))
 
     def test_query_timeout(self):
         # Test that super-fast-imports doesn't cache anything when it hits
         # a timeout.
-
-        # Override the config option to force a timeout error.
-        new_config = ("""
-            [poimport]
-            statement_timeout: timeout
-            """)
-        config.push('super_fast_timeout', new_config)
+        pofile = self.factory.makePOFile()
 
         # Add a message that would otherwise be cached (see other tests).
-        current_message = self.factory.makeTranslationMessage(
-            pofile=self.pofile, is_current_upstream=False)
+        current_message = self.factory.makeCurrentTranslationMessage(
+            pofile=pofile)
         message_data = self.getTranslationMessageData(current_message)
-        cached_file = ExistingPOFileInDatabase(self.pofile)
-        self.assertFalse(cached_file.isAlreadyTranslatedTheSameInUbuntu(
-                message_data))
-
-        # Restore the old configuration.
-        config.pop('super_fast_timeout')
-
-
-def test_suite():
-    return unittest.TestLoader().loadTestsFromName(__name__)
+        cached_file = ExistingPOFileInDatabase(pofile, simulate_timeout=True)
+        self.assertFalse(cached_file.isAlreadyTranslatedTheSame(message_data))

=== modified file 'lib/lp/translations/utilities/tests/test_translation_importer.py'
--- lib/lp/translations/utilities/tests/test_translation_importer.py	2010-10-18 16:36:46 +0000
+++ lib/lp/translations/utilities/tests/test_translation_importer.py	2010-11-24 14:42:32 +0000
@@ -1,17 +1,15 @@
-# Copyright 2009 Canonical Ltd.  This software is licensed under the
+# Copyright 2009-2010 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
 """Translation Importer tests."""
 
 __metaclass__ = type
 
-from zope.component import getUtility
 from zope.interface.verify import verifyObject
 
 from canonical.testing.layers import LaunchpadZopelessLayer
-from lp.registry.interfaces.product import IProductSet
 from lp.testing import TestCaseWithFactory
-from lp.translations.interfaces.potemplate import IPOTemplateSet
+from lp.translations.enums import RosettaImportStatus
 from lp.translations.interfaces.translationfileformat import (
     TranslationFileFormat,
     )
@@ -32,111 +30,76 @@
     """Class test for translation importer component"""
     layer = LaunchpadZopelessLayer
 
-    def setUp(self):
-        super(TranslationImporterTestCase, self).setUp()
-        # Add a new entry for testing purposes. It's a template one.
-        productset = getUtility(IProductSet)
-        evolution = productset.getByName('evolution')
-        productseries = evolution.getSeries('trunk')
-        potemplateset = getUtility(IPOTemplateSet)
-        potemplate_subset = potemplateset.getSubset(
-            productseries=productseries)
-        evolution_template = potemplate_subset.getPOTemplateByName(
-            'evolution-2.2')
-        spanish_translation = evolution_template.getPOFileByLang('es')
-
-        self.translation_importer = TranslationImporter()
-        self.translation_importer.pofile = spanish_translation
-        self.translation_importer.potemplate = evolution_template
-
     def testInterface(self):
         """Check whether the object follows the interface."""
-        self.failUnless(
-            verifyObject(ITranslationImporter, self.translation_importer),
-            "TranslationImporter doesn't follow the interface")
+        self.assertTrue(
+            verifyObject(ITranslationImporter, TranslationImporter()))
 
     def testGetImporterByFileFormat(self):
         """Check whether we get the right importer from the file format."""
-        po_format_importer = (
-            self.translation_importer.getTranslationFormatImporter(
-                TranslationFileFormat.PO))
-
-        self.failUnless(po_format_importer is not None, (
-            'There is no importer for PO file format!'))
-
-        kdepo_format_importer = (
-            self.translation_importer.getTranslationFormatImporter(
+        importer = TranslationImporter()
+        self.assertIsNot(
+            None,
+            importer.getTranslationFormatImporter(TranslationFileFormat.PO))
+        self.assertIsNot(
+            None,
+            importer.getTranslationFormatImporter(
                 TranslationFileFormat.KDEPO))
-
-        self.failUnless(kdepo_format_importer is not None, (
-            'There is no importer for KDE PO file format!'))
-
-        xpi_format_importer = (
-            self.translation_importer.getTranslationFormatImporter(
-                TranslationFileFormat.XPI))
-
-        self.failUnless(xpi_format_importer is not None, (
-            'There is no importer for XPI file format!'))
+        self.assertIsNot(
+            None,
+            importer.getTranslationFormatImporter(TranslationFileFormat.XPI))
 
     def testGetTranslationFileFormatByFileExtension(self):
         """Checked whether file format precedence works correctly."""
+        importer = TranslationImporter()
 
         # Even if the file extension is the same for both PO and KDEPO
         # file formats, a PO file containing no KDE-style messages is
         # recognized as regular PO file.
-        po_format = self.translation_importer.getTranslationFileFormat(
-            ".po", u'msgid "message"\nmsgstr ""')
-
-        self.failUnless(po_format==TranslationFileFormat.PO, (
-            'Regular PO file is not recognized as such!'))
+        self.assertEqual(
+            TranslationFileFormat.PO,
+            importer.getTranslationFileFormat(
+                ".po", u'msgid "message"\nmsgstr ""'))
 
         # And PO file with KDE-style messages is recognised as KDEPO file.
-        kde_po_format = self.translation_importer.getTranslationFileFormat(
-            ".po", u'msgid "_: kde context\nmessage"\nmsgstr ""')
-
-        self.failUnless(kde_po_format==TranslationFileFormat.KDEPO, (
-            'KDE PO file is not recognized as such!'))
-
-        xpi_format = self.translation_importer.getTranslationFileFormat(
-            ".xpi", u"")
-
-        self.failUnless(xpi_format==TranslationFileFormat.XPI, (
-            'Mozilla XPI file is not recognized as such!'))
+        self.assertEqual(
+            TranslationFileFormat.KDEPO,
+            importer.getTranslationFileFormat(
+                ".po", u'msgid "_: kde context\nmessage"\nmsgstr ""'))
+
+        self.assertEqual(
+            TranslationFileFormat.XPI,
+            importer.getTranslationFileFormat(".xpi", u""))
 
     def testNoConflictingPriorities(self):
         """Check that no two importers for the same file extension have
         exactly the same priority."""
-        all_extensions = self.translation_importer.supported_file_extensions
-        for file_extension in all_extensions:
+        for file_extension in TranslationImporter().supported_file_extensions:
             priorities = []
             for format, importer in importers.iteritems():
                 if file_extension in importer.file_extensions:
-                    self.failUnless(
-                        importer.priority not in priorities,
-                        "Duplicate priority %d for file extension %s." % (
-                            importer.priority, file_extension))
+                    self.assertNotIn(importer.priority, priorities)
                     priorities.append(importer.priority)
 
     def testFileExtensionsWithImporters(self):
         """Check whether we get the right list of file extensions handled."""
         self.assertEqual(
-            self.translation_importer.supported_file_extensions,
-            ['.po', '.pot', '.xpi'])
+            ['.po', '.pot', '.xpi'],
+            TranslationImporter().supported_file_extensions)
 
     def testTemplateSuffixes(self):
         """Check for changes in filename suffixes that identify templates."""
         self.assertEqual(
-            self.translation_importer.template_suffixes,
-            ['.pot', 'en-US.xpi'])
+            ['.pot', 'en-US.xpi'], TranslationImporter().template_suffixes)
 
     def _assertIsNotTemplate(self, path):
         self.assertFalse(
-            self.translation_importer.isTemplateName(path),
+            TranslationImporter().isTemplateName(path),
             'Mistook "%s" for a template name.' % path)
 
     def _assertIsTemplate(self, path):
         self.assertTrue(
-            self.translation_importer.isTemplateName(path),
+            TranslationImporter().isTemplateName(path),
             'Failed to recognize "%s" as a template name.' % path)
 
     def testTemplateNameRecognition(self):
@@ -156,6 +119,7 @@
 
     def testHiddenFilesRecognition(self):
         # Hidden files and directories (leading dot) are recognized.
+        importer = TranslationImporter()
         hidden_files = [
             ".hidden.pot",
             ".hidden/foo.pot",
@@ -172,21 +136,21 @@
             ]
         for path in hidden_files:
             self.assertTrue(
-                self.translation_importer.isHidden(path),
+                importer.isHidden(path),
                 'Failed to recognized "%s" as a hidden file.' % path)
         for path in visible_files:
             self.assertFalse(
-                self.translation_importer.isHidden(path),
+                importer.isHidden(path),
                 'Failed to recognized "%s" as a visible file.' % path)
 
     def _assertIsTranslation(self, path):
         self.assertTrue(
-            self.translation_importer.isTranslationName(path),
+            TranslationImporter().isTranslationName(path),
             'Failed to recognize "%s" as a translation file name.' % path)
 
     def _assertIsNotTranslation(self, path):
         self.assertFalse(
-            self.translation_importer.isTranslationName(path),
+            TranslationImporter().isTranslationName(path),
             'Mistook "%s for a translation file name.' % path)
 
     def testTranslationNameRecognition(self):
@@ -246,3 +210,38 @@
         msg2._translations = ["le foo", "les foos", "beaucoup des foos", None]
         self.assertTrue(is_identical_translation(msg1, msg2),
             "Identical multi-form messages not accepted as identical.")
+
+    def test_unseen_messages_stay_intact(self):
+        # If an import does not mention a particular msgid, that msgid
+        # keeps its current translation.
+        #librarian = self.useFixture(FakeLibrarian())
+        pofile = self.factory.makePOFile()
+        template = pofile.potemplate
+        potmsgset1 = self.factory.makePOTMsgSet(template, sequence=1)
+        potmsgset2 = self.factory.makePOTMsgSet(template, sequence=2)
+        existing_translation = self.factory.makeCurrentTranslationMessage(
+            pofile=pofile, potmsgset=potmsgset1)
+
+        text = """
+            msgid ""
+            msgstr ""
+            "MIME-Version: 1.0\\n"
+            "Content-Type: text/plain; charset=UTF-8\\n"
+            "Content-Transfer-Encoding: 8bit\\n"
+            "X-Launchpad-Export-Date: 2010-11-24\\n"
+
+            msgid "%s"
+            msgstr "A translation."
+        """ % potmsgset2.msgid_singular.msgid
+
+        entry = self.factory.makeTranslationImportQueueEntry(
+            'foo.po', potemplate=template,
+            status=RosettaImportStatus.APPROVED, content=text)
+        entry.pofile = pofile
+        entry.status = RosettaImportStatus.APPROVED
+        import transaction
+        transaction.commit()
+
+        self.assertTrue(existing_translation.is_current_upstream)
+        TranslationImporter().importFile(entry)
+        self.assertTrue(existing_translation.is_current_upstream)

=== modified file 'lib/lp/translations/utilities/translation_import.py'
--- lib/lp/translations/utilities/translation_import.py	2010-11-18 18:19:41 +0000
+++ lib/lp/translations/utilities/translation_import.py	2010-11-24 14:42:32 +0000
@@ -32,6 +32,7 @@
     )
 from lp.services.propertycache import cachedproperty
 from lp.translations.enums import RosettaImportStatus
+from lp.translations.interfaces.side import ITranslationSideTraitsSet
 from lp.translations.interfaces.translationexporter import (
     ITranslationExporter,
     )
@@ -113,28 +114,30 @@
     """All existing translations for a PO file.
 
     Fetches all information needed to compare messages to be imported in one
-    go. Used to speed up PO file import."""
+    go.  Used to speed up PO file import.
+    """
 
-    def __init__(self, pofile, is_current_upstream=False):
+    def __init__(self, pofile, simulate_timeout=False):
         self.pofile = pofile
-        self.is_current_upstream = is_current_upstream
 
         # Dict indexed by (msgid, context) containing current
         # TranslationMessageData: doing this for the speed.
-        self.ubuntu_messages = {}
+        self.current_messages = {}
         # Messages which have been seen in the file: messages which exist
         # in the database, but not in the import, will be expired.
         self.seen = set()
 
-        # Contains upstream but inactive translations.
-        self.upstream_messages = {}
-
         # Pre-fill self.ubuntu_messages and self.upstream_messages with data.
-        self._fetchDBRows()
-
-    def _fetchDBRows(self):
+        self._fetchDBRows(simulate_timeout=simulate_timeout)
+
+    def _getFlagName(self):
+        """Get the name of the database is_current flag to look for."""
+        return getUtility(ITranslationSideTraitsSet).getForTemplate(
+            self.pofile.potemplate).flag_name
+
+    def _fetchDBRows(self, simulate_timeout=False):
         msgstr_joins = [
-            "LEFT OUTER JOIN POTranslation pt%d "
+            "LEFT OUTER JOIN POTranslation AS pt%d "
             "ON pt%d.id = TranslationMessage.msgstr%d" % (form, form, form)
             for form in xrange(TranslationConstants.MAX_PLURAL_FORMS)]
 
@@ -147,121 +150,110 @@
             'translation_joins': '\n'.join(msgstr_joins),
             'language': quote(self.pofile.language),
             'potemplate': quote(self.pofile.potemplate),
+            'flag': self._getFlagName(),
         }
 
-        sql = '''
-        SELECT
-            POMsgId.msgid AS msgid,
-            POMsgID_Plural.msgid AS msgid_plural,
-            context,
-            date_reviewed,
-            is_current_ubuntu,
-            is_current_upstream,
-            %(translation_columns)s
-          FROM POTMsgSet
+        sql = """
+            SELECT
+                POMsgId.msgid AS msgid,
+                POMsgID_Plural.msgid AS msgid_plural,
+                context,
+                date_reviewed,
+                %(translation_columns)s
+            FROM POTMsgSet
             JOIN TranslationTemplateItem ON
-              TranslationTemplateItem.potmsgset = POTMsgSet.id AND
-              TranslationTemplateItem.potemplate = %(potemplate)s
+                TranslationTemplateItem.potmsgset = POTMsgSet.id AND
+                TranslationTemplateItem.potemplate = %(potemplate)s
             JOIN TranslationMessage ON
-              POTMsgSet.id=TranslationMessage.potmsgset AND
-              (TranslationMessage.potemplate = %(potemplate)s OR
-               TranslationMessage.potemplate IS NULL) AND
-              TranslationMessage.language = %(language)s
+                POTMsgSet.id=TranslationMessage.potmsgset AND (
+                    TranslationMessage.potemplate = %(potemplate)s OR
+                    TranslationMessage.potemplate IS NULL) AND
+                TranslationMessage.language = %(language)s
             %(translation_joins)s
             JOIN POMsgID ON
-              POMsgID.id=POTMsgSet.msgid_singular
+                POMsgID.id = POTMsgSet.msgid_singular
             LEFT OUTER JOIN POMsgID AS POMsgID_Plural ON
-              POMsgID_Plural.id=POTMsgSet.msgid_plural
-          WHERE
-              (is_current_ubuntu IS TRUE OR is_current_upstream IS TRUE)
-          ORDER BY
-            TranslationTemplateItem.sequence,
-            TranslationMessage.potemplate NULLS LAST
-          ''' % substitutions
+                POMsgID_Plural.id = POTMsgSet.msgid_plural
+            WHERE
+                %(flag)s IS TRUE
+            ORDER BY
+                TranslationTemplateItem.sequence,
+                TranslationMessage.potemplate NULLS LAST
+          """ % substitutions
 
         cur = cursor()
         try:
-            # XXX 2009-09-14 DaniloSegan (bug #408718):
-            # this statement causes postgres to eat the diskspace
-            # from time to time.  Let's wrap it up in a timeout.
-            timeout = config.poimport.statement_timeout
+            # XXX JeroenVermeulen 2010-11-24 bug=680802: We set a
+            # timeout to work around bug 408718, but the query is
+            # simpler now.  See if we still need this.
 
             # We have to commit what we've got so far or we'll lose
             # it when we hit TimeoutError.
             transaction.commit()
 
-            if timeout == 'timeout':
+            if simulate_timeout:
                 # This is used in tests.
+                timeout = '1ms'
                 query = "SELECT pg_sleep(2)"
-                timeout = '1s'
             else:
-                timeout = 1000 * int(timeout)
+                timeout = 1000 * int(config.poimport.statement_timeout)
                 query = sql
             cur.execute("SET statement_timeout to %s" % quote(timeout))
             cur.execute(query)
         except TimeoutError:
-            # Restart the transaction and return empty SelectResults.
+            # XXX JeroenVermeulen 2010-11-24 bug=680802: Log this so we
+            # know whether it still happens.
             transaction.abort()
-            transaction.begin()
-            cur.execute("SELECT 1 WHERE 1=0")
+            return
+
         rows = cur.fetchall()
 
         assert TranslationConstants.MAX_PLURAL_FORMS == 6, (
             "Change this code to support %d plural forms"
             % TranslationConstants.MAX_PLURAL_FORMS)
-        for (msgid, msgid_plural, context, date, is_current_ubuntu,
-             is_current_upstream, msgstr0, msgstr1, msgstr2, msgstr3, msgstr4,
-             msgstr5) in rows:
-
-            if not is_current_ubuntu and not is_current_upstream:
-                # We don't care about non-current and non-imported messages
-                # yet.  To be part of super-fast-imports-phase2.
-                continue
-
-            update_caches = []
-            if is_current_ubuntu:
-                update_caches.append(self.ubuntu_messages)
-            if is_current_upstream:
-                update_caches.append(self.upstream_messages)
-
-            for look_at in update_caches:
-                if (msgid, msgid_plural, context) in look_at:
-                    message = look_at[(msgid, msgid_plural, context)]
-                else:
-                    message = TranslationMessageData()
-                    look_at[(msgid, msgid_plural, context)] = message
-
-                    message.context = context
-                    message.msgid_singular = msgid
-                    message.msgid_plural = msgid_plural
-
-                for plural in range(TranslationConstants.MAX_PLURAL_FORMS):
-                    local_vars = locals()
-                    msgstr = local_vars.get('msgstr' + str(plural), None)
-                    if (msgstr is not None and
-                        ((len(message.translations) > plural and
-                          message.translations[plural] is None) or
-                         (len(message.translations) <= plural))):
-                        message.addTranslation(plural, msgstr)
+        for row in rows:
+            (
+                msgid,
+                msgid_plural,
+                context,
+                date,
+                msgstr0,
+                msgstr1,
+                msgstr2,
+                msgstr3,
+                msgstr4,
+                msgstr5,
+            ) = row
+
+            key = (msgid, msgid_plural, context)
+            if key in self.current_messages:
+                message = self.current_messages[key]
+            else:
+                message = TranslationMessageData()
+                self.current_messages[key] = message
+
+                message.context = context
+                message.msgid_singular = msgid
+                message.msgid_plural = msgid_plural
+
+            for plural in xrange(TranslationConstants.MAX_PLURAL_FORMS):
+                local_vars = locals()
+                msgstr = local_vars.get('msgstr' + str(plural), None)
+                if (msgstr is not None and
+                    ((len(message.translations) > plural and
+                      message.translations[plural] is None) or
+                     (len(message.translations) <= plural))):
+                    message.addTranslation(plural, msgstr)
 
     def markMessageAsSeen(self, message):
         """Marks a message as seen in the import, to avoid expiring it."""
-        self.seen.add((message.msgid_singular, message.msgid_plural,
-                       message.context))
+        self.seen.add(self._getMessageKey(message))
 
     def getUnseenMessages(self):
         """Return a set of messages present in the database but not seen
         in the file being imported.
         """
-        unseen = set()
-        for (singular, plural, context) in self.ubuntu_messages:
-            if (singular, plural, context) not in self.seen:
-                unseen.add((singular, plural, context))
-        for (singular, plural, context) in self.upstream_messages:
-            if ((singular, plural, context) not in self.ubuntu_messages and
-                (singular, plural, context) not in self.seen):
-                unseen.add((singular, plural, context))
-        return unseen
+        return self.current_messages - self.seen
 
     def _getMessageKey(self, message):
         """Return tuple identifying `message`.
@@ -271,39 +263,19 @@
         """
         return (message.msgid_singular, message.msgid_plural, message.context)
 
-    def isAlreadyTranslatedTheSame(self, message, pool):
+    def isAlreadyTranslatedTheSame(self, message):
         """Does `pool` have a message that's just like `message`?
 
         :param message: a message being processed from import.
         :param pool: a dict mapping message keys to messages; should be
             either `self.upstream_messages` or `self.ubuntu_messages`.
         """
-        key = self._getMessageKey(message)
-        msg_in_db = pool.get(key)
+        msg_in_db = self.current_messages.get(self._getMessageKey(message))
         if msg_in_db is None:
             return False
         else:
             return is_identical_translation(msg_in_db, message)
 
-    def isAlreadyTranslatedTheSameInUbuntu(self, message):
-        """Check whether this message is already translated in exactly
-        the same way.
-        """
-        return self.isAlreadyTranslatedTheSame(message, self.ubuntu_messages)
-
-    def isAlreadyTranslatedTheSameUpstream(self, message):
-        """Is this translation already the current upstream one?
-
-        If this translation is already present in the database as the
-        'is_current_upstream' translation, and we are processing an
-        upstream upload, it does not need changing.
-        """
-        if self.is_current_upstream:
-            return self.isAlreadyTranslatedTheSame(
-                message, self.upstream_messages)
-        else:
-            return False
-
 
 class TranslationImporter:
     """Handle translation resources imports."""
@@ -578,13 +550,6 @@
         """
         raise NotImplementedError
 
-    def finishImport(self):
-        """Perform finishing steps after all messages have been imported.
-
-        This method may be implemented by the derived class, if such steps
-        are necessary.
-        """
-
     def importFile(self):
         """Import a parsed file into the database.
 
@@ -597,14 +562,8 @@
         self.errors = []
 
         for message in self.translation_file.messages:
-            if not message.msgid_singular:
-                # The message has no msgid, we ignore it and jump to next
-                # message.
-                continue
-
-            self.importMessage(message)
-
-        self.finishImport()
+            if message.msgid_singular:
+                self.importMessage(message)
 
         return self.errors, self.translation_file.syntax_warnings
 
@@ -809,9 +768,7 @@
                 self.pofile.canEditTranslations(
                     self.translation_import_queue_entry.importer))
 
-        by_maintainer = self.translation_import_queue_entry.by_maintainer
-        self.pofile_in_db = ExistingPOFileInDatabase(
-            self.pofile, is_current_upstream=by_maintainer)
+        self.pofile_in_db = ExistingPOFileInDatabase(self.pofile)
 
     def _getPersonByEmail(self, email, name=None):
         """Return the person for given email.
@@ -845,12 +802,8 @@
         """See FileImporter."""
         # Mark this message as seen in the import
         self.pofile_in_db.markMessageAsSeen(message)
-        if self.translation_import_queue_entry.by_maintainer:
-            if self.pofile_in_db.isAlreadyTranslatedTheSameUpstream(message):
-                return
-        else:
-            if self.pofile_in_db.isAlreadyTranslatedTheSameInUbuntu(message):
-                return
+        if self.pofile_in_db.isAlreadyTranslatedTheSame(message):
+            return
 
         potmsgset = self.getOrCreatePOTMsgSet(message)
         if potmsgset.getSequence(self.potemplate) == 0:
@@ -872,20 +825,3 @@
             if self.translation_import_queue_entry.by_maintainer:
                 translation_message.was_obsolete_in_last_import = (
                     message.is_obsolete)
-
-    def finishImport(self):
-        """ Mark messages that were not imported. """
-        # Get relevant messages from DB.
-        unseen = self.pofile_in_db.getUnseenMessages()
-        for unseen_message in unseen:
-            (msgid, plural, context) = unseen_message
-            potmsgset = self.potemplate.getPOTMsgSetByMsgIDText(
-                msgid, plural_text=plural, context=context)
-            if potmsgset is not None:
-                previous_upstream_message = (
-                    potmsgset.getImportedTranslationMessage(
-                    self.potemplate, self.pofile.language))
-                if previous_upstream_message is not None:
-                    # The message was not imported this time, it
-                    # therefore looses its imported status.
-                    previous_upstream_message.is_current_upstream = False


Follow ups