← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~danilo/launchpad/bug-534203 into lp:launchpad/db-devel

 

Данило Шеган has proposed merging lp:~danilo/launchpad/bug-534203 into lp:launchpad/db-devel.

Requested reviews:
  Stuart Bishop (stub)
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #534203 in Launchpad itself: "Timeouts on POFile:+filter (filter by person)"
  https://bugs.launchpad.net/launchpad/+bug/534203

For more details, see:
https://code.launchpad.net/~danilo/launchpad/bug-534203/+merge/65467

= Bug 534203 =

Doing a query for all translationmessages by a certain person on a particular pofile (template, language) at the moment makes use of tm__language__submitter__idx index on translationmessage table. However, for particularly active submitters who have done many translations across the board, this means that there will be plenty of rows matching a language and a submitter (not surprisingly, one submitter usually takes care only of one language), thus we'll be doing filtering on potmsgset of a large set of rows.

Simply dropping the index improves the performance hugely, since we then start using translationmessage__potmsgset__language__idx, thus narrowing down to a smaller set before doing the filtering on submitter. In practice, for contributors with a large number of submissions, this has brought down the query time roughly 10x in testing on staging (and even more when the caches are cold: from 30s to 1-2s; with warm caches, from ~300-700ms to ~35-65ms).

In theory, this might limit our best case performance for smaller contributors which have only worked on a single project, but considering the upper bound for that is the potemplate size (or 45k rows which we have in only one template, and other than perhaps 5 other templates, all the rest are sub-10k rows), our worst-case performance should be bound by a more reasonable time limit instead.

I've also refactored the POFile.getTranslationsFilteredBy to use the Storm syntax, and refactored the unit test to split it into several by the behaviour being tested.

= Tests =

bin/test -cvvt getTranslationsFilteredBy

= Demo & QA =

https://translations.(qa)staging.launchpad.net/ubuntu/hardy/+source/evolution-data-server/+pots/evolution-data-server-2.22/it/+filter?person=elle.uca

https://translations.(qa)staging.launchpad.net/ubuntu/hardy/+source/evolution-data-server/+pots/evolution-data-server-2.22/sr/+filter?person=danilo

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  database/schema/patch-2208-99-1.sql
  lib/lp/translations/model/pofile.py
  lib/lp/translations/tests/test_pofile.py

-- 
https://code.launchpad.net/~danilo/launchpad/bug-534203/+merge/65467
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~danilo/launchpad/bug-534203 into lp:launchpad/db-devel.
=== added file 'database/schema/patch-2208-99-1.sql'
--- database/schema/patch-2208-99-1.sql	1970-01-01 00:00:00 +0000
+++ database/schema/patch-2208-99-1.sql	2011-06-22 10:23:39 +0000
@@ -0,0 +1,13 @@
+-- Copyright 2011 Canonical Ltd.  This software is licensed under the
+-- GNU Affero General Public License version 3 (see the file LICENSE).
+
+SET client_min_messages=ERROR;
+
+-- Bug 534203: this index makes POFile:+filter query use it even
+-- when there's a better index on (potmsgset, language) to use.
+-- It doesn't make sense anywhere else, but if we do turn out to
+-- need it somewhere, we should include potmsgset in it as well.
+-- (And perhaps drop translationmessage__potmsgset__language__idx).
+DROP INDEX tm__language__submitter__idx;
+
+INSERT INTO LaunchpadDatabaseRevision VALUES (2208, 99, 1);

=== modified file 'lib/lp/translations/model/pofile.py'
--- lib/lp/translations/model/pofile.py	2011-05-27 21:12:25 +0000
+++ lib/lp/translations/model/pofile.py	2011-06-22 10:23:39 +0000
@@ -26,6 +26,7 @@
 from storm.expr import (
     And,
     Coalesce,
+    Desc,
     Exists,
     Join,
     LeftJoin,
@@ -548,6 +549,9 @@
 
         Call-site will have to have appropriate clauseTables.
         """
+        # When all the code that uses this method is moved to Storm,
+        # we can replace it with _getStormClausesForPOFileMessages
+        # and then remove it.
         clauses = [
             'TranslationTemplateItem.potemplate = %s' % sqlvalues(
                 self.potemplate),
@@ -559,17 +563,33 @@
 
         return clauses
 
+    def _getStormClausesForPOFileMessages(self, current=True):
+        """Get TranslationMessages for the POFile via TranslationTemplateItem.
+        """
+        clauses = [
+            TranslationTemplateItem.potemplate == self.potemplate,
+            (TranslationTemplateItem.potmsgsetID ==
+             TranslationMessage.potmsgsetID),
+            TranslationMessage.language == self.language,
+            ]
+        if current:
+            clauses.append(TranslationTemplateItem.sequence > 0)
+
+        return clauses
+
     def getTranslationsFilteredBy(self, person):
         """See `IPOFile`."""
         assert person is not None, "You must provide a person to filter by."
-        clauses = self._getClausesForPOFileMessages(current=False)
+        clauses = self._getStormClausesForPOFileMessages(current=False)
         clauses.append(
-            'TranslationMessage.submitter = %s' % sqlvalues(person))
+            TranslationMessage.submitter == person)
 
-        return TranslationMessage.select(
-            " AND ".join(clauses),
-            clauseTables=['TranslationTemplateItem'],
-            orderBy=['sequence', '-date_created'])
+        results = Store.of(self).find(
+            TranslationMessage,
+            *clauses)
+        return results.order_by(
+            TranslationTemplateItem.sequence,
+            Desc(TranslationMessage.date_created))
 
     def _getTranslatedMessagesQuery(self):
         """Get query data for fetching all POTMsgSets with translations.

=== modified file 'lib/lp/translations/tests/test_pofile.py'
--- lib/lp/translations/tests/test_pofile.py	2011-05-27 21:12:25 +0000
+++ lib/lp/translations/tests/test_pofile.py	2011-06-22 10:23:39 +0000
@@ -331,10 +331,9 @@
             self.devel_pofile.findPOTMsgSetsContaining(u"THIRD"))
         self.assertEquals(found_potmsgsets, [plural_potmsgset])
 
-    def test_getTranslationsFilteredBy(self):
-        # Test that filtering by submitters works.
-
-        potmsgset = self.potmsgset
+    def test_getTranslationsFilteredBy_none(self):
+        # When a person has submitted no translations, empty result set
+        # is returned.
 
         # A person to be submitting all translations.
         submitter = self.factory.makePerson()
@@ -344,7 +343,13 @@
             self.devel_pofile.getTranslationsFilteredBy(submitter))
         self.assertEquals(found_translations, [])
 
-        # If 'submitter' provides a translation, it's returned in a list.
+    def test_getTranslationsFilteredBy(self):
+        # If 'submitter' provides a translation for a pofile,
+        # it's returned in a list.
+
+        potmsgset = self.potmsgset
+        submitter = self.factory.makePerson()
+
         translation = self.factory.makeCurrentTranslationMessage(
             pofile=self.devel_pofile, potmsgset=potmsgset,
             translations=[u"Translation message"],
@@ -353,8 +358,13 @@
             self.devel_pofile.getTranslationsFilteredBy(submitter))
         self.assertEquals(found_translations, [translation])
 
+    def test_getTranslationsFilteredBy_someone_else(self):
         # If somebody else provides a translation, it's not added to the
         # list of submitter's translations.
+
+        potmsgset = self.potmsgset
+        submitter = self.factory.makePerson()
+
         someone_else = self.factory.makePerson()
         self.factory.makeCurrentTranslationMessage(
             pofile=self.devel_pofile, potmsgset=potmsgset,
@@ -362,33 +372,46 @@
             translator=someone_else)
         found_translations = list(
             self.devel_pofile.getTranslationsFilteredBy(submitter))
-        self.assertEquals(found_translations, [translation])
+        self.assertEquals(found_translations, [])
 
-        # Adding a translation for same POTMsgSet, but to a different
+    def test_getTranslationsFilteredBy_other_pofile(self):
+        # Adding a translation for the same POTMsgSet, but to a different
         # POFile (different language) will not add the translation
-        # to the list of submitter's translations for *former* POFile.
-        serbian_latin = self.factory.makeLanguage(
-            'sr@latin', 'Serbian Latin')
-
-        self.devel_sr_latin_pofile = self.factory.makePOFile(
-            'sr@latin', potemplate=self.devel_potemplate)
+        # to the list of submitter's translations for *original* POFile.
+
+        potmsgset = self.potmsgset
+        submitter = self.factory.makePerson()
+
+        self.factory.makeLanguage('sr@test', 'Serbian Test')
+
+        self.devel_sr_test_pofile = self.factory.makePOFile(
+            'sr@test', potemplate=self.devel_potemplate)
         self.factory.makeCurrentTranslationMessage(
-            pofile=self.devel_sr_latin_pofile, potmsgset=potmsgset,
+            pofile=self.devel_sr_test_pofile, potmsgset=potmsgset,
             translations=[u"Yet another translation"],
             translator=submitter)
         found_translations = list(
             self.devel_pofile.getTranslationsFilteredBy(submitter))
-        self.assertEquals(found_translations, [translation])
+        self.assertEquals(found_translations, [])
 
+    def test_getTranslationsFilteredBy_shared(self):
         # If a POTMsgSet is shared between two templates, a
-        # translation is listed on both.
+        # translation on one is listed on both.
+
+        potmsgset = self.potmsgset
+        submitter = self.factory.makePerson()
+        translation = self.factory.makeCurrentTranslationMessage(
+            pofile=self.devel_pofile, potmsgset=potmsgset,
+            translations=[u"Translation message"],
+            translator=submitter)
+
         potmsgset.setSequence(self.stable_potemplate, 1)
-        found_translations = list(
+        stable_translations = list(
             self.stable_pofile.getTranslationsFilteredBy(submitter))
-        self.assertEquals(found_translations, [translation])
-        found_translations = list(
+        self.assertEquals(stable_translations, [translation])
+        devel_translations = list(
             self.devel_pofile.getTranslationsFilteredBy(submitter))
-        self.assertEquals(found_translations, [translation])
+        self.assertEquals(devel_translations, [translation])
 
     def test_getPOTMsgSetTranslated_NoShared(self):
         # Test listing of translated POTMsgSets when there is no shared
@@ -817,7 +840,7 @@
         self.factory.makeCurrentTranslationMessage(
             pofile=self.devel_pofile, potmsgset=potmsgset,
             translations=[u"Imported translation"], current_other=True)
-        translation = self.factory.makeCurrentTranslationMessage(
+        self.factory.makeCurrentTranslationMessage(
             pofile=self.devel_pofile, potmsgset=potmsgset,
             translations=[u"LP translation"], current_other=False)
 
@@ -935,7 +958,8 @@
         self.assertEqual(None, devel_potemplate.getPOFileByLang('eo'))
         self.assertEqual(None, stable_potemplate.getPOFileByLang('eo'))
 
-        package_pofile = package_potemplate.newPOFile('eo')
+        # Package PO file.
+        package_potemplate.newPOFile('eo')
 
         devel_pofile = devel_potemplate.getPOFileByLang('eo')
         self.assertNotEqual(None, devel_pofile)
@@ -972,7 +996,8 @@
         self.assertEqual(None, distroseries1_potemplate.getPOFileByLang('eo'))
         self.assertEqual(None, distroseries2_potemplate.getPOFileByLang('eo'))
 
-        stable_pofile = stable_potemplate.newPOFile('eo')
+        # Stable PO file.
+        stable_potemplate.newPOFile('eo')
 
         distroseries1_pofile = distroseries1_potemplate.getPOFileByLang('eo')
         self.assertNotEqual(None, distroseries1_pofile)
@@ -1051,7 +1076,7 @@
         translator = self.factory.makePerson(
             name=u'the-translator',
             displayname=u'Launchpad Translator')
-        upstream_credits = self.credits_potmsgset.setCurrentTranslation(
+        self.credits_potmsgset.setCurrentTranslation(
             self.pofile, translator, {0: 'upstream credits'},
             RosettaTranslationOrigin.SCM, share_with_other_side=True)
         self.assertEquals(
@@ -1109,7 +1134,7 @@
         kde_names_potmsgset = self.factory.makePOTMsgSet(
             potemplate=self.potemplate,
             singular=u'_: NAME OF TRANSLATORS\nYour names')
-        upstream_credits = kde_names_potmsgset.setCurrentTranslation(
+        kde_names_potmsgset.setCurrentTranslation(
             self.pofile, translator,
             {0: 'Upstream credits'},
             RosettaTranslationOrigin.SCM, share_with_other_side=True)
@@ -1126,7 +1151,7 @@
         kde_emails_potmsgset = self.factory.makePOTMsgSet(
             potemplate=self.potemplate,
             singular=u'_: EMAIL OF TRANSLATORS\nYour emails')
-        upstream_credits = kde_emails_potmsgset.setCurrentTranslation(
+        kde_emails_potmsgset.setCurrentTranslation(
             self.pofile, translator,
             {0: 'translator@upstream'},
             RosettaTranslationOrigin.SCM, share_with_other_side=True)
@@ -1143,7 +1168,7 @@
             potemplate=self.potemplate,
             context=u'NAME OF TRANSLATORS',
             singular=u'Your names')
-        upstream_credits = kde_names_potmsgset.setCurrentTranslation(
+        kde_names_potmsgset.setCurrentTranslation(
             self.pofile, translator,
             {0: 'Upstream credits'},
             RosettaTranslationOrigin.SCM, share_with_other_side=True)
@@ -1161,7 +1186,7 @@
             potemplate=self.potemplate,
             context=u'EMAIL OF TRANSLATORS',
             singular=u'Your emails')
-        upstream_credits = kde_emails_potmsgset.setCurrentTranslation(
+        kde_emails_potmsgset.setCurrentTranslation(
             self.pofile, translator,
             {0: 'translator@upstream'},
             RosettaTranslationOrigin.SCM, share_with_other_side=True)
@@ -2050,7 +2075,7 @@
         # Create a message set from the test data.
         potmsgset = self.factory.makePOTMsgSet(
             self.potemplate, testmsg['msgid'], sequence=testmsg['sequence'])
-        translation = self.factory.makeCurrentTranslationMessage(
+        self.factory.makeCurrentTranslationMessage(
             self.pofile, potmsgset=potmsgset, translator=self.pofile.owner,
             translations={0: testmsg['string'], }, current_other=True)
 
@@ -2069,7 +2094,7 @@
         # that are for obsolete messages.
         potmsgset = self.factory.makePOTMsgSet(self.potemplate, sequence=0)
         text = self.factory.getUniqueString()
-        translation = self.factory.makeCurrentTranslationMessage(
+        self.factory.makeCurrentTranslationMessage(
             pofile=self.pofile, potmsgset=potmsgset, translations=[text])
 
         rows = list(self.pofile.getTranslationRows())
@@ -2092,7 +2117,7 @@
         self.pofile = self.factory.makePOFile(potemplate=self.potemplate)
         potmsgset = self.factory.makePOTMsgSet(self.potemplate, sequence=0)
         text = self.factory.getUniqueString()
-        translation = self.factory.makeCurrentTranslationMessage(
+        self.factory.makeCurrentTranslationMessage(
             pofile=self.pofile, potmsgset=potmsgset, translations=[text])
 
         rows = list(self.pofile.getTranslationRows())
@@ -2755,7 +2780,7 @@
         # A current translation message counts towards the POFile's
         # translatedCount.
         pofile = self.makePOFile()
-        suggestion = self.factory.makeCurrentTranslationMessage(pofile=pofile)
+        self.factory.makeCurrentTranslationMessage(pofile=pofile)
         self.exerciseFunction(pofile)
         self.assertEqual(1, self.getTranslatedCount(pofile))
 
@@ -2779,7 +2804,7 @@
     def test_translatedCount_diverged(self):
         # Diverged translations are also counted.
         pofile = self.makePOFile()
-        diverged = self.factory.makeDivergedTranslationMessage(pofile=pofile)
+        self.factory.makeDivergedTranslationMessage(pofile=pofile)
         self.exerciseFunction(pofile)
         self.assertEqual(1, self.getTranslatedCount(pofile))
 
@@ -2817,7 +2842,7 @@
         # Translations of obsolete POTMsgSets do not count as
         # untranslated.
         pofile = self.makePOFile()
-        potmsgset = self.factory.makePOTMsgSet(pofile.potemplate, sequence=0)
+        self.factory.makePOTMsgSet(pofile.potemplate, sequence=0)
         self.exerciseFunction(pofile)
         self.assertEqual(0, self.getUntranslatedCount(pofile))
 
@@ -2832,7 +2857,7 @@
     def test_untranslatedCount_diverged(self):
         # Diverged translations are also counted.
         pofile = self.makePOFile()
-        diverged = self.factory.makeDivergedTranslationMessage(pofile=pofile)
+        self.factory.makeDivergedTranslationMessage(pofile=pofile)
         self.exerciseFunction(pofile)
         self.assertEqual(0, self.getUntranslatedCount(pofile))
 
@@ -2891,7 +2916,7 @@
         pofile = self.makePOFile()
         potmsgset = self.factory.makePOTMsgSet(pofile.potemplate)
         self._makeOtherSideTranslation(pofile, potmsgset=potmsgset)
-        this_translation = self.factory.makeCurrentTranslationMessage(
+        self.factory.makeCurrentTranslationMessage(
             pofile=pofile, potmsgset=potmsgset)
         self.exerciseFunction(pofile)
         self.assertEqual(0, self.getCurrentCount(pofile))
@@ -3061,9 +3086,9 @@
         # newer than the review date on the current translation.
         pofile = self.makePOFile()
         potmsgset = self.factory.makePOTMsgSet(pofile.potemplate)
-        suggestion = self.factory.makeSuggestion(
+        self.factory.makeSuggestion(
             pofile=pofile, potmsgset=potmsgset)
-        translation = self.factory.makeCurrentTranslationMessage(
+        self.factory.makeCurrentTranslationMessage(
             pofile=pofile, potmsgset=potmsgset)
         self.exerciseFunction(pofile)
         self.assertEqual(0, self.getUnreviewedCount(pofile))
@@ -3076,7 +3101,7 @@
         translation = self.factory.makeCurrentTranslationMessage(
             pofile=pofile, potmsgset=potmsgset)
         translation.date_reviewed -= timedelta(1)
-        suggestion = self.factory.makeSuggestion(
+        self.factory.makeSuggestion(
             pofile=pofile, potmsgset=potmsgset)
         self.exerciseFunction(pofile)
         self.assertEqual(1, self.getUnreviewedCount(pofile))
@@ -3090,8 +3115,7 @@
         this_translation = self.factory.makeCurrentTranslationMessage(
             pofile=pofile, potmsgset=potmsgset)
         this_translation.date_reviewed -= timedelta(1)
-        other_translation = self._makeOtherSideTranslation(
-            pofile, potmsgset=potmsgset)
+        self._makeOtherSideTranslation(pofile, potmsgset=potmsgset)
         self.exerciseFunction(pofile)
         self.assertEqual(1, self.getUnreviewedCount(pofile))
 


Follow ups