launchpad-reviewers team mailing list archive
-
launchpad-reviewers team
-
Mailing list archive
-
Message #14113
[Merge] lp:~sinzui/launchpad/tranlationmessage-missing-potmsgset into lp:launchpad
Curtis Hovey has proposed merging lp:~sinzui/launchpad/tranlationmessage-missing-potmsgset into lp:launchpad.
Commit message:
Avoid putting UnusedPOTMsgSetPruner in a cage match against other processes.
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
Related bugs:
Bug #824229 in Launchpad itself: "garbo-daily UnusedPOTMsgSetPruner failing with violates foreign key constraint"
https://bugs.launchpad.net/launchpad/+bug/824229
For more details, see:
https://code.launchpad.net/~sinzui/launchpad/tranlationmessage-missing-potmsgset/+merge/133532
garbo-daily UnusedPOTMsgSetPruner contends with other process that
create or update POTMsgSets and TranslationMessages, such as the
rosetta-poimport script and user activity. When UnusedPOTMsgSetPruner
starts, it builds a cache of Ids of empty POTMsgSets. It then iterates
over them in batches to get the TranslationMessages id, removes them,
then removes the POTMsgSets. Each batch is one transaction. Any process
that adds or modifies a POTMsgSets or TranslationMessages between loops
can contradict the list of unused POTMsgSets.
--------------------------------------------------------------------
RULES
Pre-implementation: jtv, stevenk
* UnusedPOTMsgSetPruner needs additional criteria to be certain that
the POTMsgSet is not used, *and* will not be used.
* Each batch needs to recheck the list of ids to filter of those
that became "used" during the loop.
* UnusedPOTMsgSetPruner's list of POTMsgSets ids are not distinct
because both queries and their union can return 50x the number of
unique ids.
* This is mostly lots of extra db work.
QA
* Ask a webops to run ./cronscripts/garbo-daily.py for qastaging.
* Verify the log has no errors.
LINT
lib/lp/scripts/garbo.py
lib/lp/scripts/tests/test_garbo.py
LoC
I have a 4000 line credit this week.
TEST
./bin/test -vv -t UnusedPOTMsgSetPruner lp.scripts.tests.test_garbo
IMPLEMENTATION
I first changed .msgset_ids_to_remove to call a private method that
does the same query. I then changed the ._get_msgset_ids_to_remove() to
accept a list of ids to ensure newly used POTMsgSets were filtered out.
Then I discovered errors in the number of loops...the list of ids was
never unique and my change extended the number of loops. I added a set()
to dedupe the ids -- the existing tests got faster since they thought
they were looping once.
lib/lp/scripts/garbo.py
lib/lp/scripts/tests/test_garbo.py
--
https://code.launchpad.net/~sinzui/launchpad/tranlationmessage-missing-potmsgset/+merge/133532
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~sinzui/launchpad/tranlationmessage-missing-potmsgset into lp:launchpad.
=== modified file 'lib/lp/scripts/garbo.py'
--- lib/lp/scripts/garbo.py 2012-11-05 11:32:27 +0000
+++ lib/lp/scripts/garbo.py 2012-11-08 17:51:26 +0000
@@ -1224,14 +1224,32 @@
@cachedproperty
def msgset_ids_to_remove(self):
- """Return the IDs of the POTMsgSets to remove."""
+ """The IDs of the POTMsgSets to remove."""
+ return self._get_msgset_ids_to_remove()
+
+ def _get_msgset_ids_to_remove(self, ids=None):
+ """Return a distrinct list IDs of the POTMsgSets to remove.
+
+ :param ids: a list of POTMsgSet ids to filter. If ids is None,
+ all unused POTMsgSet in the database are returned.
+ """
+ if ids is None:
+ constraints = dict(
+ tti_constraint="AND TRUE",
+ potmsgset_constraint="AND TRUE")
+ else:
+ ids_in = ', '.join([str(id) for id in ids])
+ constraints = dict(
+ tti_constraint="AND tti.potmsgset IN (%s)" % ids_in,
+ potmsgset_constraint="AND POTMsgSet.id IN (%s)" % ids_in)
query = """
-- Get all POTMsgSet IDs which are obsolete (sequence == 0)
-- and are not used (sequence != 0) in any other template.
SELECT POTMsgSet
FROM TranslationTemplateItem tti
- WHERE sequence=0 AND
- NOT EXISTS(
+ WHERE sequence=0
+ %(tti_constraint)s
+ AND NOT EXISTS(
SELECT id
FROM TranslationTemplateItem
WHERE potmsgset = tti.potmsgset AND sequence != 0)
@@ -1243,20 +1261,22 @@
LEFT OUTER JOIN TranslationTemplateItem
ON TranslationTemplateItem.potmsgset = POTMsgSet.id
WHERE
- TranslationTemplateItem.potmsgset IS NULL);
- """
+ TranslationTemplateItem.potmsgset IS NULL
+ %(potmsgset_constraint)s);
+ """ % constraints
store = IMasterStore(POTMsgSet)
results = store.execute(query)
- ids_to_remove = [id for (id,) in results.get_all()]
- return ids_to_remove
+ ids_to_remove = set([id for (id,) in results.get_all()])
+ return list(ids_to_remove)
def __call__(self, chunk_size):
"""See `TunableLoop`."""
# We cast chunk_size to an int to avoid issues with slicing
# (DBLoopTuner passes in a float).
chunk_size = int(chunk_size)
- msgset_ids_to_remove = (
+ msgset_ids = (
self.msgset_ids_to_remove[self.offset:][:chunk_size])
+ msgset_ids_to_remove = self._get_msgset_ids_to_remove(msgset_ids)
# Remove related TranslationTemplateItems.
store = IMasterStore(POTMsgSet)
related_ttis = store.find(
=== modified file 'lib/lp/scripts/tests/test_garbo.py'
--- lib/lp/scripts/tests/test_garbo.py 2012-11-07 12:38:19 +0000
+++ lib/lp/scripts/tests/test_garbo.py 2012-11-08 17:51:26 +0000
@@ -73,7 +73,11 @@
load_garbo_job_state,
LoginTokenPruner,
OpenIDConsumerAssociationPruner,
+<<<<<<< TREE
save_garbo_job_state,
+=======
+ UnusedPOTMsgSetPruner,
+>>>>>>> MERGE-SOURCE
UnusedSessionPruner,
)
from lp.services.config import config
@@ -127,6 +131,7 @@
LaunchpadZopelessLayer,
ZopelessDatabaseLayer,
)
+from lp.translations.model.pofile import POFile
from lp.translations.model.potmsgset import POTMsgSet
from lp.translations.model.translationtemplateitem import (
TranslationTemplateItem,
@@ -1034,6 +1039,44 @@
self.runDaily()
self.assertEqual(0, obsolete_msgsets.count())
+ def test_UnusedPOTMsgSetPruner_preserves_used_potmsgsets(self):
+ # UnusedPOTMsgSetPruner will not remove a potmsgset if it changes
+ # between calls.
+ switch_dbuser('testadmin')
+ potmsgset_pofile = {}
+ for n in xrange(4):
+ pofile = self.factory.makePOFile()
+ translation_message = self.factory.makeCurrentTranslationMessage(
+ pofile=pofile)
+ translation_message.potmsgset.setSequence(
+ pofile.potemplate, 0)
+ potmsgset_pofile[translation_message.potmsgset.id] = pofile.id
+ transaction.commit()
+ store = IMasterStore(POTMsgSet)
+ test_ids = potmsgset_pofile.keys()
+ obsolete_msgsets = store.find(
+ POTMsgSet,
+ In(TranslationTemplateItem.potmsgsetID, test_ids),
+ TranslationTemplateItem.sequence == 0)
+ self.assertEqual(4, obsolete_msgsets.count())
+ pruner = UnusedPOTMsgSetPruner(self.log)
+ pruner(2)
+ # A potmsgeset is set to a sequence > 0 between batches/commits.
+ last_id = pruner.msgset_ids_to_remove[-1]
+ used_potmsgset = store.find(POTMsgSet, POTMsgSet.id == last_id).one()
+ used_pofile = store.find(
+ POFile, POFile.id == potmsgset_pofile[last_id]).one()
+ translation_message = self.factory.makeCurrentTranslationMessage(
+ pofile=used_pofile, potmsgset=used_potmsgset)
+ used_potmsgset.setSequence(used_pofile.potemplate, 1)
+ transaction.commit()
+ # Next batch.
+ pruner(2)
+ self.assertEqual(0, obsolete_msgsets.count())
+ preserved_msgsets = store.find(
+ POTMsgSet, In(TranslationTemplateItem.potmsgsetID, test_ids))
+ self.assertEqual(1, preserved_msgsets.count())
+
def test_UnusedPOTMsgSetPruner_removes_unreferenced_message_sets(self):
# If a POTMsgSet is not referenced by any templates the
# UnusedPOTMsgSetPruner will remove it.
Follow ups