launchpad-reviewers team mailing list archive
  
  - 
     launchpad-reviewers team launchpad-reviewers team
- 
    Mailing list archive
  
- 
    Message #02326
  
 [Merge] lp:~jtv/launchpad/bug-702228 into	lp:launchpad
  
Jeroen T. Vermeulen has proposed merging lp:~jtv/launchpad/bug-702228 into lp:launchpad.
Requested reviews:
  Launchpad code reviewers (launchpad-reviewers): code
Related bugs:
  #702228 rosetta export queue appears stuck, and not progressing
  https://bugs.launchpad.net/bugs/702228
For more details, see:
https://code.launchpad.net/~jtv/launchpad/bug-702228/+merge/46165
= Bug 702228 =
A particularly difficult translations export request was breaking the export queue.  It took over 3 hours to run, at which point our watchdog script that looks for long transactions on the master database killed its master store connection.
This branch addresses that by avoiding use of the master database, so that the connection will not start a transaction until the very end of a request (where a master connection is really needed).  An occasional long transaction on the slave is not such a big problem, and certainly won't take any write locks.  So the watchdog will leave the slave connection in peace.
There is a nesting of database policies: the SlaveDatabasePolicy makes the slave store the default, so that e.g. some use of cursor() that we still have will use the slave.  This is merely a little help in reducing accidental use of the master store.  But I also bracketed the time-critical part in a SlaveOnlyDatabasePolicy.  This outright disallows use of the master store.
It's not quite obvious, but the method that fetches the next request off the queue accesses the master store.  It doesn't make any changes there, but it does verify that the oldest fully-replicated request it gets off the slave store has not already been deleted from the master store.  That situation could otherwise occur if the script deleted the previous request from the master store, then again tried to fetch the oldest request off the slave before the deletion had replicated.
I tested thusly:
{{{
./bin/test -vvc lp.translations -t export
}}}
To Q/A, first ensure that regular exports still work (both in Ubuntu and upstream; include XPI) by running the script manually, and seeing that it still produces valid tarballs and individual files.  On staging that requires manually clearing the backlog first.
Watch pg_stat_activity on the master database during a large export.  The export script's master backend process (as logged by the script) should be "<IDLE>" until the end, not "<IDLE> in transaction."
No lint,
Jeroen
-- 
https://code.launchpad.net/~jtv/launchpad/bug-702228/+merge/46165
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jtv/launchpad/bug-702228 into lp:launchpad.
=== modified file 'cronscripts/rosetta-export-queue.py'
--- cronscripts/rosetta-export-queue.py	2010-04-27 19:48:39 +0000
+++ cronscripts/rosetta-export-queue.py	2011-01-13 18:25:08 +0000
@@ -8,17 +8,20 @@
 import _pythonpath
 
 from canonical.database.sqlbase import ISOLATION_LEVEL_READ_COMMITTED
+from canonical.launchpad.webapp.dbpolicy import SlaveDatabasePolicy
 from lp.translations.scripts.po_export_queue import process_queue
 from lp.services.scripts.base import LaunchpadCronScript
 
 
 class RosettaExportQueue(LaunchpadCronScript):
+    """Translation exports."""
+
     def main(self):
         self.txn.set_isolation_level(ISOLATION_LEVEL_READ_COMMITTED)
-        process_queue(self.txn, self.logger)
+        with SlaveDatabasePolicy():
+            process_queue(self.txn, self.logger)
 
 
 if __name__ == '__main__':
     script = RosettaExportQueue('rosetta-export-queue', dbuser='poexport')
     script.lock_and_run()
-
=== modified file 'lib/lp/translations/doc/poexport-queue.txt'
--- lib/lp/translations/doc/poexport-queue.txt	2010-12-02 16:13:51 +0000
+++ lib/lp/translations/doc/poexport-queue.txt	2011-01-13 18:25:08 +0000
@@ -42,7 +42,7 @@
     ...     result.addFailure()
 
 In this case, there is an error, so there shouldn't be a URL to download
-anything, if we set it, the system will fail:
+anything.  If we set it, the system will fail:
 
     >>> result.url = 'http://someplace.com/somefile.tar.gz'
 
=== modified file 'lib/lp/translations/scripts/po_export_queue.py'
--- lib/lp/translations/scripts/po_export_queue.py	2010-08-20 20:31:18 +0000
+++ lib/lp/translations/scripts/po_export_queue.py	2011-01-13 18:25:08 +0000
@@ -23,6 +23,7 @@
 from canonical.launchpad.interfaces.librarian import ILibraryFileAliasSet
 from canonical.launchpad.mail import simple_sendmail
 from canonical.launchpad.webapp import canonical_url
+from canonical.launchpad.webapp.dbpolicy import SlaveOnlyDatabasePolicy
 from lp.registry.interfaces.productseries import IProductSeries
 from lp.registry.interfaces.sourcepackage import ISourcePackage
 from lp.translations.interfaces.poexportrequest import IPOExportRequestSet
@@ -54,6 +55,7 @@
         self.url = None
         self.failure = None
         self.logger = logger
+        self.exported_file = None
 
         self.requested_exports = list(requested_exports)
         export_requested_at = self._getExportRequestOrigin()
@@ -250,6 +252,46 @@
             'request_url': self.request_url,
             }
 
+    def setExportFile(self, exported_file):
+        """Attach an exported file to the result, for upload to the Librarian.
+
+        After this is set, `upload` will perform the actual upload.  The two
+        actions are separated so as to isolate write access to the database.
+
+        :param exported_file: An `IExportedTranslationFile` containing the
+            exported data.
+        """
+        self.exported_file = exported_file
+
+    def upload(self, logger=None):
+        """Upload exported file as set with `setExportFile` to the Librarian.
+
+        If no file has been set, do nothing.
+        """
+        if self.exported_file is None:
+            # There's nothing to upload.
+            return
+
+        if self.exported_file.path is None:
+            # The exported path is unknown, use translation domain as its
+            # filename.
+            assert self.exported_file.file_extension, (
+                'File extension must have a value!.')
+            path = 'launchpad-export.%s' % self.exported_file.file_extension
+        else:
+            # Convert the path to a single file name so it's noted in
+            # librarian.
+            path = self.exported_file.path.replace(os.sep, '_')
+
+        alias_set = getUtility(ILibraryFileAliasSet)
+        alias = alias_set.create(
+            name=path, size=self.exported_file.size, file=self.exported_file,
+            contentType=self.exported_file.content_type)
+
+        self.url = alias.http_url
+        if logger is not None:
+            logger.info("Stored file at %s" % self.url)
+
     def notify(self):
         """Send a notification email to the given person about the export.
 
@@ -331,6 +373,9 @@
     multiple files) and information about files that we failed to export (if
     any).
     """
+    # Keep as much work off the master store as possible, so we avoid
+    # opening unnecessary transactions there.  It could be a while
+    # before we get to the commit.
     translation_exporter = getUtility(ITranslationExporter)
     requested_objects = list(objects)
 
@@ -352,28 +397,9 @@
         # error, we add the entry to the list of errors.
         result.addFailure()
     else:
-        if exported_file.path is None:
-            # The exported path is unknown, use translation domain as its
-            # filename.
-            assert exported_file.file_extension, (
-                'File extension must have a value!.')
-            exported_path = 'launchpad-export.%s' % (
-                exported_file.file_extension)
-        else:
-            # Convert the path to a single file name so it's noted in
-            # librarian.
-            exported_path = exported_file.path.replace(os.sep, '_')
-
-        alias_set = getUtility(ILibraryFileAliasSet)
-        alias = alias_set.create(
-            name=exported_path,
-            size=exported_file.size,
-            file=exported_file,
-            contentType=exported_file.content_type)
-        result.url = alias.http_url
-        logger.info("Stored file at %s" % result.url)
-
-    result.notify()
+        result.setExportFile(exported_file)
+
+    return result
 
 
 def process_queue(transaction_manager, logger):
@@ -386,8 +412,19 @@
 
     request = request_set.getRequest()
     while request != no_request:
-        person, objects, format, request_ids = request
-        process_request(person, objects, format, logger)
+
+        # This can take a long time.  Make sure we don't open any
+        # transactions on the master store before we really need to.
+        transaction_manager.commit()
+        with SlaveOnlyDatabasePolicy():
+            person, objects, format, request_ids = request
+            result = process_request(person, objects, format, logger)
+
+        # Almost done.  Now we can go back to using the master database
+        # where needed.
+        result.upload(logger=logger)
+        result.notify()
+
         request_set.removeRequest(request_ids)
         transaction_manager.commit()
 
=== added file 'lib/lp/translations/tests/test_exportresult.py'
--- lib/lp/translations/tests/test_exportresult.py	1970-01-01 00:00:00 +0000
+++ lib/lp/translations/tests/test_exportresult.py	2011-01-13 18:25:08 +0000
@@ -0,0 +1,67 @@
+# Copyright 2010 Canonical Ltd.  This software is licensed under the
+# GNU Affero General Public License version 3 (see the file LICENSE).
+
+"""Tests for `ExportResult`."""
+
+__metaclass__ = type
+
+import hashlib
+import os.path
+from StringIO import StringIO
+
+from canonical.librarian.testing.fake import FakeLibrarian
+from canonical.testing.layers import ZopelessDatabaseLayer
+from lp.services.log.logger import DevNullLogger
+from lp.testing import TestCaseWithFactory
+from lp.translations.scripts.po_export_queue import ExportResult
+
+
+class FakeExportedTranslationFile:
+    """Fake `IExportedTranslationFile` for testing."""
+
+    def __init__(self, path, content, content_type):
+        self.path = path
+        base, ext = os.path.splitext(path)
+        self.file_extension = ext
+        self.size = len(content)
+        self.content = content
+        self.file = StringIO(content)
+        self.content_type = content_type
+
+    def read(self, *args, **kwargs):
+        return self.file.read(*args, **kwargs)
+
+
+class TestExportResult(TestCaseWithFactory):
+
+    layer = ZopelessDatabaseLayer
+
+    def makeExportResult(self):
+        request = [self.factory.makePOFile()]
+        requester = self.factory.makePerson()
+        logger = DevNullLogger()
+        return ExportResult(requester, request, logger)
+
+    def makeExportedTranslationFile(self):
+        filename = self.factory.getUniqueString()
+        content = self.factory.getUniqueString()
+        mime_type = "text/plain"
+        return FakeExportedTranslationFile(filename, content, mime_type)
+
+    def test_upload_exported_file(self):
+        librarian = self.useFixture(FakeLibrarian())
+        export = self.makeExportedTranslationFile()
+        export_result = self.makeExportResult()
+        export_result.setExportFile(export)
+        export_result.upload()
+
+        self.assertNotEqual(None, export_result.url)
+        sha1 = hashlib.sha1(export.content).hexdigest()
+        self.assertEqual(sha1, librarian.aliases.values()[0].content.sha1)
+        alias = librarian.findBySHA1(sha1)
+        self.assertEqual(export.path, alias.filename)
+
+    def test_upload_without_exported_file_does_nothing(self):
+        export_result = self.makeExportResult()
+        export_result.upload()
+        self.assertIs(None, export_result.url)