← Back to team overview

launchpad-reviewers team mailing list archive

lp:~michael.nelson/launchpad/982938-dont-load-all-tokens-for-all-ppas-into-memory into lp:launchpad

 

Michael Nelson has proposed merging lp:~michael.nelson/launchpad/982938-dont-load-all-tokens-for-all-ppas-into-memory into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #982938 in Launchpad itself: "generate-ppa-htaccess takes several minutes to run"
  https://bugs.launchpad.net/launchpad/+bug/982938

For more details, see:
https://code.launchpad.net/~michael.nelson/launchpad/982938-dont-load-all-tokens-for-all-ppas-into-memory/+merge/107980

Summary
=======
This branch aims to reduce the load/memory consumption on germanium while generating htpasswd files.

Proposed fix
============

Use a generator to return a storm result set of tokens for each PPA, rather than
loading all tokens for all PPAs into a dict.

It will result in a larger query count. jml is working on a script that will generate data to be able to verify whether this branch does indeed help.

Pre-implementation notes
========================
Referring to https://bugs.launchpad.net/launchpad/+bug/982938/comments/6
12:27 < noodles> jml: if you think the first of those points (in my comment) is very likely to be causing performance issues, I could start coding that straight away...
12:27 < noodles> (given the numbers achuni mentioned above, it would seem likely, but only the measurements will tell).
12:27 < jml> as you say, only measurements will tell :)
12:28 < jml> noodles: I reckon code it up straight away. :)

Implementation details
======================
Sheesh - this is the quick cover letter? :P

LOC Rationale
=============
Yay - 12 lines of credit.

Tests
=====
./bin/test -vvct TestPPAHtaccessTokenGeneration

Demo and Q/A
============
If this script runs on staging, we could QA by comparing output before and after running it.

Lint
====
Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/archivepublisher/scripts/generate_ppa_htaccess.py
-- 
https://code.launchpad.net/~michael.nelson/launchpad/982938-dont-load-all-tokens-for-all-ppas-into-memory/+merge/107980
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~michael.nelson/launchpad/982938-dont-load-all-tokens-for-all-ppas-into-memory into lp:launchpad.
=== modified file 'lib/lp/archivepublisher/scripts/generate_ppa_htaccess.py'
--- lib/lp/archivepublisher/scripts/generate_ppa_htaccess.py	2012-05-10 21:44:21 +0000
+++ lib/lp/archivepublisher/scripts/generate_ppa_htaccess.py	2012-05-30 14:23:18 +0000
@@ -249,21 +249,14 @@
         return new_ppa_tokens
 
     def _getValidTokensForPPAs(self, ppas):
-        """Returns a dict keyed by archive with all the tokens for each."""
-        affected_ppas_with_tokens = dict([
-            (ppa, []) for ppa in ppas])
-
+        """Yields (ppa, tokens_result_set) tuples for each PPA."""
         store = IStore(ArchiveAuthToken)
-        affected_ppa_tokens = store.find(
-            ArchiveAuthToken,
-            ArchiveAuthToken.date_deactivated == None,
-            ArchiveAuthToken.archive_id.is_in(
-                [ppa.id for ppa in ppas]))
-
-        for token in affected_ppa_tokens:
-            affected_ppas_with_tokens[token.archive].append(token)
-
-        return affected_ppas_with_tokens
+        for ppa in ppas:
+            ppa_tokens = store.find(
+                ArchiveAuthToken,
+                ArchiveAuthToken.date_deactivated == None,
+                ArchiveAuthToken.archive_id == ppa.id)
+            yield (ppa, ppa_tokens)
 
     def getNewPrivatePPAs(self):
         """Return the recently created private PPAs."""
@@ -296,12 +289,7 @@
 
         affected_ppas.update(self.getNewPrivatePPAs())
 
-        affected_ppas_with_tokens = {}
-        if affected_ppas:
-            affected_ppas_with_tokens = self._getValidTokensForPPAs(
-                affected_ppas)
-
-        for ppa, valid_tokens in affected_ppas_with_tokens.iteritems():
+        for ppa, valid_tokens in self._getValidTokensForPPAs(affected_ppas):
             # If this PPA is blacklisted, do not touch it's htaccess/pwd
             # files.
             blacklisted_ppa_names_for_owner = self.blacklist.get(


Follow ups