← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~stub/launchpad/trivial into lp:launchpad

 

Stuart Bishop has proposed merging lp:~stub/launchpad/trivial into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #951401 in Launchpad itself: "parse-ppa-apache-logs failing (missing files)"
  https://bugs.launchpad.net/launchpad/+bug/951401

For more details, see:
https://code.launchpad.net/~stub/launchpad/trivial/+merge/96983

= Summary =

The PPA Apache Log Parser is spitting out spurious errors, per Bug #951401.

== Proposed fix ==

Instead of deferring checks to see if a file has been previously parsed or not, do the check on startup. This addresses the worst case where we go to check a file hours after the script has started, only to find that the file has been removed from disk. There is still a small race condition between expanding the glob and grabbing the first line of the log file, but I didn't worry about that. We might even want to know about this as it indicates odd scheduling.


== Pre-implementation notes ==

== Implementation details ==

== Tests ==

== Demo and Q/A ==


= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/services/apachelogparser/script.py
-- 
https://code.launchpad.net/~stub/launchpad/trivial/+merge/96983
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~stub/launchpad/trivial into lp:launchpad.
=== modified file 'lib/lp/services/apachelogparser/script.py'
--- lib/lp/services/apachelogparser/script.py	2012-01-01 02:58:52 +0000
+++ lib/lp/services/apachelogparser/script.py	2012-03-12 09:20:46 +0000
@@ -67,10 +67,18 @@
         raise NotImplementedError
 
     def main(self):
-        files_to_parse = get_files_to_parse(
-            glob.glob(os.path.join(self.root, self.log_file_glob)))
-
         self.setUpUtilities()
+
+        # Materialize the list of files to parse. It is better to do the
+        # checks now, rather than potentially hours later when the
+        # generator gets around to it, because there is a reasonable
+        # chance log rotation will have kicked in and removed our oldest
+        # files. Note that we still error if a file we want to parse
+        # disappears before we get around to parsing it, which is
+        # desirable behavior.
+        files_to_parse = list(get_files_to_parse(
+            glob.glob(os.path.join(self.root, self.log_file_glob))))
+
         country_set = getUtility(ICountrySet)
         parsed_lines = 0
         max_parsed_lines = getattr(