
apport-hackers team mailing list archive

[Merge] lp:~brian-murray/apport/zgrep-fallback into lp:apport

 

Brian Murray has proposed merging lp:~brian-murray/apport/zgrep-fallback into lp:apport.

Requested reviews:
  Apport upstream developers (apport-hackers)

For more details, see:
https://code.launchpad.net/~brian-murray/apport/zgrep-fallback/+merge/310218

The production retracers for the Error Tracker were regularly running out of memory (OOM) when using zgrep to search Contents.gz for files found in the crash report. Although zgrep is faster than opening the file with gzip and reading it line by line, the latter still seems like a good fallback option and is better than having the retrace process crash. I've implemented the proposed change in the production version of the Error Tracker and have encountered no issues with it.

As noted in the code comment, this part could likely be better:

+                        try:
+                            line = line.decode('UTF-8').rstrip('\n')
+                        # 2016-11-01 this should be better
+                        except UnicodeDecodeError:
+                            continue

I added this because of lines like the following in Contents.gz for yakkety, which fail to decode as UTF-8:

 $ zgrep -a "lenska.alias" /mnt/storage/archive-mirror/dists/yakkety/Contents-amd64.gz
usr/lib/aspell/�slenska.alias                               universe/text/aspell-is
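One possible way to improve on skipping such lines is to decode leniently instead of strictly. The snippet below is only a sketch: the sample bytes are constructed for illustration and assume the path was written in Latin-1, so that the first character of the file name is the single byte 0xED. The helper name `decode_contents_line` is hypothetical and not part of apport.

```python
# Hypothetical sample line: assumes a Latin-1 encoded file name, so the
# byte 0xED appears where valid UTF-8 would need a multi-byte sequence.
raw = b'usr/lib/aspell/\xedslenska.alias    universe/text/aspell-is\n'


def decode_contents_line(raw_line):
    """Decode one Contents line without raising on invalid UTF-8.

    Undecodable bytes become U+FFFD, so the line is kept, not skipped.
    """
    return raw_line.decode('UTF-8', errors='replace').rstrip('\n')


# Strict decoding fails on this line, which is what the proposed
# 'except UnicodeDecodeError: continue' works around.
try:
    raw.decode('UTF-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

line = decode_contents_line(raw)
```

With lenient decoding the package column stays readable, although the path itself no longer round-trips to the original bytes.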

Thanks!
-- 
Your team Apport upstream developers is requested to review the proposed merge of lp:~brian-murray/apport/zgrep-fallback into lp:apport.
=== modified file 'backends/packaging-apt-dpkg.py'
--- backends/packaging-apt-dpkg.py	2016-08-13 07:09:38 +0000
+++ backends/packaging-apt-dpkg.py	2016-11-07 18:30:14 +0000
@@ -13,6 +13,7 @@
 # the full text of the license.
 
 import subprocess, os, glob, stat, sys, tempfile, shutil, time
+import errno
 import hashlib
 import json
 
@@ -1221,9 +1222,25 @@
 
             # zgrep is magnitudes faster than a 'gzip.open/split() loop'
             package = None
-            zgrep = subprocess.Popen(['zgrep', '-m1', '^%s[[:space:]]' % file, map],
-                                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
-            out = zgrep.communicate()[0].decode('UTF-8')
+            try:
+                zgrep = subprocess.Popen(['zgrep', '-m1', '^%s[[:space:]]' % file, map],
+                                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+                out = zgrep.communicate()[0].decode('UTF-8')
+            except OSError as e:
+                if e.errno != errno.ENOMEM:
+                    raise
+                import gzip
+                with gzip.open('%s' % map, 'rb') as contents:
+                    out = ''
+                    for line in contents:
+                        try:
+                            line = line.decode('UTF-8').rstrip('\n')
+                        # 2016-11-01 this should be better
+                        except UnicodeDecodeError:
+                            continue
+                        if line.startswith(file):
+                            out = line
+                            break
             # we do not check the return code, since zgrep -m1 often errors out
             # with 'stdout: broken pipe'
             if out:
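A note on the fallback loop above: `line.startswith(file)` is looser than the zgrep pattern `'^%s[[:space:]]'`, because it would also match a longer path that merely begins with `file`. The following is a minimal sketch of a stricter fallback, not the code proposed in this branch. It compares raw bytes, so undecodable lines do not need to be skipped, and it requires whitespace directly after the path. The function name `find_contents_line` and its parameters are hypothetical. Unlike zgrep, it treats the path literally rather than as a regular expression.

```python
import gzip


def find_contents_line(contents_path, file_path):
    """Return the first Contents line whose path column equals file_path.

    Hypothetical helper, not part of apport. Returns '' if nothing matches.
    """
    needle = file_path.encode('UTF-8')
    with gzip.open(contents_path, 'rb') as contents:
        # Iterating the gzip file yields one line at a time, so the whole
        # Contents file is never held in memory.
        for raw_line in contents:
            if not raw_line.startswith(needle):
                continue
            # Require whitespace right after the path, mirroring the
            # '[[:space:]]' in the zgrep pattern, so 'usr/bin/foo' does
            # not match 'usr/bin/foobar'.
            rest = raw_line[len(needle):len(needle) + 1]
            if rest.isspace():
                # Decode only the matching line; invalid bytes are
                # replaced instead of raising UnicodeDecodeError.
                return raw_line.decode('UTF-8', errors='replace').rstrip('\n')
    return ''
```

The return value has the same shape as `out` in the diff (a single line, or an empty string), so under these assumptions it could feed the existing `if out:` branch unchanged.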

