launchpad-reviewers team mailing list archive

[Merge] lp:~wgrant/launchpad/gzip-n into lp:launchpad

 

William Grant has proposed merging lp:~wgrant/launchpad/gzip-n into lp:launchpad.

Commit message:
Fix RepositoryIndexFile to gzip without timestamps.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~wgrant/launchpad/gzip-n/+merge/307500

Fix RepositoryIndexFile to gzip without timestamps.

Avoids polluting by-hash directories with dozens of identical gzipped files
when index content doesn't otherwise change. Also prevents some needless hash
sum mismatch errors from apt.

See e.g. the 30 or so 544-byte gzip files in http://ppa.launchpad.net/varlesh-l/test/ubuntu/dists/xenial/main/source/by-hash/SHA256/.

bzip2 and xz don't store mtime, so aren't affected.
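
To illustrate the effect described above, here is a minimal sketch (not part of the branch) showing that the same content gzipped with different header timestamps hashes differently, while mtime=0 gives a stable hash. It assumes Python 3 and works in memory; the helper names gzip_bytes and sha256_hex and the timestamp values are hypothetical, chosen only for the illustration.

```python
import gzip
import hashlib
import io


def gzip_bytes(data, mtime=None):
    """Compress data in memory and return the raw gzip stream.

    mtime=None lets GzipFile stamp the current time into the header;
    mtime=0 mimics "gzip -n". The filename is blanked in both cases.
    """
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", filename="", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


def sha256_hex(data):
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


payload = b"hello"

# Two publisher runs an hour apart (arbitrary example timestamps).
stamped_then = gzip_bytes(payload, mtime=1475544529)
stamped_later = gzip_bytes(payload, mtime=1475544529 + 3600)

# The same two runs with the timestamp zeroed.
fixed_a = gzip_bytes(payload, mtime=0)
fixed_b = gzip_bytes(payload, mtime=0)

print("timestamped hashes equal:", sha256_hex(stamped_then) == sha256_hex(stamped_later))
print("mtime=0 hashes equal:    ", sha256_hex(fixed_a) == sha256_hex(fixed_b))
```

Only the header differs between the two timestamped streams, but that is enough to give each one its own by-hash entry.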
-- 
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~wgrant/launchpad/gzip-n into lp:launchpad.
=== modified file 'lib/lp/archivepublisher/tests/test_repositoryindexfile.py'
--- lib/lp/archivepublisher/tests/test_repositoryindexfile.py	2016-02-05 20:28:29 +0000
+++ lib/lp/archivepublisher/tests/test_repositoryindexfile.py	2016-10-04 01:28:49 +0000
@@ -99,7 +99,8 @@
         repo_file.write('hello')
         repo_file.close()
 
-        gzip_content = gzip.open(os.path.join(self.root, 'boing.gz')).read()
+        gzip_file = gzip.open(os.path.join(self.root, 'boing.gz'))
+        gzip_content = gzip_file.read()
         bz2_content = bz2.decompress(
             open(os.path.join(self.root, 'boing.bz2')).read())
         xz_content = lzma.open(os.path.join(self.root, 'boing.xz')).read()
@@ -108,6 +109,12 @@
         self.assertEqual(gzip_content, xz_content)
         self.assertEqual('hello', gzip_content)
 
+        # gzip is compressed as if with "-n", ensuring that the hash
+        # doesn't change just because we're compressing at a different
+        # point in time. The filename is also blank, but Python's gzip
+        # module discards it so it's hard to test.
+        self.assertEqual(0, gzip_file.mtime)
+
     def testCompressors(self):
         """`RepositoryIndexFile` honours the supplied list of compressors."""
         repo_file = self.getRepoFile(

=== modified file 'lib/lp/archivepublisher/utils.py'
--- lib/lp/archivepublisher/utils.py	2016-06-06 15:38:54 +0000
+++ lib/lp/archivepublisher/utils.py	2016-10-04 01:28:49 +0000
@@ -85,7 +85,10 @@
     suffix = '.gz'
 
     def _buildFile(self, fd):
-        return gzip.GzipFile(fileobj=os.fdopen(fd, "wb"))
+        # Blank the filename and mtime as if using "gzip -n" to avoid
+        # needless hash changes.
+        return gzip.GzipFile(
+            fileobj=os.fdopen(fd, "wb"), filename='', mtime=0)
 
 
 class Bzip2TempFile(PlainTempFile):
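
The test comment above notes that the blank filename is hard to check through Python's gzip module. One possible way to check it, sketched here under the assumption of Python 3 and not taken from the branch, is to read the fixed 10-byte gzip header directly and inspect the FNAME flag bit defined in RFC 1952. The function name gzip_header_info and the "Sources" filename are hypothetical.

```python
import gzip
import io
import struct

FNAME = 0x08  # FLG bit: an original filename follows the header (RFC 1952)


def gzip_header_info(raw):
    """Return (mtime, has_filename) from the fixed 10-byte gzip header."""
    # Layout: magic (2 bytes), method, flags, mtime (uint32 LE), xfl, os.
    magic, method, flags, mtime, xfl, os_id = struct.unpack(
        "<2sBBIBB", raw[:10])
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return mtime, bool(flags & FNAME)


def compress(data, filename, mtime):
    """Gzip data in memory with the given header filename and mtime."""
    buf = io.BytesIO()
    with gzip.GzipFile(
            fileobj=buf, mode="wb", filename=filename, mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


# As if compressed with "gzip -n": no timestamp, no embedded filename.
blank = gzip_header_info(compress(b"hello", filename="", mtime=0))

# For contrast: a stream that records both a filename and a timestamp.
named = gzip_header_info(compress(b"hello", filename="Sources", mtime=123))

print("blank:", blank)
print("named:", named)
```

This only inspects the header; it does not validate the compressed payload or the trailing CRC.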

