← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~cjwatson/launchpad/close-bugs-invalid-utf8 into lp:launchpad

 

Colin Watson has proposed merging lp:~cjwatson/launchpad/close-bugs-invalid-utf8 into lp:launchpad.

Commit message:
Make SPR.aggregate_changelog cope with invalid UTF-8.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #1083131 in Launchpad itself: "Closing bugs OOPSes on invalid UTF-8"
  https://bugs.launchpad.net/launchpad/+bug/1083131

For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/close-bugs-invalid-utf8/+merge/136164

== Summary ==

Bug 1083181: SPR.aggregate_changelog OOPSes on invalid UTF-8 in changelogs.  This is not permitted by Debian policy but still occasionally happens due to misconfigured editors and the like; we don't have to go to any particular effort to display it nicely, but it shouldn't cause an OOPS.

== Proposed fix ==

Manually decode the changelog contents using errors="replace".  The only parts of the changelog that are semantically significant are guaranteed to be ASCII, so this should be safe.

== LOC Rationale ==

+23.  I have 7000 or so lines of credit.

== Tests ==

bin/test -vvct TestSourcePackageRelease.test_aggregate_changelog

== Demo and Q/A ==

Reupload gworkspace 0.8.8-1.1build1 to raring-proposed on dogfood and then attempt to copy it to the raring release pocket.

== Lint ==

Just a false positive on a property setter.
-- 
https://code.launchpad.net/~cjwatson/launchpad/close-bugs-invalid-utf8/+merge/136164
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~cjwatson/launchpad/close-bugs-invalid-utf8 into lp:launchpad.
=== modified file 'lib/lp/soyuz/model/sourcepackagerelease.py'
--- lib/lp/soyuz/model/sourcepackagerelease.py	2012-11-15 01:41:14 +0000
+++ lib/lp/soyuz/model/sourcepackagerelease.py	2012-11-26 13:09:27 +0000
@@ -603,7 +603,8 @@
         # only useful way of extracting info is to use the iterator on
         # Changelog and then compare versions.
         try:
-            for block in Changelog(changelog.read()):
+            changelog_text = changelog.read().decode("UTF-8", "replace")
+            for block in Changelog(changelog_text):
                 version = block._raw_version
                 if (since_version and
                     apt_pkg.version_compare(version, since_version) <= 0):

=== modified file 'lib/lp/soyuz/tests/test_sourcepackagerelease.py'
--- lib/lp/soyuz/tests/test_sourcepackagerelease.py	2012-01-01 02:58:52 +0000
+++ lib/lp/soyuz/tests/test_sourcepackagerelease.py	2012-11-26 13:09:27 +0000
@@ -1,3 +1,7 @@
+# -*- coding: utf-8 -*-
+# NOTE: The first line above must stay first; do not move the copyright
+# notice to the top.  See http://www.python.org/dev/peps/pep-0263/.
+#
 # Copyright 2010-2011 Canonical Ltd.  This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).
 
@@ -110,6 +114,24 @@
             since_version="1.0")
         self.assertEqual(expected_changelog, observed)
 
+    def test_aggregate_changelog_invalid_utf8(self):
+        # aggregate_changelog copes with invalid UTF-8.
+        changelog_main = dedent(u"""\
+            foo (1.0) unstable; urgency=low
+
+              * 1.0 (héllo).""").encode("ISO-8859-1")
+        changelog_trailer = (
+            u" -- Føo Bær <foo@xxxxxxxxxxx>  "
+            u"Tue, 01 Jan 1970 01:50:41 +0000").encode("ISO-8859-1")
+        changelog_text = changelog_main + b"\n\n" + changelog_trailer
+        changelog = self.factory.makeLibraryFileAlias(content=changelog_text)
+        spph = self.factory.makeSourcePackagePublishingHistory(
+            sourcepackagename="foo", version="1.0", changelog=changelog)
+        transaction.commit()
+        observed = spph.sourcepackagerelease.aggregate_changelog(
+            since_version=None)
+        self.assertEqual(changelog_main.decode("UTF-8", "replace"), observed)
+
 
 class TestGetActiveArchSpecificPublications(TestCaseWithFactory):
 


Follow ups