launchpad-reviewers team mailing list archive
-
launchpad-reviewers team
-
Mailing list archive
-
Message #29882
[Merge] ~cjwatson/launchpad:removed-archive-files-not-reapable into launchpad:master
Colin Watson has proposed merging ~cjwatson/launchpad:removed-archive-files-not-reapable into launchpad:master.
Commit message:
Don't consider removed ArchiveFiles as eligible for reaping
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
For more details, see:
https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/440415
Commit 559bbefbb5d7327debd5343c1e812787f5d0caad introduced a gradual performance regression for PPA publishing. In order to support serving old snapshots, `ArchiveFile` rows now have their `date_removed` column set rather than being deleted from the database. However, I failed to account for this in the code that decides which PPAs to publish due to pending deletions and which suites in each PPA should have `by-hash` files pruned from them. As a result, the PPA publisher has been getting gradually slower as we accumulate more removed `ArchiveFile` rows.
Excluding removed rows makes the query for pending-publication PPAs substantially faster since it has an order of magnitude fewer `ArchiveFile` rows to sort, and it should mean that we spend less effort publishing the same PPAs over and over again.
--
Your team Launchpad code reviewers is requested to review the proposed merge of ~cjwatson/launchpad:removed-archive-files-not-reapable into launchpad:master.
diff --git a/lib/lp/archivepublisher/tests/test_publisher.py b/lib/lp/archivepublisher/tests/test_publisher.py
index c7bbea7..ab3fd82 100644
--- a/lib/lp/archivepublisher/tests/test_publisher.py
+++ b/lib/lp/archivepublisher/tests/test_publisher.py
@@ -1499,6 +1499,8 @@ class TestPublisher(TestPublisherBase):
archive_file
).scheduled_deletion_date = now - timedelta(hours=12)
self.assertIn(archive, ubuntu.getPendingPublicationPPAs())
+ getUtility(IArchiveFileSet).markDeleted([archive_file])
+ self.assertNotIn(archive, ubuntu.getPendingPublicationPPAs())
def testDirtySuitesArchive(self):
# getPendingPublicationPPAs returns archives that have dirty_suites
diff --git a/lib/lp/registry/model/distribution.py b/lib/lp/registry/model/distribution.py
index 36f3543..e02fc3d 100644
--- a/lib/lp/registry/model/distribution.py
+++ b/lib/lp/registry/model/distribution.py
@@ -1969,6 +1969,7 @@ class Distribution(
Archive.distribution == self,
ArchiveFile.archive == Archive.id,
ArchiveFile.scheduled_deletion_date < UTC_NOW,
+ ArchiveFile.date_removed == None,
)
.order_by(Archive.id)
.config(distinct=True)
diff --git a/lib/lp/soyuz/model/archivefile.py b/lib/lp/soyuz/model/archivefile.py
index 89576a4..f9bbe36 100644
--- a/lib/lp/soyuz/model/archivefile.py
+++ b/lib/lp/soyuz/model/archivefile.py
@@ -215,6 +215,7 @@ class ArchiveFileSet:
clauses = [
ArchiveFile.archive == archive,
ArchiveFile.scheduled_deletion_date < _now(),
+ ArchiveFile.date_removed == None,
]
if container_prefix is not None:
clauses.append(ArchiveFile.container.startswith(container_prefix))
diff --git a/lib/lp/soyuz/tests/test_archivefile.py b/lib/lp/soyuz/tests/test_archivefile.py
index 13a89f3..85d006f 100644
--- a/lib/lp/soyuz/tests/test_archivefile.py
+++ b/lib/lp/soyuz/tests/test_archivefile.py
@@ -351,6 +351,10 @@ class TestArchiveFile(TestCaseWithFactory):
archive, container_prefix="release:"
),
)
+ archive_file_set.markDeleted([archive_files[3]])
+ self.assertContentEqual(
+ ["release:foo"], archive_file_set.getContainersToReap(archive)
+ )
def test_markDeleted(self):
archive = self.factory.makeArchive()