← Back to team overview

launchpad-reviewers team mailing list archive

Re: [Merge] ~cjwatson/launchpad:db-artifactory-publish into launchpad:db-devel

 

Review: Needs Fixing db



Diff comments:

> diff --git a/database/schema/patch-2210-44-0.sql b/database/schema/patch-2210-44-0.sql
> new file mode 100644
> index 0000000..bf5e78c
> --- /dev/null
> +++ b/database/schema/patch-2210-44-0.sql
> @@ -0,0 +1,106 @@
> +-- Copyright 2022 Canonical Ltd.  This software is licensed under the
> +-- GNU Affero General Public License version 3 (see the file LICENSE).
> +
> +SET client_min_messages=ERROR;
> +
> +-- STEP 1, COLD
> +
> +ALTER TABLE Archive
> +    -- 0 == LOCAL
> +    ADD COLUMN publishing_method integer DEFAULT 0,
> +    -- 0 == DEBIAN
> +    ADD COLUMN index_format integer DEFAULT 0;

It's not just the index format that differs, since e.g. the pool structure is rather apt-specific. repository_format or repository_type instead?

> +
> +ALTER TABLE SourcePackageRelease
> +    ADD COLUMN ci_build integer REFERENCES CIBuild,
> +    ADD CONSTRAINT at_most_one_build CHECK (
> +        null_count(ARRAY[sourcepackage_recipe_build, ci_build]) >= 1),

This will likely take approximately a very long time.

> +    ALTER COLUMN component DROP NOT NULL,
> +    ALTER COLUMN section DROP NOT NULL,
> +    ALTER COLUMN urgency DROP NOT NULL,
> +    ALTER COLUMN dsc_format DROP NOT NULL,
> +    ADD CONSTRAINT debian_columns CHECK (
> +        -- 1 == DPKG
> +        format != 1
> +        OR (component IS NOT NULL
> +            AND section IS NOT NULL
> +            AND urgency IS NOT NULL
> +            AND dsc_format IS NOT NULL)),

Also roughly eternal.

> +    -- This is always non-NULL for Debian-format source packages, but it's
> +    -- not particularly important to constrain this at the DB level.
> +    ALTER COLUMN maintainer DROP NOT NULL;
> +
> +ALTER TABLE SourcePackagePublishingHistory
> +    ADD COLUMN format integer,
> +    ALTER COLUMN component DROP NOT NULL,
> +    ALTER COLUMN section DROP NOT NULL,
> +    ADD CONSTRAINT debian_columns CHECK (
> +        (format IS NOT NULL
> +            -- 1 == DPKG
> +            AND format != 1)

COALSECE(format, 1) != 1?

> +        OR (component IS NOT NULL
> +            AND section IS NOT NULL)),
> +    ADD COLUMN channel jsonb;

Should this also be constrained to prevent channel from being set for format==DPKG?

Also likely slow.

> +
> +ALTER TABLE BinaryPackageRelease
> +    ADD COLUMN ci_build integer REFERENCES CIBuild,
> +    ALTER COLUMN build DROP NOT NULL,
> +    ADD CONSTRAINT one_build CHECK (null_count(ARRAY[build, ci_build]) = 1),
> +    ALTER COLUMN component DROP NOT NULL,
> +    ALTER COLUMN section DROP NOT NULL,
> +    ALTER COLUMN priority DROP NOT NULL,
> +    ADD CONSTRAINT debian_columns CHECK (
> +        -- 1 == DEB, 2 == UDEB, 5 == DDEB
> +        binpackageformat NOT IN (1, 2, 5)
> +        OR (component IS NOT NULL
> +            AND section IS NOT NULL
> +            AND priority IS NOT NULL));
> +
> +ALTER TABLE BinaryPackagePublishingHistory
> +    ADD COLUMN binarypackageformat integer,
> +    ALTER COLUMN component DROP NOT NULL,
> +    ALTER COLUMN section DROP NOT NULL,
> +    ALTER COLUMN priority DROP NOT NULL,
> +    ADD CONSTRAINT debian_columns CHECK (
> +        (binarypackageformat IS NOT NULL
> +            -- 1 == DEB, 2 == UDEB, 5 == DDEB
> +            AND binarypackageformat NOT IN (1, 2, 5))
> +        OR (component IS NOT NULL
> +            AND section IS NOT NULL
> +            AND priority IS NOT NULL)),
> +    ADD COLUMN channel jsonb;
> +
> +
> +-- Subsequent statements, to be executed live and in subsequent patches.
> +
> +/*
> +-- STEP 2, HOT
> +
> +CREATE INDEX sourcepackagerelease__ci_build__idx
> +    ON SourcePackageRelease (ci_build);
> +
> +CREATE INDEX sourcepackagepublishinghistory__channel__idx
> +    ON SourcePackagePublishingHistory (channel);

It doesn't seem massively likely that unscoped channel is interesting to query on. Inside an archive, maybe.

> +
> +CREATE UNIQUE INDEX binarypackagerelease__bpn__build__version__key
> +    ON BinaryPackageRelease (
> +        binarypackagename, COALESCE(build, ci_build), version);

I'm not quite sure what the purpose of binarypackagerelease_binarypackagename_key was in the first place, given that binarypackagerelease_build_name_uniq is stricter and it doesn't seem interesting for lookups.

Am I missing something, or should we drop it without replacement?

> +CREATE UNIQUE INDEX binarypackagerelease__build__bpn__key
> +    ON BinaryPackageRelease (COALESCE(build, ci_build), binarypackagename);

COALESCEing two columns is almost certainly wrong in a unique constraint like this -- and wrong in a way we won't notice until it explodes by chance in a few years when build.id and cibuild.id happen to overlap for the same BPN. Consider something like (COALESCE(build, -1), COALESCE(ci_build, -1), binarypackagename).

It's also not strictly necessary that we coalesce at all. Two separate (optionally partial) indexes would work just fine.

> +CREATE INDEX binarypackagerelease__ci_build__idx
> +    ON BinaryPackageRelease (ci_build);
> +
> +CREATE INDEX binarypackagepublishinghistory__channel__idx
> +    ON BinaryPackagePublishingHistory (channel);
> +
> +
> +-- STEP 3, COLD
> +
> +-- Replaced by binarypackagerelease__bpn__build__version__key and
> +-- binarypackagerelease__build__bpn__key respectively.
> +ALTER TABLE BinaryPackageRelease
> +    DROP CONSTRAINT binarypackagerelease_binarypackagename_key,
> +    DROP CONSTRAINT binarypackagerelease_build_name_uniq;

There's a slim possibility that this index is used for lookups. But probably not.

> +*/
> +
> +INSERT INTO LaunchpadDatabaseRevision VALUES (2210, 44, 0);


-- 
https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/416727
Your team Launchpad code reviewers is requested to review the proposed merge of ~cjwatson/launchpad:db-artifactory-publish into launchpad:db-devel.