launchpad-dev team mailing list archive
Message #07764
Re: Help! Knowing what packages are in a distribution
On Thursday 18 August 2011 16:11:00 Curtis Hovey wrote:
> This is a long email. If you have knowledge of packages, source package
> branches, and distros, please attempt to reply.
I'm attempting. It's been a long day...
> A DSP is a valid choice if:
> * The source was published by the distro.
> Once published, it is forever a valid choice.
> <https://launchpad.net/ubuntu/+source/mysql>
> <https://launchpad.net/ubuntu/+source/devicekit> (still published)
> <https://launchpad.net/ubuntu/+source/devicekit> (deleted,
> yet still valid)
How did you decide on the validity rules? Particularly the last one. (I am
curious)
> Packages that were never published nor had a branch are not valid.
> <https://launchpad.net/ubuntu/+source/epsoneplijs>
> Was never published in Ubuntu. It was never published in any
> debian-based distro. No distro has a branch for it. It is invalid.
That page exists probably because the package exists in a PPA. Does presence
in a PPA count for anything?
(sorry, more questions than answers so far)
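To pin down what is being asked, here is a minimal sketch of the validity rule as I read it from the quoted text: a DSP is valid if the distro ever published the source (deleted publications included) or if it has a branch. The `PackageFacts` class, its field names, and the `ppa_counts` switch are all hypothetical, not Launchpad code. The switch is there only to make the open PPA question explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageFacts:
    """Hypothetical summary of one source package name within one distro."""
    ever_published_in_primary: bool  # any publication, deleted ones included
    ever_published_in_ppa: bool      # assumption: PPA history is tracked separately
    has_branch: bool                 # a source package branch exists


def is_valid_dsp(facts: PackageFacts, ppa_counts: bool = False) -> bool:
    """Return True if the package is a valid DSP choice under the quoted rules."""
    # Once published by the distro, always valid; a branch alone also qualifies.
    if facts.ever_published_in_primary or facts.has_branch:
        return True
    # Open question from the thread: does presence in a PPA count for anything?
    return ppa_counts and facts.ever_published_in_ppa


# A package seen only in a PPA, with no branch (the epsoneplijs-style case).
ppa_only = PackageFacts(False, True, False)
# A package whose only publication was later deleted.
deleted = PackageFacts(True, False, False)
```

With the rules as quoted, `is_valid_dsp(ppa_only)` is false and `is_valid_dsp(deleted)` is true. Flipping `ppa_counts` changes only the first answer.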
> Though there is work going on to demoralise the data to make the DSP
This is one of the best typos I've ever seen you make. If it's deliberate to
see if we were paying attention, then kudos. If it wasn't, you still get
kudos for giving me a massive belly laugh.
> queries fast, I doubt the vocab is a viable path to reach the goal shown
> in the demo. We need to search pillars, packages and branches quickly.
>
> The rules imply that once a DSP is determined to be valid, it is always
> valid. Lp needs to recheck invalid DSPs that may have been added.
As I mentioned above, why do you need deleted publications to be valid?
> I believe we need a resource that represents every DSP that is a valid
> choice. Lp does have a DSP table, and parts of Lp do treat it as a
> definitive resource, but it is flawed as William Grant pointed out. The
> table has two purposes. It stores DSP information that is beyond the package
> and branch, like bug supervisor or bug reporting guidelines. It also
> stores facts like PO message count and bug heat. We want this data for
> every valid DSP, but not every DSP is in the table, and there are more
> than a thousand impossible DSPs in the table.
So there are two DSPs in the model code, as I suspect you know. There's a fake
one and a real one, which is in the DB.
We thought a while ago that getting rid of the fake one is a seriously good
idea but there was never enough impetus to do the work. It would solve a raft
of problems with performance on some of the Soyuz pages.
The idea is that it would be a cache of the latest publication data and other
bits required from other tables. Keeping it up to date is pretty easy since
we create all publications in one place in the code.
I'd be very happy to see this work done and the old DSP fake removed.
(I forgot the name of this data pattern but it's a common one I've seen and
used before)
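A minimal sketch of that idea, assuming a single function through which every publication is created. All the names here (`Publication`, `DSPCacheRow`, `create_publication`, the in-memory `dsp_cache`) are hypothetical stand-ins, not the real Soyuz model code. In practice the cache would be the DSP table in the DB.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Publication:
    """Hypothetical stand-in for one row of publishing history."""
    distro: str
    source_name: str
    version: str
    status: str  # e.g. "published" or "deleted"


@dataclass
class DSPCacheRow:
    """Hypothetical cached view of the latest publication for one DSP."""
    latest_version: str
    latest_status: str


publications: List[Publication] = []                   # the full history
dsp_cache: Dict[Tuple[str, str], DSPCacheRow] = {}     # one row per DSP


def create_publication(distro: str, source_name: str, version: str,
                       status: str = "published") -> Publication:
    """Record a publication and refresh the DSP cache in the same step."""
    pub = Publication(distro, source_name, version, status)
    publications.append(pub)
    # Because this is the only place publications are created, the cache
    # cannot drift from the history as long as this line runs every time.
    dsp_cache[(distro, source_name)] = DSPCacheRow(version, status)
    return pub


create_publication("ubuntu", "mysql", "5.1")
create_publication("ubuntu", "mysql", "5.5")
```

After the two calls the history holds both publications, while the cache holds a single row for ("ubuntu", "mysql") carrying the later version. Pages that only need the latest state can read the cache and never touch the history.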
>
> In the case of impossible DSPs in the table, they are mostly historic
> entries created by users to target a bug to a distro that does not use
> Lp to manage bugs. There is only one bad entry for Ubuntu, there are
> many for Debian that need investigation. There are about 1000 rows for
> distros that never used Lp to track bugs, do not have publishing
> history, do not have branches.
>
> There are two reasons for missing rows. Changes were made in the last
> year or two to ensure that every uploaded source package has a DSP
> entry. Packages that were in older Ubuntu releases are not present. I
> imagine we could make the missing DSP rows by mining the source package
> publishing history table.
Presumably this is only packages that are in older Ubuntu series but not in
the latest?
In that case it needs entries for those packages to record that the latest
status is "deleted".
Yes, we need to get some hard-hats on and go mining to fill these gaps.
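Roughly, the mining could look like the sketch below. It assumes the publishing history can be read as (distro, source name, ordering value, status) tuples, which is a simplification of the real table, and `backfill_missing` is a hypothetical helper. For each package with no DSP row, it keeps the status of the most recent publication, so a package that only exists in older series ends up recorded as "deleted".

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical flattened history row: (distro, source_name, order, status).
# "order" is any value that increases with time, e.g. a publication id.
HistoryRow = Tuple[str, str, int, str]
DSPKey = Tuple[str, str]


def backfill_missing(history: Iterable[HistoryRow],
                     existing: Set[DSPKey]) -> Dict[DSPKey, str]:
    """Return the latest status for every DSP that has history but no row."""
    latest: Dict[DSPKey, Tuple[int, str]] = {}
    for distro, name, order, status in history:
        key = (distro, name)
        if key in existing:
            continue  # already has a DSP row, nothing to backfill
        # Keep only the most recent publication seen so far for this key.
        if key not in latest or order > latest[key][0]:
            latest[key] = (order, status)
    return {key: status for key, (_, status) in latest.items()}


history = [
    ("ubuntu", "devicekit", 1, "published"),
    ("ubuntu", "devicekit", 2, "deleted"),
    ("ubuntu", "mysql", 1, "published"),
]
missing = backfill_missing(history, existing={("ubuntu", "mysql")})
```

Here `missing` contains one entry, for ("ubuntu", "devicekit"), with status "deleted". mysql is skipped because it already has a row.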
> We have no mechanism to ensure that source package branches are
> represented in the DSP table.
What needs to happen if there's a branch for an unpublished source?
> Maybe the branch scanner could do this.
If it did, we'd need to be careful about what publishing data it has as per my
idea above.
> Principia, a distro with just branches does have rows in the DSP table
> for the packages the owners have configured or have bugs. Most are not
> represented, so bugs cannot be retargeted to them. How do we ensure that
> source package branches are in the table?
It seems like the table needs to be written to from two different places: the
branch scanner and the Soyuz publishing code. Obviously that comes with the
caveats of locking, data integrity and data overlap. Maybe there needs to be
another table?