ubuntu-translations-coordinators team mailing list archive
Message #00350
Re: An alternative to collectively deleting everything from the import queue which is older than 6 months
Arne Goetje wrote:
> Objective: to reduce the items in the Needs Review queue to those we
> care about:
The subject line surprises me. Alternative? We discussed this very
recently and concluded that while there are many _additional_ things we
could do at the expense of more engineering time, deleting entries that
haven't been updated or reviewed for more than six months was an
acceptable and useful thing we could do _immediately_.
So why an "alternative"? Besides there being more that we could do,
what is actually wrong with the 6-month timeout? If it's deleting too
aggressively, we can easily tweak the timeout before the deletions
start. But "replace it with something much more advanced" is not
something we can give you in quite the same timeframe!
> For the active *development* cycle, we have five cases we care about:
> * .po without .pot: a package has uploaded .po files without .pot.
> This is a packaging bug, that we need to identify and get fixed.
> * .po or .pot with changed file path: if there is already a template
> in the database, but the file path in the package changed, we need to
> update the file path in the template to get them imported.
> * .pot which has been moved (exists already in the db for a different
> src pkg)
> * new .pot needs approval
> * .po files with language codes which cannot automatically get detected
Then what cases are there that you don't care about? AFAICS everything
else gets auto-approved, apart from one further category: "uploads that
we simply haven't had time to review." More about that one later.
To help you recognize these cases we'll need to implement new approval
mechanisms that integrate more tightly with Soyuz. It's something we've
all been hoping to get around to for at least a year now. I see how
it's worth doing, but it'll take more time to do than the 6-month
timeout. We could probably reuse a lot of Henning's work for the branch
auto-approver, but there's more because we have to accommodate more
directory layouts.
However, since the active development cycle isn't longer than 6 months,
I see all this as completely orthogonal to the 6-month timeout. The
6-month timeout is not going to delete any Lucid uploads until work on
Mystic Manitoba is well underway.
> For *stable releases* we only care about the following cases:
> * .po or .pot with changed file path
> * new .pot needs approval (updates or security upload introduces a new
> .pot)
> * .po files with language codes which cannot automatically get detected
That leaves only two categories to be deleted: PO files without
templates, and uploads that we haven't had time to review.
For PO files without templates, we can't currently tell these from PO
files for templates that have moved. We'd need the new approval
mechanism for that. But say that we had the ability to recognize PO
files for missing templates; wouldn't you still want the PO files to
stay on the queue for a while so you could upload a template manually if
needed?
FWIW I think "uploads that have been neither approved nor updated in 6
months" must be a pretty decent approximation of what you want here. PO
files without templates get kept for a while, though probably a bit
longer than you want, then discarded. Does it really matter much _why_
an upload gets to the 6-month timeout? If a Lucid upload gets to that
point, I would assume that:
1. You're busy with Myopic Manatee or Nihilistic Nit or Opulent Opossum
or Prudent Parsnip, and things just aren't going to quiet down enough to
let you deal with the entry before Lucid Lynx EOLs.
2. As the entry continues to sit on the queue, it's going to slow you
(and the auto-approver) down and annoy you more than it'll ever help you.
3. You probably don't care _that_ much about this particular entry. If
the upload matters at all, by now it has a younger sibling in Mystic
Meerkat that's much more important to review.
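
To make the rule concrete, here is a rough sketch of the timeout as I've described it. This is not Rosetta's actual code: the QueueEntry class, its field names, the status strings and the 183-day figure are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: "six months" taken as 183 days; the real cutoff is tunable.
TIMEOUT = timedelta(days=183)


@dataclass
class QueueEntry:
    # Hypothetical stand-in for an import queue entry.
    path: str
    status: str             # e.g. "NEEDS_REVIEW" or "APPROVED"
    last_updated: datetime  # reset on every re-upload or review


def is_stale(entry, now, timeout=TIMEOUT):
    """True if the entry still awaits review and nobody touched it in time."""
    return entry.status == "NEEDS_REVIEW" and now - entry.last_updated > timeout


def prune(queue, now, timeout=TIMEOUT):
    """Return the queue without its stale entries."""
    return [entry for entry in queue if not is_stale(entry, now, timeout)]
```

Note that the rule never asks _why_ an entry is still waiting; it only looks at status and age, which is what keeps it cheap to implement.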
> Suggestions how to fix the problem:
> * only keep the latest upload of any file in the queue, discard the
> rest after a short grace period
Of the uploads from package builds, which is what you normally see on
the queue, we only keep the latest upload of any file in the queue _at
all_. If a new version of the same file is uploaded, it just updates
the same queue entry in-place. And its 6-month timer gets reset, of course.
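
In sketch form, the in-place update looks something like this. The dict-based queue and the (release, package, path) key are assumptions for the sake of the example, not the real schema.

```python
from datetime import datetime


def record_upload(queue, release, package, path, content, now):
    """Store an upload, reusing the existing entry for the same file.

    `queue` is a plain dict keyed on (release, package, path); that key
    is a guess at what identifies "the same file" for a package upload.
    """
    key = (release, package, path)
    entry = queue.get(key)
    if entry is None:
        # First upload of this file: create a new queue entry.
        entry = queue[key] = {"content": content, "last_updated": now}
    else:
        # Newer upload of the same file: overwrite in place, reset the timer.
        entry["content"] = content
        entry["last_updated"] = now
    return entry
```

A changed path produces a different key and therefore a second entry, which is exactly the case described next.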
Things are different when the file's path changes in the meantime, but
that's part of the harder problems that the approver has to deal with.
Things are also different when many different people upload files, but
those may still have useful translations so I'm assuming you don't want
us to discard them.
> * add tests (actually the import script should run these tests already)
> for the cases described above and add the import failure reason to the
> entry in the UI.
I would certainly like to work on storing the reason for an approval
failure in the queue entry. (I assume you mean approval failure, since
we already store reports of import failures.) Doing that for the
existing approval mechanism could be complicated in many cases, though.
If an entry doesn't get approved, it's not usually because something
known and specific is wrong with the entry but because a lot of patterns
were tried but none of them matched, and we can't make a safe decision
about which one fit best. If we knew exactly what was wrong with the
upload, well, we'd fix it instead of giving you an error message!
> * for templates which have been moved, show the existing reference in
> the UI, e.g. this template exists already in source pkg "foo"
The hard part about that is knowing that a template has moved in the
first place. Rosetta gets no more information about this than you do,
and unlike you it doesn't have common sense. Right now it doesn't even
see other uploads that it could see patterns in.
Luckily there's also an easy part: warning about existing domains. The
approver doesn't guess or set a template's domain. You tell it the
domain when you approve the template. We could add a check to the form:
"it looks like you'll be wanting domain foo for this template but that
domain is already in use."
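
A minimal sketch of such a form check, assuming templates are available as simple records. The dict fields "release", "domain", "source_package" and "active" are hypothetical names, not the actual database columns.

```python
def domain_conflicts(domain, release, templates):
    """Return the active templates in `release` that already use `domain`."""
    return [
        t for t in templates
        if t["active"] and t["release"] == release and t["domain"] == domain
    ]


def domain_warning(domain, release, templates):
    """Build a warning for the approval form, or None if the domain is free."""
    conflicts = domain_conflicts(domain, release, templates)
    if not conflicts:
        return None
    packages = ", ".join(sorted(t["source_package"] for t in conflicts))
    return "Domain %s is already in use in %s by: %s" % (domain, release, packages)
```

The check only warns; whether to approve anyway stays with the reviewer.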
Another thing we could do is produce regular lists of domains that have
more than one active template in any given Ubuntu release. It's not
much help for new cases, I imagine, but it might be a catalyst for
getting existing cases out of the way before they trigger approval
conflicts or cause packaging problems.
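
Such a list could be produced along these lines. Again the record layout (dicts with "release", "domain", "source_package" and "active") is invented for the example; the real thing would be a database query.

```python
from collections import defaultdict


def duplicate_domains(templates):
    """Map (release, domain) to its source packages wherever more than
    one active template shares that domain within a release."""
    by_key = defaultdict(list)
    for t in templates:
        if t["active"]:
            by_key[(t["release"], t["domain"])].append(t["source_package"])
    return {
        key: sorted(packages)
        for key, packages in by_key.items()
        if len(packages) > 1
    }
```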
Jeroen