
ubuntu-translations-coordinators team mailing list archive

Re: An alternative to collectively deleting everything from the import queue which is older than 6 months

 

Arne Goetje wrote:

> Objective: to reduce the items in the Needs Review queue to those we care about:

The subject line surprises me. Alternative? We discussed this very recently and concluded that while there are many _additional_ things we could do at the expense of more engineering time, deleting entries that haven't been updated or reviewed for more than six months was an acceptable and useful thing we could do _immediately_.

So why an "alternative"? Besides there being more that we could do, what is actually wrong with the 6-month timeout? If it's deleting too aggressively, we can easily tweak the timeout before the deletions start. But "replace it with something much more advanced" is not something we can give you in quite the same timeframe!
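To make concrete how little there is to the timeout, here is a minimal Python sketch of the rule. The `QueueEntry` class, its fields, the status strings, and the `stale_entries` helper are all hypothetical names for illustration, not Launchpad's actual code. It assumes the rule only touches entries still waiting in Needs Review, and it approximates six months as 182 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical stand-in for an import queue entry; not the real Launchpad model.
@dataclass
class QueueEntry:
    path: str
    status: str             # assumed values: "NEEDS_REVIEW", "APPROVED", ...
    last_changed: datetime  # time of the last upload or status change


def stale_entries(entries, now, max_age=timedelta(days=182)):
    """Return Needs Review entries that nobody has touched for max_age."""
    cutoff = now - max_age
    return [
        entry for entry in entries
        if entry.status == "NEEDS_REVIEW" and entry.last_changed < cutoff
    ]
```

Tweaking the timeout is then just a matter of passing a different `max_age`.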


> For the active *development* cycle, we have five cases we care about:
>  * .po without .pot: a package has uploaded .po files without .pot. This is a packaging bug, that we need to identify and get fixed.
>  * .po or .pot with changed file path: if there is already a template in the database, but the file path in the package changed, we need to update the file path in the template to get them imported.
>  * .pot which has been moved (exists already in the db for a different src pkg)
>  * new .pot needs approval
>  * .po files with language codes which cannot automatically get detected

Then what cases are there that you don't care about? AFAICS everything else gets auto-approved, apart from one further category: "uploads that we simply haven't had time to review." More about that one later.

To help you recognize these cases we'll need to implement new approval mechanisms that integrate more tightly with Soyuz. It's something we've all been hoping to get around to for at least a year now. I see how it's worth doing, but it'll take more time to do than the 6-month timeout. We could probably reuse a lot of Henning's work for the branch auto-approver, but there's more because we have to accommodate more directory layouts.

However, since the active development cycle isn't longer than 6 months, I see all this as completely orthogonal to the 6-month timeout. The 6-month timeout is not going to delete any Lucid uploads until work on Mystic Manitoba is well underway.


> For *stable releases* we only care about the following cases:
>  * .po or .pot with changed file path
>  * new .pot needs approval (updates or security upload introduces a new .pot)
>  * .po files with language codes which cannot automatically get detected

That leaves only two categories to be deleted: PO files without templates, and uploads that we haven't had time to review.

For PO files without templates, we can't currently tell these from PO files for templates that have moved. We'd need the new approval mechanism for that. But say that we had the ability to recognize PO files for missing templates; wouldn't you still want the PO files to stay on the queue for a while so you could upload a template manually if needed?

FWIW I think "uploads that have been neither approved nor updated in 6 months" must be a pretty decent approximation of what you want here. PO files without templates get kept for a while, though probably a bit longer than you want, then discarded. Does it really matter much _why_ an upload gets to the 6-month timeout? If a Lucid upload gets to that point, I would assume that:

1. You're busy with Myopic Manatee or Nihilistic Nit or Opulent Opossum or Prudent Parsnip, and things just aren't going to quiet down enough to let you deal with the entry before Lucid Lynx EOLs.

2. As the entry continues to sit on the queue, it's going to slow you (and the auto-approver) down and annoy you more than it'll ever help you.

3. You probably don't care _that_ much about this particular entry. If the upload matters at all, by now it has a younger sibling in Mystic Meerkat that's much more important to review.


> Suggestions how to fix the problem:
>  * only keep the latest upload of any file in the queue, discard the rest after a short grace period

Of the uploads from package builds, which is what you normally see on the queue, we only keep the latest upload of any file in the queue _at all_. If a new version of the same file is uploaded, it just updates the same queue entry in-place. And its 6-month timer gets reset, of course.

Things are different when the file's path changes in the meantime, but that's part of the harder problems that the approver has to deal with. Things are also different when many different people upload files, but those may still have useful translations so I'm assuming you don't want us to discard them.
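Roughly, the in-place update for package uploads behaves like the sketch below. This is a simplified illustration, not the real queue code: the dictionary-based queue, the `record_upload` name, and the choice of (release, source package, path) as the identifying key are assumptions made for the example.

```python
from datetime import datetime


def record_upload(queue, release, source_package, path, content, now):
    """Add a queue entry, or overwrite the existing entry for the same file.

    `queue` is a plain dict standing in for the import queue. Overwriting
    an entry also resets its timestamp, which restarts the 6-month timer.
    """
    key = (release, source_package, path)
    queue[key] = {"content": content, "last_changed": now}
    return key
```

Uploading `po/nl.po` twice for the same package leaves one entry with the newer timestamp. Uploading it again under a different path produces a second entry, and that is the case the approver has trouble with.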


>  * add tests (actually the import script should run these tests already) for the cases described above and add the import failure reason to the entry in the UI.

I would certainly like to work on storing the reason for an approval failure in the queue entry. (I assume you mean approval failure, since we already store reports of import failures.) Doing that for the existing approval mechanism could be complicated in many cases though. If an entry doesn't get approved, it's not usually because something known and specific is wrong with the entry but because a lot of patterns were tried but none of them matched, and we can't make a safe decision about which one fit best. If we knew exactly what was wrong with the upload, well, we'd fix it instead of giving you an error message!


>  * for templates which have been moved, show the existing reference in
> the UI, e.g. this template exists already in source pkg "foo"

The hard part about that is knowing that a template has moved in the first place. Rosetta gets no more information about this than you do, and unlike you it doesn't have common sense. Right now it doesn't even see other uploads that it could see patterns in.

Luckily there's also an easy part: warning about existing domains. The approver doesn't guess or set a template's domain. You tell it the domain when you approve the template. We could add a check to the form: "it looks like you'll be wanting domain foo for this template but that domain is already in use."
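The form check could be about as simple as this Python sketch. The dictionary layout for templates and the `domain_conflicts` name are made up for illustration, and it assumes a conflict means another active template in the same release already using the domain.

```python
def domain_conflicts(domain, release, templates):
    """Return source packages whose active template already uses `domain`.

    `templates` is an iterable of dicts with the hypothetical keys
    "domain", "release", "source_package" and "active".
    """
    return sorted(
        t["source_package"]
        for t in templates
        if t["active"] and t["release"] == release and t["domain"] == domain
    )
```

If the result is non-empty, the form would show the warning and name the packages, and the reviewer could still decide to go ahead.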

Another thing we could do is produce regular lists of domains that have more than one active template in any given Ubuntu release. It's not much help for new cases, I imagine, but it might be a catalyst for getting existing cases out of the way before they trigger approval conflicts or cause packaging problems.
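Such a report is mostly a grouping exercise. A sketch in Python, again with a hypothetical dictionary layout for templates and a made-up function name rather than anything from the actual codebase:

```python
from collections import defaultdict


def duplicated_domains(templates):
    """Map (release, domain) to its source packages, where there are several.

    `templates` is an iterable of dicts with the hypothetical keys
    "domain", "release", "source_package" and "active". Inactive
    templates are ignored.
    """
    packages = defaultdict(list)
    for t in templates:
        if t["active"]:
            packages[(t["release"], t["domain"])].append(t["source_package"])
    return {
        key: sorted(names)
        for key, names in packages.items()
        if len(names) > 1
    }
```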


Jeroen


