launchpad-dev team mailing list archive

contacting LOSAs for staging being down

 

Hey Danilo.  I'm copying launchpad-dev so everybody knows what's up.

Everybody on launchpad-dev, hi.  Staging apparently hasn't been updated since 2011-05-20.  The LOSAs are on a sprint.  Danilo is planning to check with the LOSAs tomorrow morning, his time.  If you want to know more, read on.

When I talked to Francis about staging being down, we verified that staging is apparently running r10574, according to the bottom of https://staging.launchpad.net/, which is from 2011-05-20.  Because of that, he said that you should call the IS hotline number tomorrow if there is no LOSA response while they are on the sprint: it's been a pretty long time since staging had a successful update.

I then looked at a graph and at the logs on devpad (e.g. /srv/launchpad.net-logs/staging/sourcherry/2011-05-23-staging_restore.log).  Here's some other data you might already have.

-----

https://lpstats.canonical.com/graphs/StagingRestoreDurations/ shows several restores since the 20th.  It doesn't show success or failure AFAIK, but it does show run times that seem consistent with a healthy restore.

-----

The full restore on the 21st shows this "FATAL" problem, but other restores (such as on the 23rd) do not.  The comment seems to imply that it might be OK for this to fail?  In any case, since other recent restores don't have this message, it is probably unrelated.

# Uninstall Slony-I if it is installed - a pg_dump of a DB with
# Slony-I installed isn't usable without this step.
LPCONFIG=staging-setup  ./repair-restored-db.py
/tmp/slonik_qCFRY.sk:3: FATAL:  database "dbname=lpmain_staging_new" does not exist
2011-05-21 18:32:14 ERROR   slonik script failed
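
To check which restore logs carry this kind of message, something like the following could be run over the log text.  This is only a minimal sketch: the function names and the FATAL/ERROR pattern are my own assumptions, not part of any Launchpad tooling, and the caller would need to read the log files in first.

```python
import re

# Assumption: a "problem" line is any line containing the word FATAL or
# ERROR, the two markers seen in the excerpt above.
PROBLEM = re.compile(r"\b(FATAL|ERROR)\b")


def problem_lines(log_text):
    """Return the lines of one log that mention FATAL or ERROR."""
    return [line for line in log_text.splitlines() if PROBLEM.search(line)]


def compare_logs(logs):
    """Map each log name to its problem lines, for side-by-side comparison.

    `logs` is a dict of {name: full log text}; the names are whatever
    labels the caller chooses (for example, the restore date).
    """
    return {name: problem_lines(text) for name, text in logs.items()}
```

Passing in the text of the logs for the 20th, 21st and 23rd keyed by date would show at a glance which runs contain the slonik failure and which do not.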

-----

I suspect that the following indicates a problem we need to look at for our own purposes, but that is not pertinent to the staging restore failure.  Near the end of the staging restore log of the 23rd (but not those of the 21st or 20th!), I saw:

Tue May 24 15:35:51 UTC 2011 Send bug notifications
2011-05-24 15:38:16 ERROR   Error while building email notifications.
 -> http://staging.launchpadlibrarian.net/72065616/8rdUH0Yw2QTwDT8HY6ZxKEVo2tB.txt (61963)

That referenced .txt file shows this traceback.

Traceback (most recent call last):
  File "/srv/staging.launchpad.net/staging/launchpad/lib/lp/bugs/scripts/bugnotification.py", line 290, in get_email_notifications
    yield construct_email_notifications(batch)
  File "/srv/staging.launchpad.net/staging/launchpad/lib/lp/bugs/scripts/bugnotification.py", line 177, in construct_email_notifications
    bug, recipients, filtered_notifications)
  File "/srv/staging.launchpad.net/staging/launchpad/lib/lp/bugs/model/bugnotification.py", line 278, in getRecipientFilterData
    del recipient_id_map[person_id]['filters'][filter_id]
KeyError: 61963

As I said, this is not in the logs for the 21st or 20th, so it is probably a problem for us but not the cause of the staging problem.  Or maybe it's entirely spurious.  It's worth investigating, though.  If you get the LOSAs' attention and you think it makes sense, could you ask them to run "cronscripts/send-bug-notifications.py -vv" on staging and give you the output, to see if this is actually an issue?
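
For anyone unfamiliar with this failure mode, here is a minimal sketch of how a `del` on a nested dict raises the KeyError seen in the traceback.  The data and the `drop_filter` helper are hypothetical; only the shape `recipient_id_map[person_id]['filters'][filter_id]` comes from the traceback, and this is not the actual Launchpad code.  Whether skipping a missing filter would be the right fix is exactly what needs investigating.

```python
# Hypothetical data, shaped the way the traceback suggests:
# person id -> {'filters': {filter id: filter data}}
recipient_id_map = {1: {"filters": {10: "filter-a"}}}


def drop_filter(id_map, person_id, filter_id):
    """Remove a filter if present; return True if it was removed.

    A bare `del id_map[person_id]['filters'][filter_id]` raises KeyError
    when filter_id is absent, which matches the error in the traceback.
    This version only reports the absence instead of raising.
    """
    filters = id_map[person_id]["filters"]
    if filter_id in filters:
        del filters[filter_id]
        return True
    return False


# The bare del fails the same way as in the traceback.
try:
    del recipient_id_map[1]["filters"][61963]
except KeyError as error:
    print("KeyError:", error)
```

The guarded version hides the symptom, so it is only useful for confirming that a missing filter id is what triggers the error.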

OK, that's all I know. :-)  Maybe others on -dev will have some input.

Thank you!

Gary