launchpad-dev team mailing list archive

Re: anyone else finding ec2land things disappearing w/out warning?

 

On 10/25/2010 07:41 AM, Brad Crittenden wrote:
> 
> On Oct 25, 2010, at 18:34 , Jonathan Lange wrote:
> 
>> On Mon, Oct 25, 2010 at 11:54 AM, Brad Crittenden 
>> <brad.crittenden@xxxxxxxxxxxxx> wrote: ...
>>> It might be handy if ec2 always CC-ed a common mailbox that the build
>>> engineer could periodically examine to see all of the failures that
>>> people are getting that may go unreported.
>>> 
>>> What do you think?
>> 
>> It's a good idea. It'd be easy enough to set up a postfix rule that puts
>> the attachment into a testr database too.
>> 
>> Not sure we should rely too heavily on "build engineer" as a role that 
>> involves checking and monitoring things though.
> 
> I phrased my statement poorly and didn't mean to imply a human nagios.
> Instead I think it would be useful as a means to evaluate failure patterns
> when curious.  Should be more effective than expecting engineers to send in
> reports or querying the group via email.
> 
> --Brad
> 

That is a good idea, and I have considered it.

I do not think an email alert will catch hung test runners, because an email
implementation will probably not send messages granular enough to show what
the runner is doing.  Instead, I would consider installing a beacon in ec2 test
that sends HTTP POSTs to a central CGI script.  The beacon would report the
start, stop, report, and shutdown events for each run.  Auditing the logs would
then catch hung, disappeared, or otherwise AWOL runners.  (BTW, web.py is
awesome for building such small web apps, and it is already on devpad for this
purpose.  Hint hint ;)
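
To make the beacon idea concrete, here is a rough sketch of what the client
side and the log audit might look like.  It is only an illustration under
stated assumptions, not anything that exists in ec2 test today: the JSON
payload layout, the function names, the collector URL passed to send_beacon,
and the max_age threshold are all hypothetical.  Only the four event names
come from the description above.

```python
import json
import time
import urllib.request

# Event names come from the proposal above; everything else is hypothetical.
EVENTS = ("start", "stop", "report", "shutdown")


def build_beacon(run_id, event, now=None):
    """Return the JSON body (as bytes) for one beacon POST."""
    if event not in EVENTS:
        raise ValueError("unknown event: %r" % (event,))
    timestamp = time.time() if now is None else now
    payload = {"run": run_id, "event": event, "time": timestamp}
    return json.dumps(payload).encode("utf-8")


def send_beacon(url, run_id, event, timeout=10):
    """POST one event to the collector.

    Network errors are swallowed and reported as False, on the assumption
    that a broken collector should not abort the test run itself.
    """
    request = urllib.request.Request(
        url,
        data=build_beacon(run_id, event),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False


def find_awol_runs(records, now, max_age):
    """Return sorted ids of runs that started more than max_age seconds
    before `now` and never reported a shutdown event."""
    started = {}
    finished = set()
    for record in records:
        if record["event"] == "start":
            started[record["run"]] = record["time"]
        elif record["event"] == "shutdown":
            finished.add(record["run"])
    return sorted(
        run for run, began in started.items()
        if run not in finished and now - began > max_age
    )
```

The audit side only needs the collector's log parsed into records of the same
shape; any run that shows a start but no shutdown after a generous timeout is
a candidate for a hung or vanished instance.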

It is really difficult to gather facts about a randomly occurring error in a
randomly run process initiated by 30 developers on a globally distributed team.
I really think that automated data gathering makes sense as the next step.

-- 
Māris Fogels -- https://launchpad.net/~mars
Launchpad.net -- cross-project collaboration and hosting
