
launchpad-dev team mailing list archive

Re: bug notifications database utilization

 

On 2011-03-26 09:04, Gary Poster wrote:

2011-03-23 15:15:08 INFO    Notifying xxx about bug 736049.
2011-03-23 15:15:08 INFO    Notifying xxx about bug 736049.
...
2011-03-23 15:15:16 INFO    Notifying xxx about bug 733732.

Also, the notification lines in the log are often several seconds or
more apart, indicating that the call to sendmail() blocks for a time.
So I have 2 questions:

1. How is the new script invocation happening if the old one appears to
still be running? My theory is that the new script starts and blocks
until the old one finishes. And if the next one is slow too, then it all
compounds....

That doesn't quite jibe with what I think we see here, but I could be
wrong. The core issue does appear to be that multiple instances of the
script can run simultaneously, though we haven't caught that smoking
gun yet.

If you just call script.run(), multiple instances of the script can run
simultaneously, though of course they may still block each other out in
the database or elsewhere.  There's also lock_and_run, which can
optionally block for the lock to become available but does not block by
default.
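
To make the blocking vs. non-blocking distinction concrete, here is a minimal sketch of a lock-file guard. It is not Launchpad's lock_and_run; run_with_lock, its arguments and the lock path are all made up for illustration, and it assumes a Unix system where fcntl.flock is available.

```python
import fcntl


def run_with_lock(lock_path, job, blocking=False):
    """Run job() while holding an exclusive lock on lock_path.

    Hypothetical helper, NOT Launchpad's lock_and_run.  With
    blocking=False (the default) it gives up at once and returns None
    if another holder has the lock; with blocking=True it waits.
    """
    with open(lock_path, "w") as lock_file:
        flags = fcntl.LOCK_EX
        if not blocking:
            flags |= fcntl.LOCK_NB
        try:
            fcntl.flock(lock_file, flags)
        except BlockingIOError:
            # Someone else holds the lock and we were told not to wait.
            return None
        try:
            return job()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A cron job guarded like this in non-blocking mode skips a run instead of piling up behind a slow predecessor; in blocking mode the runs queue up, which is the compounding effect described above.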

Multiple instances blocking each other out in the database seems more
likely if they eat out of the same queue.  For example, an attempt to 
delete the record at the head of the queue will block on another 
transaction that has deleted the same record (but has not committed 
yet).  If there's some other blocking lock involved as well, e.g. in 
sendmail itself, then the two script instances could even deadlock 
outside the database's field of view.
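
A toy illustration of that last scenario, with two in-process locks standing in for the row lock and for whatever lock sendmail might take. Both are assumptions for the sake of the example; nothing here touches a real database or mailer, and the timeout is only there so the demo terminates instead of hanging.

```python
import threading


def run_instance(name, first, second, barrier, results):
    """Take `first`, then try for `second` while the peer does the reverse."""
    with first:
        barrier.wait()  # both instances now hold their first lock
        got_second = second.acquire(timeout=0.2)
        if got_second:
            second.release()
        results[name] = got_second
        barrier.wait()  # keep holding `first` until the peer has tried too


def demo():
    row_lock = threading.Lock()   # stand-in for the lock on the queue row
    mail_lock = threading.Lock()  # stand-in for a lock outside the database
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=run_instance,
                         args=("A", row_lock, mail_lock, barrier, results)),
        threading.Thread(target=run_instance,
                         args=("B", mail_lock, row_lock, barrier, results)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Neither instance gets its second lock. A database can only break cycles among locks it manages itself, so if one of the two locks lives elsewhere, nothing would step in and the real scripts would just sit there.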

I sometimes find stone-age profiling helpful with scripts: ctrl-C the
thing and see what the traceback says it was doing.
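
A slightly less destructive variant, in case killing the run is undesirable: have the script dump its current stack when it receives a signal. This is only a sketch; install_stack_dump is a made-up name, the choice of SIGUSR1 is an assumption, and it only reports what the main thread was doing.

```python
import signal
import sys
import traceback


def install_stack_dump(signum=signal.SIGUSR1, stream=None):
    """Print the main thread's stack whenever `signum` arrives.

    Hypothetical helper; call it once near the start of the script.
    """
    def handler(received_signum, frame):
        out = stream if stream is not None else sys.stderr
        out.write("".join(traceback.format_stack(frame)))
        out.flush()

    signal.signal(signum, handler)
```

With that installed, `kill -USR1 <pid>` shows where the script is without interrupting it. One caveat: Python only runs the handler once control returns to the interpreter, so a script stuck inside a blocking C call may not report until that call returns.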

Jeroen


