launchpad-dev team mailing list archive

Re: Does our DB retry code need tweaks for the PG 9.1 upgrade?

 

On Fri, Jun 1, 2012 at 9:07 PM, Francis J. Lacoste
<francis.lacoste@xxxxxxxxxxxxx> wrote:
> Hi Stuart,
>
> Ok, thanks for the clarifications. Do you have time to handle this bug,
> or should I ask the maintenance team to have a go at it?

It's several bugs.

I won't be much help diagnosing why TCP connections are dying.

I can backport a newer pgbouncer and get it updated if we want to try this.
Swapping it in will chew up an FDT slot.

I think there is a Storm bug, although others disagree. I'm not sure
why a socket going tits up is different from any other sort of
disconnection. At the moment, I think that when the TCP connection fails
like this, Storm doesn't reset the connection, so subsequent requests
also fail (it will probably hit an exception it does handle
eventually, at which point the connection gets reopened). I can fix this
if I can convince people it is a bug. A few of us on the team have
adjusted this code before, as the rarer types of failed connections
have been discovered or have changed due to updates.
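
To make the behaviour I'm arguing for concrete, here is a rough sketch.
It is not Storm code and does not use Storm's API: every class and
function name below (DisconnectionError, FlakyRawConnection,
ManagedConnection, run_with_retry) is made up for illustration, and the
socket failure is simulated with a plain OSError. The point is only that
a socket-level failure gets treated like any other disconnection: the
dead connection is dropped at once, so the next request reconnects
instead of failing again.

```python
class DisconnectionError(Exception):
    """Hypothetical stand-in for a 'connection lost' error."""


class FlakyRawConnection:
    """Hypothetical raw connection whose socket dies after N statements."""

    def __init__(self, statements_before_failure):
        self.remaining = statements_before_failure

    def execute(self, statement):
        if self.remaining <= 0:
            # Simulates the TCP connection dying underneath us.
            raise OSError("connection reset by peer")
        self.remaining -= 1
        return "ok: " + statement


class ManagedConnection:
    """Hypothetical wrapper that reopens the raw connection on demand."""

    def __init__(self, connect):
        self._connect = connect  # factory returning a new raw connection
        self._raw = None
        self.connects = 0        # how many times we opened a connection

    def execute(self, statement):
        if self._raw is None:
            self._raw = self._connect()
            self.connects += 1
        try:
            return self._raw.execute(statement)
        except OSError as exc:
            # Treat the socket failure like any other disconnection:
            # forget the dead connection so the next call reconnects.
            self._raw = None
            raise DisconnectionError(str(exc)) from exc


def run_with_retry(conn, statement, attempts=2):
    """Retry a statement when the connection was reported as lost."""
    for attempt in range(attempts):
        try:
            return conn.execute(statement)
        except DisconnectionError:
            if attempt == attempts - 1:
                raise
```

With a factory that hands out connections which die after one
statement, a second run_with_retry call fails once, reconnects and then
succeeds. Without the "self._raw = None" line, every later request would
keep hitting the same dead socket, which is the behaviour I think we are
seeing.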

> Do we have a RT for the pgbouncer upgrade yet?

No. We don't have the newer version on staging yet either. Is this
happening often enough to panic over and skip staging? I expect the
Librarian has been restarted as part of our regular code updates, which
should clean out the more esoteric issues.

-- 
Stuart Bishop <stuart.bishop@xxxxxxxxxxxxx>

