launchpad-dev team mailing list archive

[bug 703840] xmlrpc.edge outage postmortem

 

It looks like we tried to change xmlrpc.edge.l.n to redirect to plain
xmlrpc.l.n last night, which broke lp: URLs for many people using bzr.
It was promptly reverted (about 30m later).  <http://pad.lv/703840>

bzr now uses xmlrpc.l.n not xmlrpc.edge, but that change is only in
Maverick and Natty.

Because of a bug or limitation in the Python xmlrpc libraries, they
cannot follow 30x redirects, even in Natty.
<http://pad.lv/497131>
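
To illustrate the redirect limitation, here is a minimal sketch using
Python 3's xmlrpc.client (the successor of xmlrpclib, which as far as
I know has the same behaviour). The local server, the redirect target
and the ping method name are all made up for the example; nothing here
talks to Launchpad. The client gets a 301 and raises ProtocolError
instead of retrying at the new location.

```python
import threading
import xmlrpc.client
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers every POST with a 301, standing in for a redirecting host."""

    def do_POST(self):
        # Read and discard the XML-RPC request body.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(301)
        # Hypothetical redirect target; the client never contacts it.
        self.send_header("Location", "http://127.0.0.1:1/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        # Keep the example quiet.
        pass


def call_through_redirect():
    """Return the HTTP status the XML-RPC client gave up on, or None."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        proxy = xmlrpc.client.ServerProxy(url)
        try:
            proxy.ping()  # hypothetical method name
        except xmlrpc.client.ProtocolError as err:
            return err.errcode
        return None
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_through_redirect())
```

A client that wanted to cope with this would have to catch the
ProtocolError and re-issue the call against the Location header
itself, which old bzr releases do not do.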

I think the best course is to just continue supporting xmlrpc.e.l.n,
without a redirect, for the supported life of Lucid.  It would
probably work for it to just be a ServerAlias for xmlrpc.l.n, rather
than a redirect to it.  I don't think this has any of the issues that
are present in doing a redirect for REST APIs, because the responses
don't depend on the api endpoint location.  Obviously when we do that

All the factors going into this were known to us collectively last
week: xmlrpclib doesn't follow redirects, and old bzrs try to reach
edge, therefore redirecting it will break them.  So it does seem like
we could have done better here.  Perhaps that would have required
somebody who knew these facts to comment on the bug about redirecting
edge (assuming there was one.)

We could also have done better by knowing that xmlrpc was used by bzr
and therefore we should test bzr after making a production change.

Eventually we ought to just stop sending xmlrpc queries altogether;
their use predates the REST APIs and is no longer necessary.
<http://pad.lv/397739>  That bug may now be fairly easy to fix?

-- 
Martin


