← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~cjwatson/python-oops-amqp/publisher-handle-channel-errors into lp:python-oops-amqp

 

Colin Watson has proposed merging lp:~cjwatson/python-oops-amqp/publisher-handle-channel-errors into lp:python-oops-amqp.

Commit message:
Handle AMQP channel errors (particularly NotFound) in the publisher.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~cjwatson/python-oops-amqp/publisher-handle-channel-errors/+merge/367748

amqp 2.4.0 included a change to drain events before publishing.  This means that if we try to publish an OOPS to a nonexistent exchange, then some future publishing attempt will raise a NotFound exception, which is a channel error rather than a connection error and so wasn't previously handled.

To try to minimise confusion resulting from this (which can be considerable - it took me several days to track down what was happening in Launchpad's test suite), spend a short time waiting for a response for the broker after publishing an OOPS.  This will typically allow us to detect channel errors immediately, which we now handle; even if we don't manage to handle them immediately, they'll be handled the next time we try to publish something on the same channel.

It would be possible to handle channel errors more economically by just reopening a channel on the same connection rather than reopening the entire connection, but reopening the connection seems to work well enough for now.
-- 
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~cjwatson/python-oops-amqp/publisher-handle-channel-errors into lp:python-oops-amqp.
=== modified file 'NEWS'
--- NEWS	2018-05-09 12:31:40 +0000
+++ NEWS	2019-05-22 09:39:41 +0000
@@ -3,6 +3,11 @@
 
 Changes and improvements to oops-amqp, grouped by release.
 
+0.1.1
+-----
+
+* Handle AMQP channel errors (particularly NotFound) in the publisher.
+
 0.1.0
 -----
 

=== modified file 'oops_amqp/publisher.py'
--- oops_amqp/publisher.py	2018-03-12 11:48:55 +0000
+++ oops_amqp/publisher.py	2019-05-22 09:39:41 +0000
@@ -20,6 +20,7 @@
 __metaclass__ = type
 
 from hashlib import md5
+import socket
 from threading import local
 
 import amqp
@@ -96,6 +97,16 @@
         try:
             channel.basic_publish(
                 message, self.exchange_name, routing_key=self.routing_key)
+            # See if we get a NotFound error back straight away, to avoid
+            # the event still being on the queue and causing confusion when
+            # publishing future reports.  Don't hang around too long waiting
+            # for it; if we don't hear anything back quickly, just close the
+            # connection and let the next call open a new one.
+            try:
+                channel.connection.drain_events(timeout=1)
+            except socket.timeout:
+                channel.close()
+                self.channels.channel = None
         except amqplib_error_types as e:
             self.channels.channel = None
             if is_amqplib_connection_error(e):

=== modified file 'oops_amqp/tests/test_publisher.py'
--- oops_amqp/tests/test_publisher.py	2018-03-12 11:48:55 +0000
+++ oops_amqp/tests/test_publisher.py	2019-05-22 09:39:41 +0000
@@ -137,3 +137,11 @@
             queue.channel = connection.channel()
         self.assertNotEqual([], publisher(oops))
 
+    def test_publish_amqp_exchange_not_found(self):
+        channel = self.useFixture(
+            ChannelFixture(self.connection_factory)).channel
+        self.useFixture(QueueFixture(channel, self.getUniqueString))
+        publisher = Publisher(
+            self.connection_factory, self.getUniqueString(), "")
+        oops = {'akey': 42}
+        self.assertEqual([], publisher(oops))

=== modified file 'oops_amqp/utils.py'
--- oops_amqp/utils.py	2018-03-12 11:48:55 +0000
+++ oops_amqp/utils.py	2019-05-22 09:39:41 +0000
@@ -19,7 +19,10 @@
 
 import socket
 
-from amqp.exceptions import ConnectionError
+from amqp.exceptions import (
+    ChannelError,
+    ConnectionError,
+    )
 
 __all__ = [
     'amqplib_error_types',
@@ -31,7 +34,7 @@
 # These exception types always indicate an AMQP connection error/closure.
 # However you should catch amqplib_error_types and post-filter with
 # is_amqplib_connection_error.
-amqplib_connection_errors = (socket.error, ConnectionError)
+amqplib_connection_errors = (socket.error, ConnectionError, ChannelError)
 # A tuple to reduce duplication in different code paths. Lists the types of
 # exceptions legitimately raised by amqplib when the AMQP server goes down.
 # Not all exceptions *will* be such errors - use is_amqplib_connection_error to


Follow ups