yahoo-eng-team team mailing list archive

[Bug 1860326] [NEW] Kill neutron-keepalived-state-change-monitor fails

 

Public bug reported:

When a graceful shutdown of neutron-keepalived-state-change-monitor with
SIGTERM fails, Neutron tries to kill the process with SIGKILL. Because
no rootwrap rule permits killing it with -9, that attempt also fails,
with an error like:

2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent [-] Error while deleting router f2613902-6ea2-4f09-9fae-9d5a933c744e: multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Unserializable message: Traceback (most recent call last):
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 283, in serve_client
    send(msg)
  File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 128, in send
    s = self.dumps(obj)
  File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 170, in dumps
    return json.dumps(obj, cls=RpcJSONEncoder).encode('utf-8')
  File "/usr/lib64/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib64/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 43, in default
    return super(RpcJSONEncoder, self).default(o)
  File "/usr/lib64/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'ValueError' is not JSON serializable

---------------------------------------------------------------------------
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/common/utils.py", line 702, in wait_until_true
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     eventlet.sleep(sleep)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/eventlet/greenthread.py", line 36, in sleep
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     hub.switch()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     return self.greenlet.switch()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent eventlet.timeout.Timeout: 10 seconds
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent During handling of the above exception, another exception occurred:
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py", line 420, in destroy_state_change_monitor
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     timeout=SIGTERM_TIMEOUT)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/common/utils.py", line 707, in wait_until_true
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     raise WaitTimeout(_("Timed out after %d seconds") % timeout)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent neutron.common.utils.WaitTimeout: Timed out after 10 seconds
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent During handling of the above exception, another exception occurred:
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 506, in _safe_router_removed
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     self._router_removed(ri, router_id)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 542, in _router_removed
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     self.router_info[router_id] = ri
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     self.force_reraise()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     raise value
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 539, in _router_removed
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     ri.delete()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py", line 478, in delete
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     self.destroy_state_change_monitor(self.process_monitor)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py", line 422, in destroy_state_change_monitor
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     pm.disable(sig=str(int(signal.SIGKILL)))
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/external_process.py", line 113, in disable
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     utils.execute(cmd, run_as_root=self.run_as_root)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 122, in execute
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     execute_rootwrap_daemon(cmd, process_input, addl_env))
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 109, in execute_rootwrap_daemon
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     LOG.error("Rootwrap error running command: %s", cmd)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     self.force_reraise()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     raise value
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 106, in execute_rootwrap_daemon
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     return client.execute(cmd, process_input)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_rootwrap/client.py", line 154, in execute
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     res = self._run_one_command(proxy, cmd, stdin)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_rootwrap/client.py", line 139, in _run_one_command
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     res = proxy.run_one_command(cmd, stdin)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "<string>", line 2, in run_one_command
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     raise convert_to_error(kind, result)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent multiprocessing.managers.RemoteError:
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent ---------------------------------------------------------------------------
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Unserializable message: Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib64/python3.6/multiprocessing/managers.py", line 283, in serve_client
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     send(msg)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 128, in send
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     s = self.dumps(obj)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 170, in dumps
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent     return json.dumps(obj, cls=RpcJSONEncoder).encode('utf-8')
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent   File "/usr/lib64/python3.6/json/__init__.py", line 238, in dumps....
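
For readers unfamiliar with the pattern, the escalation described above (SIGTERM, wait up to 10 seconds, then SIGKILL) can be sketched roughly as follows. This is only an illustration and not Neutron's code: it uses the standard library's subprocess module directly, whereas Neutron sends the signal through rootwrap, which is where the missing rule for -9 bites. The function name stop_process is hypothetical; SIGTERM_TIMEOUT mirrors the name and the 10-second value visible in the traceback.

```python
import signal
import subprocess

# Assumption: 10 seconds, taken from "Timed out after 10 seconds" in the log.
SIGTERM_TIMEOUT = 10


def stop_process(proc, timeout=SIGTERM_TIMEOUT):
    """Ask a child process to exit, escalating to SIGKILL if it does not.

    `proc` is a subprocess.Popen object. Returns the name of the last
    signal that was sent.
    """
    # Step 1: polite request to shut down.
    proc.send_signal(signal.SIGTERM)
    try:
        # Step 2: give the process a bounded amount of time to exit.
        proc.wait(timeout=timeout)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # Step 3: forced kill. In Neutron this step runs via rootwrap and
        # needs a filter that allows -9; without one it fails as shown above.
        proc.kill()
        proc.wait()
        return "SIGKILL"
```

If the forced-kill step is rejected, the process is left running and router deletion raises, which matches the "Error while deleting router" message in the log.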

** Affects: neutron
     Importance: Medium
     Assignee: Slawek Kaplonski (slaweq)
         Status: Confirmed


** Tags: l3-ha

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1860326

Title:
  Kill neutron-keepalived-state-change-monitor fails

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1860326/+subscriptions
