
yahoo-eng-team team mailing list archive

[Bug 1929832] [NEW] stable/ussuri py38 support for keepalived-state-change monitor

 

Public bug reported:

The Victoria release of OpenStack received patch [1], which allows the
neutron-l3-agent to send SIGKILL or SIGTERM to the keepalived-state-change
monitor when running under py38. Ussuri users running with py38 need this
patch too, so we need to backport it.

Without this patch, you get the following error when you delete or
disable a router:

2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent [req-8c69af29-8f9c-4721-9cba-81ff4e9be92c - 9320f5ac55a04fb280d9ceb0b1106a6e - - -] Error while deleting router ab63ccd8-1197-48d0-815e-31adc40e5193: neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 512, in _safe_router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     self._router_removed(ri, router_id)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 548, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     self.router_info[router_id] = ri
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     self.force_reraise()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     raise value
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/l3/agent.py", line 545, in _router_removed
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     ri.delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/l3/dvr_edge_router.py", line 236, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     super(DvrEdgeRouter, self).delete()
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 492, in delete
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     self.destroy_state_change_monitor(self.process_monitor)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/l3/ha_router.py", line 438, in destroy_state_change_monitor
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     pm.disable(sig=str(int(signal.SIGTERM)))
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/linux/external_process.py", line 113, in disable
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     utils.execute(cmd, run_as_root=self.run_as_root)
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent   File "/usr/lib/python3/dist-packages/neutron/agent/linux/utils.py", line 147, in execute
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent     raise exceptions.ProcessExecutionError(msg,
2021-05-26 02:11:44.653 3457514 ERROR neutron.agent.l3.agent neutron_lib.exceptions.ProcessExecutionError: Exit code: 99; Stdin: ; Stdout: ; Stderr: /usr/bin/neutron-rootwrap: Unauthorized command: kill -15 2516433 (no filter matched)

This results in the router being deleted from neutron but not from the
node. In my case, both a qrouter and an snat namespace were left behind
with IPs still configured, and my fip ip rule allocation was still
present in /var/lib/neutron/fip-priorities.
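
To check a node for the leftovers described above, something like the
following sketch may help. It assumes the usual neutron namespace naming
(qrouter-<router_id> and snat-<router_id>) and takes the text output of
"ip netns list" as input. The function name and the sample listing are
hypothetical and only serve as illustration.

```python
def leftover_namespaces(netns_output, router_id):
    """Return namespaces for router_id that still appear in the listing.

    netns_output is the text printed by `ip netns list`; each line starts
    with the namespace name, optionally followed by "(id: N)".
    """
    wanted = {f"qrouter-{router_id}", f"snat-{router_id}"}
    found = []
    for line in netns_output.splitlines():
        fields = line.split()
        if fields and fields[0] in wanted:
            found.append(fields[0])
    return found


# Illustrative listing only, not captured from the affected node.
sample_listing = """\
qrouter-ab63ccd8-1197-48d0-815e-31adc40e5193 (id: 3)
snat-ab63ccd8-1197-48d0-815e-31adc40e5193 (id: 4)
qdhcp-00000000-0000-0000-0000-000000000000 (id: 1)
"""

stale = leftover_namespaces(
    sample_listing, "ab63ccd8-1197-48d0-815e-31adc40e5193")
```

An empty result would suggest the namespaces were cleaned up; it says
nothing about the entry in /var/lib/neutron/fip-priorities, which has to
be checked separately.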

[1]
https://github.com/openstack/neutron/commit/4fb505891ee32ae41247f1d7a48b7455b342840e
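
The "no filter matched" error can be pictured with a simplified model of
a rootwrap kill filter: the kill is allowed only if some rule names both
the signal and the executable of the target process. This is a sketch
under assumptions, not the oslo.rootwrap implementation. The KillRule
class, the kill_allowed function and the exact rule lists are
hypothetical; only the signals (-15 for SIGTERM, -9 for SIGKILL) and the
py38 interpreter come from the report.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class KillRule:
    """Hypothetical stand-in for one kill filter entry."""
    executable: str   # interpreter name the rule applies to
    signals: tuple    # signal arguments the rule permits


def kill_allowed(rules, signal_arg, target_exe_path):
    """Return True if any rule permits `kill <signal_arg>` on the target.

    target_exe_path stands in for the executable of the target process.
    """
    target_name = os.path.basename(target_exe_path)
    return any(
        signal_arg in rule.signals and rule.executable == target_name
        for rule in rules
    )


# Assumed rule set without the backport: no rule names python3.8.
unpatched_rules = [
    KillRule("python3.6", ("-15", "-9")),
    KillRule("python3.7", ("-15", "-9")),
]
# Assumed effect of the backport: one more rule for python3.8.
patched_rules = unpatched_rules + [KillRule("python3.8", ("-15", "-9"))]

before = kill_allowed(unpatched_rules, "-15", "/usr/bin/python3.8")
after = kill_allowed(patched_rules, "-15", "/usr/bin/python3.8")
```

In this model the request is rejected before the backport and accepted
after it, which matches the behaviour seen in the traceback.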

** Affects: cloud-archive
     Importance: Undecided
         Status: Invalid

** Affects: cloud-archive/ussuri
     Importance: High
         Status: Triaged

** Affects: neutron
     Importance: Undecided
     Assignee: Edward Hope-Morley (hopem)
         Status: In Progress

** Affects: neutron (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: neutron (Ubuntu Focal)
     Importance: High
         Status: Triaged

** Changed in: neutron
       Status: New => In Progress

** Changed in: neutron
     Assignee: (unassigned) => Edward Hope-Morley (hopem)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1929832

Title:
  stable/ussuri py38 support for keepalived-state-change monitor

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive ussuri series:
  Triaged
Status in neutron:
  In Progress
Status in neutron package in Ubuntu:
  Invalid
Status in neutron source package in Focal:
  Triaged



