yahoo-eng-team team mailing list archive
Message #78672
[Bug 1815676] Re: DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers
** Changed in: neutron
Status: In Progress => Invalid
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1815676
Title:
DVR: External process monitor for keepalived should be removed when
external gateway is removed for DVR HA routers
Status in neutron:
Invalid
Bug description:
The external process monitor for the keepalived state change monitor should be removed when the external gateway is removed for DVR HA routers.
We have seen that under certain conditions, when the SNAT namespace is missing, the external process monitor tries to respawn the keepalived state change monitor process within that namespace.
The external process monitor does not check whether the SNAT namespace exists; that check is left to the caller that registers the process.
Deleting an HA router takes care of cleaning up the external process
monitor subscription for the keepalived state change monitor, but the
external gateway remove function does not call that cleanup function.
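To make the missing cleanup concrete, here is a minimal, self-contained sketch of the behaviour described above. It is a toy model only: `ToyProcessMonitor`, `ToyDvrHaRouter`, and the `unsubscribe` flag are hypothetical names invented for illustration and are not neutron's actual classes or signatures. It assumes the monitor respawns any registered service it finds dead, without checking whether the namespace still exists.

```python
from typing import Callable, Dict, List, Tuple

# (router uuid, service name); hypothetical key layout for this sketch.
Key = Tuple[str, str]


class ToyProcessMonitor:
    """Toy monitor: records a respawn attempt for every dead registered service.

    Like the monitor in the report, it performs no namespace check of its own.
    """

    def __init__(self) -> None:
        self._services: Dict[Key, Callable[[], bool]] = {}
        self.respawn_attempts: List[Key] = []

    def register(self, uuid: str, service: str, is_alive: Callable[[], bool]) -> None:
        self._services[(uuid, service)] = is_alive

    def unregister(self, uuid: str, service: str) -> None:
        self._services.pop((uuid, service), None)

    def check_child_processes(self) -> None:
        for key, is_alive in list(self._services.items()):
            if not is_alive():
                # A real agent would try to start the process here.
                self.respawn_attempts.append(key)


class ToyDvrHaRouter:
    """Toy router whose state change monitor lives in the SNAT namespace."""

    def __init__(self, uuid: str, monitor: ToyProcessMonitor) -> None:
        self.uuid = uuid
        self.monitor = monitor
        self.snat_namespace_exists = False

    def _state_change_monitor_alive(self) -> bool:
        # The process cannot be alive once its namespace is gone.
        return self.snat_namespace_exists

    def external_gateway_added(self) -> None:
        self.snat_namespace_exists = True
        self.monitor.register(self.uuid, "ip_monitor", self._state_change_monitor_alive)

    def external_gateway_removed(self, unsubscribe: bool = True) -> None:
        if unsubscribe:
            # The step the report says is missing on gateway removal.
            self.monitor.unregister(self.uuid, "ip_monitor")
        self.snat_namespace_exists = False


if __name__ == "__main__":
    for unsubscribe in (False, True):
        monitor = ToyProcessMonitor()
        router = ToyDvrHaRouter("router-1", monitor)
        router.external_gateway_added()
        router.external_gateway_removed(unsubscribe=unsubscribe)
        monitor.check_child_processes()
        print(unsubscribe, monitor.respawn_attempts)
```

With `unsubscribe=False` the toy monitor still attempts a respawn after the namespace is gone, which mirrors the reported symptom; with `unsubscribe=True` the subscription is dropped first and no respawn is attempted.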
This is how I was able to reproduce the problem.
1. Create HA/DVR routers.
2. Delete the SNAT namespace of the routers.
3. Delete the PID files for the ip_monitor under /opt/stack/data/neutron/external/pids/ip_monitor pid
Once these were deleted, I was able to see the following log messages in
the neutron-l3.service logs.
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR neutron.agent.linux.external_process [-] ip_monitor for router with uuid 04fabe76-9316-4270-a99f-4f0ccffb8feb not found. The process should not have died
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: WARNING neutron.agent.linux.external_process [-] Respawning ip_monitor for uuid 04fabe76-9316-4270-a99f-4f0ccffb8feb
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG neutron.agent.linux.utils [-] Unable to access /opt/stack/data/neutron/external/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid {{(pid=12153) get_value_from_file /opt/stack/neutron/neutron/agent/linux/utils.py:250}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 'neutron-keepalived-state-change', '--router_id=04fabe76-9316-4270-a99f-4f0ccffb8feb', '--namespace=snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', '--conf_dir=/opt/stack/data/neutron/ha_confs/04fabe76-9316-4270-a99f-4f0ccffb8feb', '--monitor_interface=ha-4af17105-bd', '--monitor_cidr=169.254.0.1/24', '--pid_file=/opt/stack/data/neutron/external/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid', '--state_path=/opt/stack/data/neutron', '--user=1000', '--group=1004'] {{(pid=12153) execute_rootwrap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:103}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace "snat-04fabe76-9316-4270-a99f-4f0ccffb8feb": No such file or directory
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG oslo_concurrency.lockutils [-] Lock "_check_child_processes" released by "neutron.agent.linux.external_process._check_child_processes" :: held 0.007s {{(pid=12153) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:285}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: Traceback (most recent call last):
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460, in fire_timers
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1815676/+subscriptions