yahoo-eng-team team mailing list archive
Message #87948
[Bug 1955491] [NEW] [DHCP] Neutron DHCP agent failing when disabling the Linux DHCP service
Public bug reported:
This issue was detected while running Neutron Train (Red Hat OSP 16.2),
using TripleO as the deployment tool. The services run in containers, using
podman.
The DHCP agent tries to disable the DHCP helper. That calls the
driver "disable" method [1]. On Linux, that calls [2], which tries
to stop the running process. In devstack, this process is a "dnsmasq"
instance running in the DHCP namespace. In TripleO, the DHCP agent
container spawns a sidecar container to execute the "dnsmasq"
instance, which requires a specific kill script [3].
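The difference between the two stop paths can be sketched as follows. This is a minimal illustration, not the Neutron code: the function name "build_kill_command", its parameters, and the argument order passed to the script (signal, then PID) are assumptions made for the example.

```python
from typing import List, Optional


def build_kill_command(pid: int, kill_script: Optional[str] = None,
                       sig: int = 9) -> List[str]:
    """Return the command that would be run to stop a dnsmasq process.

    Hypothetical helper for illustration only.
    - Without a kill script (devstack-like case) the process is
      signalled directly with "kill".
    - With a kill script (TripleO-like case) the script is given the
      signal and the PID and is expected to stop the sidecar container.
    """
    if kill_script:
        # Assumed argument order: <script> <signal> <pid>
        return [kill_script, str(sig), str(pid)]
    return ["kill", "-%d" % sig, str(pid)]


# devstack-like: signal the dnsmasq PID directly
print(build_kill_command(4242))
# TripleO-like: delegate to a kill script (the path is a placeholder)
print(build_kill_command(4242, kill_script="/path/to/kill-script"))
```

In the containerized case the outcome depends on what the script does with the container runtime, so a failure there surfaces to the agent as the exit code of the script rather than of "kill".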
In this deployment, the DHCP agent fails with exit code 125 when trying to disable the "dnsmasq" process (running in a container):
neutron_lib.exceptions.ProcessExecutionError: Exit code: 125; Stdin: ; Stdout: ; Stderr:
This exit code comes from "podman" and could be caused by the container not being present in the system. The failure raises an exception [4] that reschedules a resync, so the DHCP agent enters an endless loop unless it is restarted. A restart removes the affected network, the one triggering the exception, from "self.cache = NetworkCache()".
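The loop described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: "ProcessExecutionError" here is a local stand-in for the neutron_lib exception, and "kill_missing_container", "run_resync_rounds" and the "tolerate_codes" parameter are hypothetical names. Tolerating exit code 125 is shown as one possible way to break the loop, not as the fix chosen for this bug; 125 may also be returned for other podman failures, so ignoring it blindly could hide them.

```python
class ProcessExecutionError(RuntimeError):
    """Local stand-in for neutron_lib.exceptions.ProcessExecutionError."""

    def __init__(self, message, returncode):
        super().__init__(message)
        self.returncode = returncode


PODMAN_EXIT_CODE = 125  # exit code seen in the agent logs


def kill_missing_container(network_id):
    """Hypothetical kill step: the sidecar container is already gone."""
    raise ProcessExecutionError(
        "Exit code: 125; Stdin: ; Stdout: ; Stderr: ", PODMAN_EXIT_CODE)


def run_resync_rounds(network_ids, kill, rounds, tolerate_codes=()):
    """Simulate resync passes and return the networks still pending.

    A network whose kill step fails is rescheduled for the next pass,
    unless the exit code is listed in tolerate_codes.
    """
    pending = set(network_ids)
    for _ in range(rounds):
        failed = set()
        for net in sorted(pending):
            try:
                kill(net)
            except ProcessExecutionError as exc:
                if exc.returncode not in tolerate_codes:
                    failed.add(net)  # rescheduled for the next resync
        pending = failed
        if not pending:
            break
    return pending


# Without tolerance the same network fails on every pass.
print(run_resync_rounds(["net-a"], kill_missing_container, rounds=5))
# Treating 125 as "already stopped" lets the resync converge.
print(run_resync_rounds(["net-a"], kill_missing_container, rounds=5,
                        tolerate_codes=(PODMAN_EXIT_CODE,)))
```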
DHCP agent logs (snippet): [4]
Bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=2032010
[1] https://github.com/openstack/neutron/blob/df9435a9a6fab9492c4f23d9ab0f1507841430c7/neutron/agent/dhcp/agent.py#L413-L426
[2] https://github.com/openstack/neutron/blob/df9435a9a6fab9492c4f23d9ab0f1507841430c7/neutron/agent/linux/dhcp.py#L305-L313
[3] https://github.com/openstack/tripleo-heat-templates/blob/25db32d4e5ed7ed4687bbb6d07a8a87ad65b71e6/deployment/neutron/kill-script
[4] https://paste.opendev.org/show/811802/
** Affects: neutron
Importance: Undecided
Assignee: Rodolfo Alonso (rodolfo-alonso-hernandez)
Status: New
** Changed in: neutron
Assignee: (unassigned) => Rodolfo Alonso (rodolfo-alonso-hernandez)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1955491
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1955491/+subscriptions