yahoo-eng-team team mailing list archive
Message #89398
[Bug 1980967] Re: get_hypervisor_hostname helper function is failing silently
Reviewed: https://review.opendev.org/c/openstack/neutron/+/849122
Committed: https://opendev.org/openstack/neutron/commit/ea223072841adc3fb88b840b5f8018bff60c8aa7
Submitter: "Zuul (22348)"
Branch: master
commit ea223072841adc3fb88b840b5f8018bff60c8aa7
Author: Miro Tomaska <mtomaska@xxxxxxxxxx>
Date: Fri Jul 8 09:56:23 2022 -0500
Add workaround for eventlet.greendns bug
Issue[1] workaround: A wrapper class which determines if the socket module
was eventlet patched and requests the std lib socket module instead.
Also adding LOG.warning into the exception block so we don't miss
issues like this in the future.
Closes-Bug: #1980967
Related-Bug: #1926693
[1] https://github.com/eventlet/eventlet/issues/764
Change-Id: I41c4cbc1aaea95f7808e6c6dca47ecd0402351c9
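The workaround described in the commit message can be sketched roughly as follows. This is not the code from the commit: the class name StdlibSocket and its method layout are hypothetical, and eventlet is treated as optional so the sketch also runs where it is not installed. It relies only on eventlet.patcher.is_monkey_patched() and eventlet.patcher.original().

```python
import socket


class StdlibSocket:
    """Hypothetical wrapper that always uses the unpatched socket module."""

    def __init__(self):
        self._socket = self._load_unpatched_socket()

    @staticmethod
    def _load_unpatched_socket():
        try:
            # Assumption: eventlet may be absent; fall back to plain socket.
            from eventlet import patcher
        except ImportError:
            return socket
        if patcher.is_monkey_patched('socket'):
            # original() imports a fresh, unpatched copy of the module.
            return patcher.original('socket')
        return socket

    def getaddrinfo(self, host, port, family=0, flags=0):
        # Delegate to the standard library resolver, bypassing greendns.
        return self._socket.getaddrinfo(host, port,
                                        family=family, flags=flags)
```

A caller would then use StdlibSocket().getaddrinfo(...) wherever the patched socket.getaddrinfo() was used before.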
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1980967
Title:
get_hypervisor_hostname helper function is failing silently
Status in neutron:
Fix Released
Bug description:
get_hypervisor_hostname() raises an error, but the error is swallowed[1] with 'pass'. As a result, the agent does not get a fully qualified domain name (FQDN).
I have only seen this issue happen with the sriov-agent.
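To illustrate why the silent 'pass' hides the problem, here is a minimal sketch of a lookup that logs the failure before falling back to the short hostname. The function name resolve_fqdn and its injectable getaddrinfo parameter are hypothetical (added so the failure path can be exercised without DNS), and it uses the standard logging module rather than whatever logger neutron uses.

```python
import logging
import socket

LOG = logging.getLogger(__name__)


def resolve_fqdn(hostname, getaddrinfo=socket.getaddrinfo):
    """Return the canonical name for hostname, or hostname on failure."""
    try:
        addrinfo = getaddrinfo(hostname, None,
                               family=socket.AF_UNSPEC,
                               flags=socket.AI_CANONNAME)
        # Each entry is (family, type, proto, canonname, sockaddr).
        if addrinfo and addrinfo[0][3]:
            return addrinfo[0][3]
    except OSError as exc:
        # socket.gaierror is a subclass of OSError; log it instead of
        # discarding it so the fallback is visible in the agent log.
        LOG.warning("Could not resolve FQDN for %s: %s", hostname, exc)
    return hostname
```

With a warning in place, a failed lookup such as the "[Errno -2] Name or service not known" case would show up in the log instead of only being noticeable through the missing FQDN.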
Steps to Reproduce:
1. Start an sriov-agent container with the following sriov_agent.ini:
[sriov_nic]
physical_device_mappings=datacentre:enp7s0f3,datacentre:enp5s0f0
resource_provider_bandwidths=enp7s0f3:10000000:10000000,enp5s0f0:10000000:10000000
2. Observe the sriov-agent log and notice that the agent starts without
an FQDN:
INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [-]
Resource provider hypervisors: {'enp7s0f3': 'computesriov-1',
'enp5s0f0': 'computesriov-1'}
Additional info:
I root-caused it by logging the traceback and the IOError:
2022-07-05 20:29:20.450 122133 DEBUG neutron.agent.common.utils [-] MIRO got error [Errno -2] Name or service not known get_hypervisor_hostname /usr/lib/python3.9/site-packages/neutron/agent/common/utils.py:104
2022-07-05 20:29:20.452 122133 DEBUG neutron.agent.common.utils [-] format_exc Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/eventlet/support/greendns.py", line 440, in resolve_cname
ans = resolver.query(host, dns.rdatatype.CNAME)
File "/usr/lib/python3.9/site-packages/eventlet/support/greendns.py", line 380, in query
return end()
File "/usr/lib/python3.9/site-packages/eventlet/support/greendns.py", line 359, in end
raise result[1]
File "/usr/lib/python3.9/site-packages/eventlet/support/greendns.py", line 340, in step
a = fun(*args, **kwargs)
File "/usr/lib/python3.9/site-packages/eventlet/dns/resolver.py", line 1002, in query
raise NXDOMAIN(qnames=qnames_to_try, responses=nxdomain_responses)
eventlet.dns.resolver.NXDOMAIN: The DNS query name does not exist: computesriov-0.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/neutron/agent/common/utils.py", line 91, in get_hypervisor_hostname
addrinfo = socket.getaddrinfo(host=hypervisor_hostname,
File "/usr/lib/python3.9/site-packages/eventlet/support/greendns.py", line 540, in getaddrinfo
qname = resolve_cname(qname).encode('ascii').decode('idna')
File "/usr/lib/python3.9/site-packages/eventlet/support/greendns.py", line 446, in resolve_cname
raise EAI_NODATA_ERROR
socket.gaierror: [Errno -2] Name or service not known
What is happening is that the 'eventlet' module does some import
patching[2], causing socket.getaddrinfo() to actually call
greendns.py:getaddrinfo()[3] instead of the Python standard library
_socket. The greendns.py:getaddrinfo() is buggy: it seems to skip
looking up the FQDN in /etc/hosts (which contains the FQDN on this
machine) and goes straight to querying the DNS server, which might not
have this info (as in this case).
This also explains why socket.getaddrinfo() works just fine in a
Python terminal but fails when called from within the sriov-agent
Python code.
[1] https://github.com/openstack/neutron/blob/ae87995a0827c98502bfa29a9abf9e3f229aac72/neutron/agent/common/utils.py#L84-L95
[2] https://github.com/eventlet/eventlet/blob/v0.30.3/eventlet/support/greendns.py#L58
[3] https://github.com/eventlet/eventlet/blob/v0.30.3/eventlet/support/greendns.py#L508
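One quick way to confirm which getaddrinfo() implementation a process is actually using is to look at the module that defines it. The helper below is a hypothetical diagnostic, not part of neutron or eventlet. In an unpatched interpreter it should report the standard library; inside a patched process it would be expected to point at eventlet's greendns module, matching the traceback above.

```python
import socket


def getaddrinfo_origin(socket_module=socket):
    """Return the name of the module that defines getaddrinfo."""
    return socket_module.getaddrinfo.__module__


if __name__ == "__main__":
    print(getaddrinfo_origin())
```

Running this both in a plain Python terminal and from within the agent process would show whether the two are resolving names through the same code path.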
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1980967/+subscriptions