yahoo-eng-team team mailing list archive
Message #92460
[Bug 2022321] [NEW] Using Isolated metadata+ipv6 haproxy metadata isn't working because haproxy container isn't created in some controllers
Public bug reported:
Keys and metadata info aren't loaded in the VMs:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/ssh.py", line 136, in _get_ssh_connection
    ssh.connect(self.host, port=self.port, username=self.username,
  File "/usr/lib/python3.9/site-packages/paramiko/client.py", line 406, in connect
    t.start_client(timeout=timeout)
  File "/usr/lib/python3.9/site-packages/paramiko/transport.py", line 699, in start_client
    raise e
  File "/usr/lib/python3.9/site-packages/paramiko/transport.py", line 2110, in run
    ptype, m = self.packetizer.read_message()
  File "/usr/lib/python3.9/site-packages/paramiko/packet.py", line 459, in read_message
    header = self.read_all(self.__block_size_in, check_rekey=True)
  File "/usr/lib/python3.9/site-packages/paramiko/packet.py", line 303, in read_all
    raise EOFError()
EOFError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python3.9/site-packages/tempest/scenario/test_network_basic_ops.py", line 535, in test_hotplug_nic
    self._check_public_network_connectivity(should_connect=True)
  File "/usr/lib/python3.9/site-packages/tempest/scenario/test_network_basic_ops.py", line 212, in _check_public_network_connectivity
    self.check_vm_connectivity(
  File "/usr/lib/python3.9/site-packages/tempest/scenario/manager.py", line 964, in check_vm_connectivity
    self.get_remote_client(ip_address, username, private_key,
  File "/usr/lib/python3.9/site-packages/tempest/scenario/manager.py", line 733, in get_remote_client
    linux_client.validate_authentication()
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 31, in wrapper
    return function(self, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 123, in validate_authentication
    self.ssh_client.test_connection_auth()
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/ssh.py", line 245, in test_connection_auth
    connection = self._get_ssh_connection()
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/ssh.py", line 155, in _get_ssh_connection
    raise exceptions.SSHTimeout(host=self.host,
tempest.lib.exceptions.SSHTimeout: Connection to the 10.0.0.190 via SSH timed out.
User: cirros, Password: None
The problem is triggered by this patch:
https://review.opendev.org/c/openstack/neutron/+/876566/13/neutron/agent/metadata/driver.py
When an IPv6 DAD (Duplicate Address Detection) error is detected, haproxy isn't started because of the return at line 269:
..........
                  'namespace': ns_name,
                  'network': network_id,
                  'exception': str(exc)})
        try:
            ip_lib.delete_ip_address(bind_address_v6, bind_interface,
                                     namespace=ns_name)
        except Exception as exc:
            # do not re-raise a delete failure, just log
            LOG.info('Address deletion failure: %s', str(exc))
        return
pm.enable()
.........
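The control flow described above can be reproduced in a small standalone sketch. Everything in it (DADFailed, FakeProcessManager, wait_until_ready and the two spawn_* functions) is a hypothetical stand-in rather than neutron code. The second function shows one possible way to keep the IPv4 listener alive after a DAD failure; it is an assumption for illustration, not necessarily the fix adopted upstream.

```python
class DADFailed(Exception):
    """Stand-in for the DAD failure raised while waiting for the address."""


class FakeProcessManager:
    """Hypothetical process manager that only records whether it was enabled."""

    def __init__(self):
        self.enabled = False

    def enable(self):
        self.enabled = True


def wait_until_ready(dad_fails):
    # Simulates waiting for the IPv6 address to leave the 'tentative' state.
    if dad_fails:
        raise DADFailed("Duplicate address detected")


def spawn_with_early_return(pm, dad_fails):
    """Mirrors the reported flow: a DAD failure returns before pm.enable()."""
    try:
        wait_until_ready(dad_fails)
    except DADFailed:
        # In the snippet above the address is deleted here, then the
        # function returns, so the proxy is never started.
        return
    pm.enable()


def spawn_ipv4_fallback(pm, dad_fails):
    """Assumed alternative: drop the IPv6 bind but still start the proxy."""
    bind_v6 = True
    try:
        wait_until_ready(dad_fails)
    except DADFailed:
        bind_v6 = False  # keep serving 169.254.169.254 only
    pm.enable()
    return bind_v6


broken, patched = FakeProcessManager(), FakeProcessManager()
spawn_with_early_return(broken, dad_fails=True)
v6_bound = spawn_ipv4_fallback(patched, dad_fails=True)
print(broken.enabled, patched.enabled, v6_bound)  # prints: False True False
```

With the early return, the process manager is never enabled on the controller that lost DAD, which matches the missing haproxy pid file in the logs below.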
The problem only appears when the DAD error is detected on the controller that is advertised to the VM as the metadata source: without haproxy on that controller, the metadata is unreachable.
DAD error:
2023-05-31 14:27:40.140 79551 INFO neutron.agent.metadata.driver [req-a76cfcdd-887b-4c36-86d5-a5eb2b87615c - - - - -] DAD failed for address fe80::a9fe:a9fe on interface tapb07b4b7c-3b in namespace qdhcp-abd16487-68bb-4090-8ccb-b6ec8a77cc2c on network abd16487-68bb-4090-8ccb-b6ec8a77cc2c, deleting it. Exception: Failure waiting for address fe80::a9fe:a9fe to become ready: Duplicate address detected
haproxy doesn't start:
2023-05-31 14:27:39.461 79551 DEBUG neutron.agent.linux.utils [req-a76cfcdd-887b-4c36-86d5-a5eb2b87615c - - - - -] Unable to access /var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy; Error: [Errno 2] No such file or directory: '/var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy' get_value_from_file /usr/lib/python3.9/site-packages/neutron/agent/linux/utils.py:252
2023-05-31 14:27:39.462 79551 DEBUG neutron.agent.linux.utils [req-a76cfcdd-887b-4c36-86d5-a5eb2b87615c - - - - -] Unable to access /var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy; Error: [Errno 2] No such file or directory: '/var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy' get_value_from_file /usr/lib/python3.9/site-packages/neutron/agent/linux/utils.py:252
2023-05-31 14:27:39.463 79551 DEBUG neutron.agent.linux.external_process [req-a76cfcdd-887b-4c36-86d5-a5eb2b87615c - - - - -] No haproxy process started for abd16487-68bb-4090-8ccb-b6ec8a77cc2c disable /usr/lib/python3.9/site-packages/neutron/agent/linux/external_process.py:125
2023-05-31 14:27:39.463 79551 DEBUG neutron.agent.linux.utils [req-a76cfcdd-887b-4c36-86d5-a5eb2b87615c - - - - -] Unable to access /var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy; Error: [Errno 2] No such file or directory: '/var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy' get_value_from_file /usr/lib/python3.9/site-packages/neutron/agent/linux/utils.py:252
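To compare controllers quickly, a check like the following could be run on each one. It is only a sketch: the pid directory and file name pattern are taken from the log lines above, the helper name haproxy_running is made up, and it assumes the script runs where the agent's pid files and /proc are both visible (which may not hold if haproxy runs in a separate container).

```python
from pathlib import Path

# Directory seen in the log lines above; adjust for other deployments.
PID_DIR = Path("/var/lib/neutron/external/pids")


def haproxy_running(network_id, pid_dir=PID_DIR, proc_dir=Path("/proc")):
    """Return True if the network's haproxy pid file points at a live process."""
    pid_file = Path(pid_dir) / f"{network_id}.pid.haproxy"
    try:
        pid = int(pid_file.read_text().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed pid file: treat as not running.
        return False
    return (Path(proc_dir) / str(pid)).exists()


print(haproxy_running("abd16487-68bb-4090-8ccb-b6ec8a77cc2c"))
```

On the affected controller this would be expected to print False, since the pid file was never written.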
Controller metadata IP:
ent': 'RTM_NEWADDR'}, {'family': 2, 'prefixlen': 28, 'flags': 128, 'scope': 0, 'index': 490, 'attrs': [['IFA_ADDRESS', '10.100.0.3'], ['IFA_LOCAL', '10.100.0.3'], ['IFA_BROADCAST', '10.100.0.15'], ['IFA_LABEL', 'tapb07b4b7c-3b'], ['IFA_FLAGS', 128], ['IFA_CACHEINFO', {'ifa_preferred': 4294967295, 'ifa_valid': 4294967295, 'cstamp': 815201, 'tstamp': 815201}]], 'header': {'length': 96, 'type': 20, 'flags': 2, 'sequence_number': 255, 'pid': 699746, 'error': None, 'target': 'qdhcp-abd16487-68bb-4090-8ccb-b6ec8a77cc2c', 'stats': (0, 0, 0)},
Error in the VM, whose metadata route is "ip-route:169.254.169.254 via 10.100.0.3 dev eth0":
failed 14/20: up 34.50. request failed
failed 15/20: up 36.51. request failed
failed 16/20: up 38.53. request failed
failed 17/20: up 40.54. request failed
failed 18/20: up 42.56. request failed
failed 19/20: up 44.57. request failed
failed 20/20: up 46.59. request failed
failed to read iid from metadata. tried 20
failed to get instance-id of datasource
Top of dropbear init script
Starting dropbear sshd: failed to get instance-id of datasource
mkdir: can't create directory '/etc/dropbear': No such file or directory
WARN: generating key of type rsa failed!
WARN: generating key of type ecdsa failed!
OK
GROWROOT: CHANGED: partition=1 start=18432 old: size=210911 end=229343 new: size=2078687,end=2097119
/dev/root resized successfully [took 0.03s]
=== system information ===
Platform: Red Hat OpenStack Compute/RHEL
Container: none
Arch: x86_64
CPU(s): 1 @ 2199.996 MHz
Cores/Sockets/Threads: 1/1/1
Virt-type: VT-x
RAM Size: 100MB
Disks:
NAME MAJ:MIN SIZE LABEL MOUNTPOINT
vda 252:0 1073741824
vda1 252:1 1064287744 cirros-rootfs /
vda15 252:15 8388608
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
Failed reading '/etc/dropbear/dropbear_rsa_host_key'
Failed reading '/etc/dropbear/dropbear_ecdsa_host_key'
-----END SSH HOST KEY KEYS-----
=== network info ===
if-info: lo,up,127.0.0.1,8,,
if-info: eth0,up,10.100.0.10,28,fe80::f816:3eff:fe6b:2f7a/64,
ip-route:default via 10.100.0.1 dev eth0
ip-route:10.100.0.0/28 dev eth0 scope link src 10.100.0.10
ip-route:169.254.169.254 via 10.100.0.3 dev eth0
ip-route6:fe80::/64 dev eth0 metric 256
ip-route6:ff00::/8 dev eth0 metric 256
=== datasource: None None ===
=== cirros: current=0.5.2 uptime=49.31 ===
  ____               ____  ____
 / __/ __ ____ ____ / __ \/ __/
/ /__ / // __// __// /_/ /\ \
\___//_//_/  /_/   \____/___/
   http://cirros-cloud.net
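The link between the guest and the failing controller can be read off the console output: the guest's route to 169.254.169.254 goes via 10.100.0.3, the address held by the controller where DAD failed. A minimal sketch for extracting that next hop from a console log follows; the function name and regular expression are illustrative and only cover the "ip-route:" line format shown above.

```python
import re

# Matches console lines such as "ip-route:169.254.169.254 via 10.100.0.3 dev eth0".
ROUTE_RE = re.compile(r"^ip-route:(?P<dest>\S+) via (?P<via>\S+) dev (?P<dev>\S+)")


def metadata_next_hop(console_log, metadata_ip="169.254.169.254"):
    """Return the gateway the guest uses to reach the metadata IP, or None."""
    for line in console_log.splitlines():
        match = ROUTE_RE.match(line.strip())
        if match and match.group("dest") == metadata_ip:
            return match.group("via")
    return None


console = """\
ip-route:default via 10.100.0.1 dev eth0
ip-route:10.100.0.0/28 dev eth0 scope link src 10.100.0.10
ip-route:169.254.169.254 via 10.100.0.3 dev eth0
"""
print(metadata_next_hop(console))  # prints: 10.100.0.3
```

If the returned address belongs to a DHCP namespace where haproxy was not started, the guest's metadata requests have nowhere to land.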
haproxy started only on the other controller:
2023-05-31 14:27:38.844 81096 DEBUG neutron.agent.linux.utils [req-fbdd788d-0c45-4eaf-8f4c-43d3cf32c511 - - - - -] Unable to access /var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy; Error: [Errno 2] No such file or directory: '/var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy' get_value_from_file /usr/lib/python3.9/site-packages/neutron/agent/linux/utils.py:252
2023-05-31 14:27:38.846 81096 DEBUG neutron.agent.metadata.driver [req-fbdd788d-0c45-4eaf-8f4c-43d3cf32c511 - - - - -] haproxy_cfg =
global
    log /dev/log local0 debug
    log-tag haproxy-metadata-proxy-abd16487-68bb-4090-8ccb-b6ec8a77cc2c
    user neutron
    group neutron
    maxconn 1024
    pidfile /var/lib/neutron/external/pids/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.pid.haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor
    retries 3
    timeout http-request 30s
    timeout connect 30s
    timeout client 32s
    timeout server 32s
    timeout http-keep-alive 30s

listen listener
    bind 169.254.169.254:80
    bind fe80::a9fe:a9fe:80 interface tap6834d3d5-02
    server metadata /var/lib/neutron/metadata_proxy
    http-request del-header X-Neutron-Router-ID
    http-request set-header X-Neutron-Network-ID abd16487-68bb-4090-8ccb-b6ec8a77cc2c
create_config_file /usr/lib/python3.9/site-packages/neutron/agent/metadata/driver.py:162
2023-05-31 14:27:38.847 81096 DEBUG neutron.agent.linux.utils [req-fbdd788d-0c45-4eaf-8f4c-43d3cf32c511 - - - - -] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qdhcp-abd16487-68bb-4090-8ccb-b6ec8a77cc2c', 'haproxy', '-f', '/var/lib/neutron/ns-metadata-proxy/abd16487-68bb-4090-8ccb-b6ec8a77cc2c.conf'] execute_rootwrap_daemon /usr/lib/python3.9/site-packages/neutron/agent/linux/utils.py:108
** Affects: neutron
Importance: Undecided
Assignee: Rodolfo Alonso (rodolfo-alonso-hernandez)
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2022321
Title:
Using Isolated metadata+ipv6 haproxy metadata isn't working because
haproxy container isn't created in some controllers
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2022321/+subscriptions