yahoo-eng-team team mailing list archive
Message #81653
[Bug 1863858] [NEW] socket.timeout error in dvr CI jobs cause SSH issues
Public bug reported:
Random tests fail, mostly in the neutron-tempest-dvr job, because of
problems with SSH to the instance. The error always looks like this:
2020-02-18 18:24:34,987 22897 INFO [tempest.lib.common.ssh] Creating ssh connection to '172.24.5.96:22' as 'cirros' with public key authentication
2020-02-18 18:25:35,048 22897 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.96 (timed out). Number attempts: 1. Retry after 2 seconds.
2020-02-18 18:26:37,609 22897 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.96 (timed out). Number attempts: 2. Retry after 3 seconds.
2020-02-18 18:27:41,173 22897 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.96 (timed out). Number attempts: 3. Retry after 4 seconds.
2020-02-18 18:28:45,701 22897 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.96 (timed out). Number attempts: 4. Retry after 5 seconds.
2020-02-18 18:29:51,265 22897 ERROR [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.96 after 4 attempts
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh Traceback (most recent call last):
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 107, in _get_ssh_connection
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh sock=proxy_chan)
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/client.py", line 349, in connect
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh retry_on_signal(lambda: sock.connect(addr))
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/util.py", line 283, in retry_on_signal
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh return function()
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/client.py", line 349, in <lambda>
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh retry_on_signal(lambda: sock.connect(addr))
2020-02-18 18:29:51.265 22897 ERROR tempest.lib.common.ssh socket.timeout: timed out
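The timestamps in the log above can be turned into per-attempt durations. The sketch below copies the six timestamps from the log, subtracts the announced retry delay from each gap, and returns what is left. The helper name connect_durations and the two constants are made up for illustration. Every remainder comes out at roughly 60 seconds, which suggests (but does not prove) that each attempt runs into a fixed connect timeout rather than being refused quickly.

```python
from datetime import datetime

# Timestamps copied from the tempest log lines above.
STAMPS = [
    "2020-02-18 18:24:34,987",  # "Creating ssh connection"
    "2020-02-18 18:25:35,048",  # failure 1, "Retry after 2 seconds"
    "2020-02-18 18:26:37,609",  # failure 2, "Retry after 3 seconds"
    "2020-02-18 18:27:41,173",  # failure 3, "Retry after 4 seconds"
    "2020-02-18 18:28:45,701",  # failure 4, "Retry after 5 seconds"
    "2020-02-18 18:29:51,265",  # final ERROR line
]

# Delay announced before each interval (none before the first attempt).
RETRY_DELAYS = [0, 2, 3, 4, 5]


def connect_durations(stamps, delays):
    """Return seconds between log lines, minus the announced retry delay."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S,%f") for s in stamps]
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return [round(gap - delay, 3) for gap, delay in zip(gaps, delays)]


if __name__ == "__main__":
    print(connect_durations(STAMPS, RETRY_DELAYS))
```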
And then at the end of the test:
Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 107, in _get_ssh_connection
    sock=proxy_chan)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/client.py", line 349, in connect
    retry_on_signal(lambda: sock.connect(addr))
  File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/util.py", line 283, in retry_on_signal
    return function()
  File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/paramiko/client.py", line 349, in <lambda>
    retry_on_signal(lambda: sock.connect(addr))
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper
    return f(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/api/compute/servers/test_attach_interfaces.py", line 229, in test_create_list_show_delete_interfaces_by_network_port
    server, ifs = self._create_server_get_interfaces()
  File "/opt/stack/tempest/tempest/api/compute/servers/test_attach_interfaces.py", line 88, in _create_server_get_interfaces
    self._wait_for_validation(server, validation_resources)
  File "/opt/stack/tempest/tempest/api/compute/servers/test_attach_interfaces.py", line 73, in _wait_for_validation
    linux_client.validate_authentication()
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 60, in wrapper
    six.reraise(*original_exception)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 33, in wrapper
    return function(self, *args, **kwargs)
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 116, in validate_authentication
    self.ssh_client.test_connection_auth()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 209, in test_connection_auth
    connection = self._get_ssh_connection()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 121, in _get_ssh_connection
    password=self.password)
tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.96 via SSH timed out.
User: cirros, Password: password
From the console log it seems that the fixed IP was properly configured on the instance and that the metadata service worked fine too.
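Because the traceback ends in sock.connect(addr), the timeout happens while the TCP connection is being opened, before any SSH authentication. To tell a floating IP that is unreachable at the network level from an SSH daemon problem, a plain TCP probe of port 22 can help. The following is only a minimal sketch and not tempest's or paramiko's code: the function name probe_tcp and its connect and sleep parameters are hypothetical, and the 60 second timeout and growing retry delay are loosely modeled on the log above.

```python
import socket
import time


def probe_tcp(host, port=22, timeout=60.0, attempts=4,
              connect=socket.create_connection, sleep=time.sleep):
    """Return the attempt number that connected, or None if all failed.

    `connect` and `sleep` are injectable (illustration only) so the retry
    loop can be exercised without a real network.
    """
    for attempt in range(1, attempts + 1):
        try:
            sock = connect((host, port), timeout)
        except OSError as exc:  # socket.timeout is a subclass of OSError
            print("attempt %d failed: %s" % (attempt, exc))
            if attempt < attempts:
                sleep(attempt + 1)  # wait 2 s, 3 s, 4 s, ... between tries
            continue
        sock.close()
        return attempt
    return None


if __name__ == "__main__":
    # 172.24.5.96 is the floating IP from the log above; replace as needed.
    print(probe_tcp("172.24.5.96", timeout=5.0))
```

If the probe also times out when run from the node that executes tempest, the problem most likely lies in the path to the floating IP (for example DVR routing or NAT) rather than in the guest's SSH service.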
** Affects: neutron
Importance: Critical
Status: Confirmed
** Tags: gate-failure l3-dvr-backlog tempest
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1863858
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1863858/+subscriptions