yahoo-eng-team team mailing list archive

[Bug 2045549] [NEW] OVS jobs randomly fails as Guest VMs not(or delayed) configured with DHCP

 

Public bug reported:

Seen a couple of hits recently. Tests fail with:-
Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 136, in _get_ssh_connection
    ssh.connect(self.host, port=self.port, username=self.username,
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/paramiko/client.py", line 409, in connect
    raise NoValidConnectionsError(errors)
paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 172.24.5.37

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)
  File "/opt/stack/tempest/tempest/api/compute/servers/test_attach_interfaces.py", line 278, in test_create_list_show_delete_interfaces_by_fixed_ip
    server, ifs, _ = self._create_server_get_interfaces()
  File "/opt/stack/tempest/tempest/api/compute/servers/test_attach_interfaces.py", line 88, in _create_server_get_interfaces
    self._wait_for_validation(server, validation_resources)
  File "/opt/stack/tempest/tempest/api/compute/servers/test_attach_interfaces.py", line 73, in _wait_for_validation
    linux_client.validate_authentication()
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 31, in wrapper
    return function(self, *args, **kwargs)
  File "/opt/stack/tempest/tempest/lib/common/utils/linux/remote_client.py", line 123, in validate_authentication
    self.ssh_client.test_connection_auth()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 245, in test_connection_auth
    connection = self._get_ssh_connection()
  File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 155, in _get_ssh_connection
    raise exceptions.SSHTimeout(host=self.host,
tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.37 via SSH timed out.
User: cirros, Password: password

From the console logs of the VM, the eth0 interface is not configured correctly; it has an IPv4LL address instead:-
Nov 25 08:44:45 cirros daemon.info dhcpcd[316]: eth0: using IPv4LL address 169.254.203.238
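
To spot this symptom in a saved console log, a minimal sketch along these lines may help. The regex is an assumption derived only from the single dhcpcd line quoted above, and find_ipv4ll is a hypothetical helper name, not part of tempest or neutron:

```python
import ipaddress
import re

# Pattern assumed from the one dhcpcd line quoted in this report.
IPV4LL_RE = re.compile(r"dhcpcd\[\d+\]: (\S+): using IPv4LL address (\S+)")


def find_ipv4ll(console_log):
    """Return (interface, address) pairs where dhcpcd fell back to link-local."""
    hits = []
    for line in console_log.splitlines():
        match = IPV4LL_RE.search(line)
        if not match:
            continue
        iface, addr = match.groups()
        try:
            # 169.254.0.0/16 is the IPv4 link-local range.
            if ipaddress.ip_address(addr).is_link_local:
                hits.append((iface, addr))
        except ValueError:
            # Not a parseable address; skip the line.
            continue
    return hits
```

A non-empty result means the guest gave up on DHCP for that interface, which would explain why SSH to the floating IP never succeeds.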

From the DHCP agent logs, there were DHCPDISCOVER/DHCPOFFER messages but no DHCPREQUEST/DHCPACK:-
Nov 25 08:44:35.898432 np0035864589 dnsmasq-dhcp[98241]: DHCPDISCOVER(tapb331ed5f-9e) fa:16:3e:fa:8b:d7
Nov 25 08:44:35.898464 np0035864589 dnsmasq-dhcp[98241]: DHCPOFFER(tapb331ed5f-9e) 10.1.0.14 fa:16:3e:fa:8b:d7
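
A rough way to find such half-finished exchanges across a whole DHCP agent log is sketched below. The regex is an assumption based on the dnsmasq-dhcp lines quoted in this report, and incomplete_handshakes is a hypothetical helper name:

```python
import re
from collections import defaultdict

# Assumed line shape: dnsmasq-dhcp[PID]: DHCPxxx(tap) [ip] mac ...
DHCP_RE = re.compile(
    r"dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\((\S+?)\) "
    r"(?:(\d+\.\d+\.\d+\.\d+) )?"
    r"([0-9a-f]{2}(?::[0-9a-f]{2}){5})"
)


def incomplete_handshakes(log_text):
    """Return sorted (tap, mac) pairs that got a DHCPOFFER but never a DHCPACK."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = DHCP_RE.search(line)
        if match:
            msg, tap, _ip, mac = match.groups()
            seen[(tap, mac)].add(msg)
    return sorted(
        key for key, msgs in seen.items()
        if "DHCPOFFER" in msgs and "DHCPACK" not in msgs
    )
```

For the two lines above this would report the pair (tapb331ed5f-9e, fa:16:3e:fa:8b:d7), i.e. the offer was sent but the guest never requested the address.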

In other failures it looked different: DHCP took a long time to configure, and in the meantime the metadata request failed (failed 20/20: up 56.38. request failed):-
Nov 22 06:08:28.142795 np0035837630 dnsmasq-dhcp[104598]: DHCPDISCOVER(tap2f7f2c03-6d) fa:16:3e:10:67:5f no address available
Nov 22 06:08:33.071307 np0035837630 dnsmasq-dhcp[104598]: DHCPDISCOVER(tap2f7f2c03-6d) fa:16:3e:10:67:5f no address available
Nov 22 06:08:42.063921 np0035837630 dnsmasq-dhcp[104598]: DHCPDISCOVER(tap2f7f2c03-6d) fa:16:3e:10:67:5f no address available
Nov 22 06:09:29.752568 np0035837630 dnsmasq-dhcp[104598]: DHCPDISCOVER(tap2f7f2c03-6d) fa:16:3e:10:67:5f
Nov 22 06:09:29.752593 np0035837630 dnsmasq-dhcp[104598]: DHCPOFFER(tap2f7f2c03-6d) 10.1.0.26 fa:16:3e:10:67:5f
Nov 22 06:09:29.756191 np0035837630 dnsmasq-dhcp[104598]: DHCPREQUEST(tap2f7f2c03-6d) 10.1.0.26 fa:16:3e:10:67:5f
Nov 22 06:09:29.756218 np0035837630 dnsmasq-dhcp[104598]: DHCPACK(tap2f7f2c03-6d) 10.1.0.26 fa:16:3e:10:67:5f tempest-server-test-761246100
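
To put a number on the delay, the time from the first DHCPDISCOVER to the DHCPACK can be computed per tap device. This is only a sketch: the line format is assumed from the excerpt above, seconds_to_ack is a hypothetical helper, and the year has to be supplied by the caller because the syslog timestamps do not carry one:

```python
import re
from datetime import datetime

# Assumed prefix: "Mon DD HH:MM:SS.ffffff host dnsmasq-dhcp[PID]: DHCPxxx(tap)"
LINE_RE = re.compile(
    r"(\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+) \S+ "
    r"dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\((\S+?)\)"
)


def seconds_to_ack(log_text, year):
    """Map tap name -> seconds between its first DHCPDISCOVER and first DHCPACK."""
    first_discover = {}
    delays = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        stamp, msg, tap = match.groups()
        # The year is an assumption passed in by the caller.
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
        if msg == "DHCPDISCOVER":
            first_discover.setdefault(tap, ts)
        elif msg == "DHCPACK" and tap in first_discover and tap not in delays:
            delays[tap] = (ts - first_discover[tap]).total_seconds()
    return delays
```

Applied to the seven lines above, this gives roughly 61.6 seconds for tap2f7f2c03-6d, which is consistent with the metadata request giving up at "up 56.38" before the address was assigned.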


Example builds:-
https://zuul.openstack.org/build/3e27967397a64fb587d8aae2ff215d10
https://zuul.openstack.org/build/fb288858b1e14d4893bc3710abca38d8
https://zuul.openstack.org/build/3f16434fef6a4ae5846bd11d49aab8ad
https://zuul.openstack.org/build/b7722266e9134fb2ae38484d284bacd3
https://zuul.openstack.org/build/abd7aada5fd84371a78182effb7f0050

Opensearch:-
https://opensearch.logs.openstack.org/_dashboards/app/discover/?security_tenant=global#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(columns:!(_source),filters:!(),index:'94869730-aea8-11ec-9e6a-83741af3fdcd',interval:auto,query:(language:kuery,query:'message:%22:%20using%20IPv4LL%20address%22'),sort:!())

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2045549

Title:
  OVS jobs randomly fails as Guest VMs not(or delayed) configured with
  DHCP

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2045549/+subscriptions