yahoo-eng-team team mailing list archive

[Bug 2101840] Re: ovs jobs randomly failing with guest vms not getting ip from dhcp

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/944826
Committed: https://opendev.org/openstack/neutron/commit/0b7c808185d80a641fb05956a8d2884c79c8be49
Submitter: "Zuul (22348)"
Branch:    master

commit 0b7c808185d80a641fb05956a8d2884c79c8be49
Author: Rodolfo Alonso <ralonsoh@xxxxxxxxxx>
Date:   Tue Mar 18 06:55:54 2025 +0000

    Revert "[eventlet-removal] Remove eventlet from DHCP agent"
    
    This reverts commit 534c2dfdaa839ed7b30d1c2ce6d53fb8954b2be2.
    
    Reason for revert: the DHCP agent is still using "oslo.service". The
    only backend implemented so far is "eventlet" (see c#2 in LP#2101840).
    This revert is temporary until we have an "oslo.service" release
    using kernel libraries.
    
    [1]https://review.opendev.org/c/openstack/neutron/+/942393
    
    Change-Id: I8bd0580bb830e0317ca6663c67b58842c34b87b4
    Closes-Bug: #2101840


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2101840

Title:
  ovs jobs randomly failing with guest vms not getting ip from dhcp

Status in neutron:
  Fix Released

Bug description:
  Creating a new bug rather than reusing
  https://bugs.launchpad.net/neutron/+bug/2045549, as we are seeing this
  more often now, so it might be related to the recent eventlet cleanups
  in master.

  Example failures:-
  - https://6680d19461a57172eaa9-71ec100da363a5820f9bef22bce9fb5d.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-with-os-ken-master/30c29c2/testr_results.html
  - https://fc891d7e1f0b0d1ecc9c-5406e7d03d51190d7ee83b25b2acffe7.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-with-os-ken-master/3d07efc/testr_results.html
  - https://e6aae367021d38c9d4db-1eee5d6654f19bfbb50f61b5642c3fcd.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-with-oslo-master/5db5fe8/testr_results.html
  - https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_682/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-fips/6822cad/testr_results.html
  - https://b2b9d08c5b85917c3495-18cd7f889ae1379763e7ec0a1ec70a8d.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-with-os-ken-master/ae8c73a/testr_results.html
  - https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c6b/periodic/opendev.org/openstack/neutron/master/neutron-ovs-tempest-plugin-iptables_hybrid-nftables/c6b70a9/testr_results.html

  Checking the last failure in the list above:
  The test fails to SSH via the floating IP:-
  2025-03-09 02:52:17,690 87771 WARNING  [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.180 ([Errno None] Unable to connect to port 22 on 172.24.5.180). Number attempts: 19. Retry after 20 seconds.
  2025-03-09 02:52:41,242 87771 WARNING  [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.180 ([Errno None] Unable to connect to port 22 on 172.24.5.180). Number attempts: 20. Retry after 21 seconds.
  2025-03-09 02:53:05,817 87771 ERROR    [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.180 after 20 attempts. Proxy client: no proxy client
  2025-03-09 02:53:05.817 87771 ERROR tempest.lib.common.ssh Traceback (most recent call last):
  2025-03-09 02:53:05.817 87771 ERROR tempest.lib.common.ssh   File "/opt/stack/tempest/tempest/lib/common/ssh.py", line 136, in _get_ssh_connection
  2025-03-09 02:53:05.817 87771 ERROR tempest.lib.common.ssh     ssh.connect(self.host, port=self.port, username=self.username,
  2025-03-09 02:53:05.817 87771 ERROR tempest.lib.common.ssh   File "/opt/stack/tempest/.tox/tempest/lib/python3.12/site-packages/paramiko/client.py", line 409, in connect
  2025-03-09 02:53:05.817 87771 ERROR tempest.lib.common.ssh     raise NoValidConnectionsError(errors)
  2025-03-09 02:53:05.817 87771 ERROR tempest.lib.common.ssh paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 172.24.5.180

  
  From the VM console log, the guest didn't get an IP from DHCP:-
  ### /etc/init.d/sshd start
  Top of dropbear init script
  Starting dropbear sshd: failed to get instance-id of datasource
  mkdir: can't create directory '/etc/dropbear': No such file or directory
  WARN: generating key of type rsa failed!
  WARN: generating key of type ecdsa failed!
  FAIL
  ### ifconfig -a
  eth0      Link encap:Ethernet  HWaddr FA:16:3E:04:97:83  
            inet6 addr: fe80::f816:3eff:fe04:9783/64 Scope:Link
            UP BROADCAST RUNNING MULTICAST  MTU:1380  Metric:1
            RX packets:117 errors:0 dropped:0 overruns:0 frame:0
            TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
            collisions:0 txqueuelen:1000 
            RX bytes:5654 (5.5 KiB)  TX bytes:1892 (1.8 KiB)

  
  From the DHCP agent logs:-
  Mar 09 02:48:00.817622 np0040117219 dnsmasq-dhcp[88108]: DHCPDISCOVER(tap6d8b1e0a-4b) fa:16:3e:04:97:83 no address available
  Mar 09 02:49:00.865746 np0040117219 dnsmasq-dhcp[88108]: DHCPDISCOVER(tap6d8b1e0a-4b) fa:16:3e:04:97:83 no address available
  Mar 09 02:50:00.930199 np0040117219 dnsmasq-dhcp[88108]: DHCPDISCOVER(tap6d8b1e0a-4b) fa:16:3e:04:97:83 no address available
  Mar 09 02:50:16.221733 np0040117219 neutron-dhcp-agent[64177]: DEBUG neutron.agent.dhcp.agent [-] Pending events to be processed: 233 {{(pid=64177) _process_resource_update /opt/stack/neutron/neutron/agent/dhcp/agent.py:597}}
  Mar 09 02:50:16.222351 np0040117219 neutron-dhcp-agent[64177]: DEBUG neutron.agent.dhcp.agent [-] neutron.agent.dhcp.agent.DhcpAgentWithStateReport method _port_create called with arguments (admin_state_up=True, allowed_address_pairs=[], binding:host_id=, binding:profile=, binding:vif_details=, binding:vif_type=unbound, binding:vnic_type=normal, created_at=2025-03-09T02:47:49Z, description=, device_id=93c28b39-3e90-4d4c-8a27-04d04a4224e3, device_owner=, dns_assignment=[fqdn=host-10-1-0-5.openstackgate.local., hostname=host-10-1-0-5, ip_address=10.1.0.5], dns_domain=, dns_name=, extra_dhcp_opts=[], fixed_ips=[ip_address=10.1.0.5, subnet_id=251d60e3-4355-45a3-b0f7-9b99794ad793], id=fdeeaa61-d821-4c28-9a5f-3fa54466002c, ip_allocation=immediate, mac_address=fa:16:3e:04:97:83, name=, network=admin_state_up=True, availability_zone_hints=[], availability_zones=['nova'], created_at=2025-03-09T02:46:12Z, description=, dns_domain=, id=89dcf835-6757-4b37-a434-a790066e5142, ipv4_address_scope=None, ipv6_address_scope=None, l2_adjacency=True, mtu=1380, name=tempest-AttachInterfacesTestJSON-1573214142-network, port_security_enabled=True, project_id=fa0d2441acf841a18af24edd7bc278f4, provider:network_type=vxlan, provider:physical_network=None, provider:segmentation_id=5, qinq=None, qos_policy_id=None, revision_number=2, router:external=False, shared=False, standard_attr_id=63, status=ACTIVE, subnets=['251d60e3-4355-45a3-b0f7-9b99794ad793'], tags=[], tenant_id=fa0d2441acf841a18af24edd7bc278f4, updated_at=2025-03-09T02:46:12Z, vlan_transparent=None, network_id=89dcf835-6757-4b37-a434-a790066e5142, port_security_enabled=True, project_id=fa0d2441acf841a18af24edd7bc278f4, propagate_uplink_status=True, qos_network_policy_id=None, qos_policy_id=None, resource_request=None, revision_number=1, security_groups=['7da7083c-7d1f-4714-ad8e-7092b95ed3f5'], standard_attr_id=701, status=DOWN, tags=[], tenant_id=fa0d2441acf841a18af24edd7bc278f4, trusted=None, updated_at=2025-03-09T02:47:50Z,) {} 
{{(pid=64177) wrapper /opt/stack/data/venv/lib/python3.12/site-packages/oslo_log/helpers.py:65}}
  Mar 09 02:50:16.222894 np0040117219 neutron-dhcp-agent[64177]: INFO neutron.agent.dhcp.agent [-] Trigger reload_allocations for port admin_state_up=True, allowed_address_pairs=[], binding:host_id=, binding:profile=, binding:vif_details=, binding:vif_type=unbound, binding:vnic_type=normal, created_at=2025-03-09T02:47:49Z, description=, device_id=93c28b39-3e90-4d4c-8a27-04d04a4224e3, device_owner=, dns_assignment=[fqdn=host-10-1-0-5.openstackgate.local., hostname=host-10-1-0-5, ip_address=10.1.0.5], dns_domain=, dns_name=, extra_dhcp_opts=[], fixed_ips=[ip_address=10.1.0.5, subnet_id=251d60e3-4355-45a3-b0f7-9b99794ad793], id=fdeeaa61-d821-4c28-9a5f-3fa54466002c, ip_allocation=immediate, mac_address=fa:16:3e:04:97:83, name=, network=admin_state_up=True, availability_zone_hints=[], availability_zones=['nova'], created_at=2025-03-09T02:46:12Z, description=, dns_domain=, id=89dcf835-6757-4b37-a434-a790066e5142, ipv4_address_scope=None, ipv6_address_scope=None, l2_adjacency=True, mtu=1380, name=tempest-AttachInterfacesTestJSON-1573214142-network, port_security_enabled=True, project_id=fa0d2441acf841a18af24edd7bc278f4, provider:network_type=vxlan, provider:physical_network=None, provider:segmentation_id=5, qinq=None, qos_policy_id=None, revision_number=2, router:external=False, shared=False, standard_attr_id=63, status=ACTIVE, subnets=['251d60e3-4355-45a3-b0f7-9b99794ad793'], tags=[], tenant_id=fa0d2441acf841a18af24edd7bc278f4, updated_at=2025-03-09T02:46:12Z, vlan_transparent=None, network_id=89dcf835-6757-4b37-a434-a790066e5142, port_security_enabled=True, project_id=fa0d2441acf841a18af24edd7bc278f4, propagate_uplink_status=True, qos_network_policy_id=None, qos_policy_id=None, resource_request=None, revision_number=1, security_groups=['7da7083c-7d1f-4714-ad8e-7092b95ed3f5'], standard_attr_id=701, status=DOWN, tags=[], tenant_id=fa0d2441acf841a18af24edd7bc278f4, trusted=None, updated_at=2025-03-09T02:47:50Z on network 89dcf835-6757-4b37-a434-a790066e5142
  Mar 09 02:50:16.223469 np0040117219 neutron-dhcp-agent[64177]: DEBUG neutron.agent.dhcp.agent [-] Calling driver for network: 89dcf835-6757-4b37-a434-a790066e5142/seg=None action: reload_allocations {{(pid=64177) _call_driver /opt/stack/neutron/neutron/agent/dhcp/agent.py:233}}
  Mar 09 02:50:16.224636 np0040117219 neutron-dhcp-agent[64177]: DEBUG neutron.agent.linux.dhcp [-] Building host file: /opt/stack/data/neutron/dhcp/89dcf835-6757-4b37-a434-a790066e5142/host {{(pid=64177) _output_hosts_file /opt/stack/neutron/neutron/agent/linux/dhcp.py:944}}
  Mar 09 02:50:16.225296 np0040117219 neutron-dhcp-agent[64177]: DEBUG neutron.agent.linux.dhcp [-] Done building host file /opt/stack/data/neutron/dhcp/89dcf835-6757-4b37-a434-a790066e5142/host {{(pid=64177) _output_hosts_file /opt/stack/neutron/neutron/agent/linux/dhcp.py:985}}
  Mar 09 02:50:16.738386 np0040117219 dnsmasq[88108]: read /opt/stack/data/neutron/dhcp/89dcf835-6757-4b37-a434-a790066e5142/addn_hosts - 4 names
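
  The delay between the port being created and the agent handling the
  event can be read off the log above. A minimal sketch of that
  arithmetic follows. It assumes the journal timestamps are in UTC and
  belong to 2025 (the journal lines carry no year or time zone); the
  variable names are only illustrative.

```python
from datetime import datetime, timezone

# created_at of port fdeeaa61-d821-4c28-9a5f-3fa54466002c, from the log.
created_at = datetime.fromisoformat("2025-03-09T02:47:49+00:00")

# Journal timestamp of the "_port_create called" line.
# Assumption: year 2025 and UTC, neither of which is in the journal line.
processed_at = datetime(2025, 3, 9, 2, 50, 16, 222351, tzinfo=timezone.utc)

delay = processed_at - created_at
print(f"port_create handled {delay.total_seconds():.1f}s after creation")
```

  That is roughly two and a half minutes, during which dnsmasq answered
  three DHCPDISCOVERs from fa:16:3e:04:97:83 with "no address
  available".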

  Seeing a high number of pending events (233 in the log above); could
  that be what is causing these failures?
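
  To see how the backlog evolves over a run, the "Pending events to be
  processed" counts can be pulled out of the agent log. This is a
  minimal sketch, not part of Neutron: the function name
  pending_event_counts is hypothetical, and the regular expression
  assumes the message format shown in the log line above.

```python
import re

# Assumed message format, taken from the DHCP agent DEBUG line above.
PENDING_RE = re.compile(r"Pending events to be processed: (\d+)")


def pending_event_counts(lines):
    """Return the pending-event counts found in an iterable of log lines."""
    counts = []
    for line in lines:
        match = PENDING_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


sample = [
    "Mar 09 02:50:16.221733 np0040117219 neutron-dhcp-agent[64177]: DEBUG "
    "neutron.agent.dhcp.agent [-] Pending events to be processed: 233 "
    "{{(pid=64177) _process_resource_update "
    "/opt/stack/neutron/neutron/agent/dhcp/agent.py:597}}",
]
print(pending_event_counts(sample))
```

  Running it over a full agent log from a failing job would show
  whether the queue keeps growing or drains between bursts.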

  This may be related to recent eventlet cleanup changes such as
  https://review.opendev.org/c/openstack/neutron/+/942393

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2101840/+subscriptions


