yahoo-eng-team team mailing list archive
Message #81517
[Bug 1862315] [NEW] Sometimes VMs can't get IP when spawned concurrently
Public bug reported:
Version: Stein
Scenario description:
Rally creates 60 VMs with 6 threads. Each thread:
- creates a VM
- pings it
- if the ping succeeds, tries to reach the VM via ssh and execute a command, retrying for up to 2 minutes
- if ssh succeeds, deletes the VM
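For anyone trying to reproduce this without Rally, the per-thread loop above can be sketched roughly as follows. This is not Rally's API: `create_vm`, `ping`, `ssh_exec` and `delete_vm` are hypothetical callables supplied by the caller, and the 1 second retry interval is an assumption.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_iteration(create_vm, ping, ssh_exec, delete_vm,
                  ssh_timeout=120.0, retry_interval=1.0):
    """One iteration: create a VM, ping it, retry ssh, delete on success."""
    vm = create_vm()
    if not ping(vm):
        # This is the failure seen in the report: the VM never got an IP.
        return "ping_failed"
    deadline = time.monotonic() + ssh_timeout
    while True:
        if ssh_exec(vm):
            delete_vm(vm)  # the VM is deleted only after a successful ssh
            return "ok"
        if time.monotonic() >= deadline:
            return "ssh_failed"
        time.sleep(retry_interval)  # assumed retry interval


def run_scenario(create_vm, ping, ssh_exec, delete_vm,
                 total=60, workers=6, **kwargs):
    """Run `total` iterations on `workers` threads and count the outcomes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_iteration, create_vm, ping, ssh_exec,
                        delete_vm, **kwargs)
            for _ in range(total)
        ]
        return Counter(f.result() for f in futures)
```

The defaults (60 iterations, 6 workers, 120 seconds) mirror the numbers in the scenario description.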
For some VMs the ping fails. The console log shows that the VM failed to get
an IP address from DHCP.
tcpdump on the corresponding DHCP port shows the VM's DHCP requests, but dnsmasq does not reply.
From dnsmasq logs:
Feb 6 00:15:43 dnsmasq[4175]: read /var/lib/neutron/dhcp/da73026e-09b9-4f8d-bbdd-84d89c2487b2/addn_hosts - 28 addresses
Feb 6 00:15:43 dnsmasq[4175]: duplicate dhcp-host IP address 10.2.0.194 at line 28 of /var/lib/neutron/dhcp/da73026e-09b9-4f8d-bbdd-84d89c2487b2/host
So something must be wrong with the neutron-dhcp-agent network cache.
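A quick way to confirm which entries collide is to scan the generated host file directly. The sketch below assumes the host file holds one comma-separated dhcp-host entry per line with the IP address as the last field; the sample MAC addresses and hostnames are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_ips(host_file_text):
    """Return {ip: [line numbers]} for every IP listed more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(host_file_text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        # Assumption: the IP is the last comma-separated field;
        # brackets around IPv6 addresses are stripped.
        ip = line.rsplit(",", 1)[-1].strip("[]")
        seen[ip].append(lineno)
    return {ip: lines for ip, lines in seen.items() if len(lines) > 1}


# Hypothetical sample content, not taken from the affected node.
sample = """\
fa:16:3e:00:00:01,host-10-2-0-194.openstacklocal,10.2.0.194
fa:16:3e:00:00:02,host-10-2-0-195.openstacklocal,10.2.0.195
fa:16:3e:00:00:03,host-10-2-0-194.openstacklocal,10.2.0.194
"""
duplicates = find_duplicate_ips(sample)
```

Running it against the real `host` file for the network should show whether a stale port entry is still present next to the new one.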
From neutron-dhcp-agent log:
2020-02-06 00:15:20.282 40 DEBUG neutron.agent.dhcp.agent [req-f5107bdd-d53a-4171-a283-de3d7cf7c708 - - - - -] Resync event has been scheduled _periodic_resync_helper /var/lib/openstack/lib/python3.6/site-packages/neutron/agent/dhcp/agent.py:276
2020-02-06 00:15:20.282 40 DEBUG neutron.common.utils [req-f5107bdd-d53a-4171-a283-de3d7cf7c708 - - - - -] Calling throttled function clear wrapper /var/lib/openstack/lib/python3.6/site-packages/neutron/common/utils.py:102
2020-02-06 00:15:20.283 40 DEBUG neutron.agent.dhcp.agent [req-f5107bdd-d53a-4171-a283-de3d7cf7c708 - - - - -] resync (da73026e-09b9-4f8d-bbdd-84d89c2487b2): ['Duplicate IP addresses found, DHCP cache is out of sync'] _periodic_resync_helper /var/lib/openstack/lib/python3.6/site-packages/neutron/agent/dhcp/agent.py:293
So the agent is aware that its cache for the network is invalid, but for an
unknown reason the actual network resync happens only more than 8 minutes later:
2020-02-06 00:23:55.297 40 INFO neutron.agent.dhcp.agent [req-f5107bdd-d53a-4171-a283-de3d7cf7c708 - - - - -] Synchronizing state
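For reference, the exact gap between the moment the agent noticed the out-of-sync cache and the moment it actually synchronized can be computed from the two log timestamps quoted above. This is only a small arithmetic sketch; the function name is made up.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def resync_delay_seconds(detected_at, synced_at):
    """Seconds between two timestamps in the agent log format."""
    detected = datetime.strptime(detected_at, LOG_TIME_FORMAT)
    synced = datetime.strptime(synced_at, LOG_TIME_FORMAT)
    return (synced - detected).total_seconds()


# Timestamps copied from the "resync (...)" and "Synchronizing state" lines.
delay = resync_delay_seconds("2020-02-06 00:15:20.283",
                             "2020-02-06 00:23:55.297")
```

The result is roughly 515 seconds, i.e. about 8 minutes 35 seconds during which dnsmasq keeps running with the bad host file.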
** Affects: neutron
Importance: High
Assignee: Oleg Bondarev (obondarev)
Status: New
** Tags: l3-ipam-dhcp
https://bugs.launchpad.net/bugs/1862315