yahoo-eng-team team mailing list archive
Message #62736
[Bug 1677131] [NEW] race condition between interface detach and port delete
Public bug reported:
Environment:
OpenStack Mitaka, Ubuntu 16.04, Neutron in DVR mode, 3 control nodes and 8 compute nodes, with dhcp-agent running on compute nodes.
Steps To Reproduce:
1. Create a port with a given MAC address
2. Create an instance with the port created in the previous step
3. Detach the port from the instance
4. Delete the port
5. Create a port with the same MAC address
6. Attach the port to the instance
7. Detach the port from the instance
8. Delete the port
Repeat steps 5 - 8 until the instance gets a wrong IP address from dhcp-agent.
Observed Result:
1. The instance gets a wrong IP address from dhcp-agent
2. /var/lib/neutron/dhcp/${network_id}/host on the compute node has duplicate MAC address entries with different IP addresses
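To check for the second symptom, a small script can scan a copy of the host file for MAC addresses that appear more than once. This is only a sketch: it assumes each line is comma-separated with the MAC address as the first field and the IP address as the last, and the sample entries and the function name find_duplicate_macs are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_macs(host_file_text):
    """Return {mac: [ip, ...]} for every MAC that appears on more than one line."""
    ips_by_mac = defaultdict(list)
    for line in host_file_text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        # Assumed layout: MAC address first, IP address last.
        mac, ip = fields[0].lower(), fields[-1]
        ips_by_mac[mac].append(ip)
    return {mac: ips for mac, ips in ips_by_mac.items() if len(ips) > 1}


# Made-up sample content; a real check would read
# /var/lib/neutron/dhcp/${network_id}/host instead.
sample = """\
fa:16:3e:00:00:01,host-10-0-0-5.openstacklocal,10.0.0.5
fa:16:3e:00:00:02,host-10-0-0-6.openstacklocal,10.0.0.6
fa:16:3e:00:00:01,host-10-0-0-9.openstacklocal,10.0.0.9
"""

print(find_duplicate_macs(sample))
```

An empty result means no MAC address is listed twice in the text that was scanned.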
When detaching a port, nova-compute sends a request (PUT
/v2.0/ports/${port_id}) to neutron, and then we call delete port (DELETE
/v2.0/ports/${port_id}). These requests arrive at neutron in order, but
neutron sends the delete notification to dhcp-agent first (dhcp-agent
removes this port's record from the
/var/lib/neutron/dhcp/${network_id}/host file), and then sends the PUT
notification (dhcp-agent writes this port's record back to the same
file, even though the port has already been removed from the neutron
db). As a result, if we create a port with the same MAC address again,
there are two records with the same MAC address in the host file.
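The ordering problem described above can be illustrated with a toy model. This is not neutron code: ToyDhcpAgent, its methods, the port ids and the IP addresses are all hypothetical, and the host file is modelled as a dict keyed by port id. It only shows why handling the delete notification before the update notification leaves a stale record behind.

```python
class ToyDhcpAgent:
    """Hypothetical stand-in for dhcp-agent; host-file records keyed by port id."""

    def __init__(self):
        self.host_entries = {}  # port_id -> (mac, ip)

    def handle_update(self, port):
        # An update notification (re)writes the record for this port.
        self.host_entries[port["id"]] = (port["mac"], port["ip"])

    def handle_delete(self, port):
        # A delete notification removes the record, if one exists.
        self.host_entries.pop(port["id"], None)


def replay(notifications):
    """Apply notifications in the given order and return the resulting records."""
    agent = ToyDhcpAgent()
    for kind, port in notifications:
        if kind == "update":
            agent.handle_update(port)
        else:
            agent.handle_delete(port)
    return sorted(agent.host_entries.values())


MAC = "fa:16:3e:00:00:01"
old_port = {"id": "port-a", "mac": MAC, "ip": "10.0.0.5"}
new_port = {"id": "port-b", "mac": MAC, "ip": "10.0.0.9"}

# Order in which the API requests arrive: PUT (detach), DELETE, then the new port.
in_order = [("update", old_port), ("delete", old_port), ("update", new_port)]
# Order in which the notifications reach the agent in this report.
reordered = [("delete", old_port), ("update", old_port), ("update", new_port)]

print(replay(in_order))   # one record for the MAC
print(replay(reordered))  # two records for the same MAC, different IPs
```

In the reordered case the late update re-adds a record for a port that no longer exists, so the record for the new port with the same MAC address ends up next to it.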
Some useful logs:
/var/log/neutron/neutron-server.log [http://paste.openstack.org/show/604613/] --- timezone: CDT
/var/log/neutron/neutron-dhcp-agent.log [http://paste.openstack.org/show/604614/] --- timezone: CEST
Use the request id to trace the logs above. Sorry for the misconfigured
timezones.
Perceived severity:
I think the severity is high for us. We perform many port operations in our production environment, which serves 100+ VMs. I think I have found the root cause, but I have no idea how to solve this problem. I sincerely hope that you can help me.
Related issue:
https://bugs.launchpad.net/neutron/+bug/1288493
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1677131
Title:
race condition between interface detach and port delete
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1677131/+subscriptions