yahoo-eng-team team mailing list archive

[Bug 1677131] [NEW] race condition between interface detach and port delete

 

Public bug reported:

Environment:
OpenStack Mitaka, Ubuntu 16.04, Neutron in DVR mode, 3 control nodes and 8 compute nodes, with dhcp-agent running on the compute nodes.

Steps To Reproduce:
1. Create a port with a given MAC address
2. Create an instance with the port created in the previous step
3. Detach the port from the instance
4. Delete the port
5. Create a port with the same MAC address
6. Attach the port to the instance
7. Detach the port from the instance
8. Delete the port
Repeat steps 5 - 8 until the instance gets a wrong IP address from dhcp-agent.

Observed Result:
1. The instance gets a wrong IP address from dhcp-agent
2. /var/lib/neutron/dhcp/${network_id}/host on the compute node contains a duplicate MAC address with different IP addresses
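
A quick way to confirm the second symptom is to scan the host file for MAC addresses that map to more than one IP. The sketch below is only an illustration: it assumes each line has the comma-separated form "mac,hostname,ip" (MAC first, IP last), and the function name and sample data are made up for this example.

```python
from collections import defaultdict


def find_duplicate_macs(host_file_text):
    """Return {mac: [ip, ...]} for MACs that appear with more than one IP.

    Assumption: each non-empty line looks like "mac,hostname,ip",
    i.e. the MAC is the first field and the IP is the last field.
    """
    ips_by_mac = defaultdict(list)
    for line in host_file_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        if len(fields) < 3:
            continue  # not in the assumed format, skip it
        mac, ip = fields[0].lower(), fields[-1]
        if ip not in ips_by_mac[mac]:
            ips_by_mac[mac].append(ip)
    return {mac: ips for mac, ips in ips_by_mac.items() if len(ips) > 1}


# Hypothetical host file content showing the symptom from this report.
sample = """\
fa:16:3e:aa:bb:cc,host-10-0-0-5.openstacklocal,10.0.0.5
fa:16:3e:aa:bb:cc,host-10-0-0-9.openstacklocal,10.0.0.9
fa:16:3e:11:22:33,host-10-0-0-7.openstacklocal,10.0.0.7
"""
duplicates = find_duplicate_macs(sample)
print(duplicates)
```

On a real node the text would come from reading /var/lib/neutron/dhcp/${network_id}/host; any non-empty result means the file is in the bad state described here.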

When a port is detached, nova-compute sends a request (PUT
/v2.0/ports/${port_id}) to neutron, and we then delete the port (DELETE
/v2.0/ports/${port_id}). These requests arrive at neutron in order, but
neutron sends the delete notification to dhcp-agent first, and
dhcp-agent removes the port record from the
/var/lib/neutron/dhcp/${network_id}/host file. Neutron then sends the
put notification, and dhcp-agent writes the port record back to the
same file even though the port has already been removed from the
neutron db. If we then create a port with the same MAC address again,
the host file contains two records with the same MAC address.
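
The ordering problem described above can be reproduced with a toy model. This is only a sketch, not neutron or dhcp-agent code: the HostFile class, the replay helper and the notification tuples are hypothetical stand-ins, and the host file is modelled as a dict keyed by port id.

```python
class HostFile:
    """Toy stand-in for the dhcp-agent host file, keyed by port id."""

    def __init__(self):
        self.entries = {}  # port_id -> (mac, ip)

    def port_update(self, port_id, mac, ip):
        # An update (the "put" notification) writes or rewrites the record.
        self.entries[port_id] = (mac, ip)

    def port_delete(self, port_id):
        # A delete notification removes the record if it is present.
        self.entries.pop(port_id, None)

    def ips_for_mac(self, mac):
        return sorted(ip for m, ip in self.entries.values() if m == mac)


def replay(notifications):
    """Apply notifications in the order given and return the host file."""
    host_file = HostFile()
    for kind, port_id, mac, ip in notifications:
        if kind == "update":
            host_file.port_update(port_id, mac, ip)
        else:
            host_file.port_delete(port_id)
    return host_file


MAC = "fa:16:3e:aa:bb:cc"
detach = ("update", "port-1", MAC, "10.0.0.5")    # PUT sent on detach
delete = ("delete", "port-1", MAC, "10.0.0.5")    # DELETE of the old port
recreate = ("update", "port-2", MAC, "10.0.0.9")  # new port, same MAC

in_order = replay([detach, delete, recreate])
reordered = replay([delete, detach, recreate])    # order seen in this bug

print(in_order.ips_for_mac(MAC))
print(reordered.ips_for_mac(MAC))
```

With the notifications in API order only the new port's record remains; with the delete delivered before the update, the stale record for the deleted port survives next to the new one, which matches the duplicate MAC entries seen in the host file.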

Some useful logs:
/var/log/neutron/neutron-server.log [http://paste.openstack.org/show/604613/]     --- timezone: CDT
/var/log/neutron/neutron-dhcp-agent.log [http://paste.openstack.org/show/604614/] --- timezone: CEST

Use the request id to trace the logs above. Sorry for the misconfigured
timezones.

Perceived severity:
I consider the severity high. We perform many port operations in our production environment, which serves 100+ VMs. I think I have found the root cause, but I do not know how to solve the problem. I sincerely hope that you can help.

Related issue:
https://bugs.launchpad.net/neutron/+bug/1288493

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1677131

Title:
  race condition between interface detach and port delete

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1677131/+subscriptions

