
yahoo-eng-team team mailing list archive

[Bug 1745468] [NEW] Conntrack entry removal can take a long time on large deployments

 

Public bug reported:

On a large deployment of about 1000 instances, instance deletion
(neutron port deletion) or security group rule changes can take a very
long time.  We have seen it take hours in some cases.

While changing to netlink-lib for the IP Conntrack manager will help
(https://review.openstack.org/#/c/470912/), it could still lead to long
delays at higher instance counts.  Also, that change might not be easily
back-portable to older releases.  Doing the conntrack entry deletion in
a thread, which has been proposed before, could help alleviate this a
bit by letting the caller (the OVS agent) get back to other work more
quickly.

Also, while the netlink-lib change above is better at only issuing calls
for entries it finds, the current code does not do that: it can call
'conntrack -D' with arguments that match no entries.  If we first
checked the table for the given IPs, it might reduce the time cleanup
takes.
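
A rough sketch of the "check the table first" idea follows.  It is an
illustration under assumptions, not the current neutron code: the
function names are hypothetical, it assumes the 'conntrack' CLI is
installed and the caller has sufficient privileges, and it matches
addresses by scanning the src=/dst= fields of 'conntrack -L' output.

```python
import re
import subprocess

# Matches the src=<addr> and dst=<addr> fields of 'conntrack -L' output.
_ADDR_RE = re.compile(r"\b(?:src|dst)=(\S+)")


def ips_in_conntrack_output(output):
    """Return the set of addresses seen in 'conntrack -L' style output."""
    return set(_ADDR_RE.findall(output))


def ips_needing_cleanup(candidate_ips, output):
    # Keep only the candidates that actually appear in the table.
    present = ips_in_conntrack_output(output)
    return [ip for ip in candidate_ips if ip in present]


def cleanup(candidate_ips):
    # One listing up front instead of one delete attempt per candidate.
    listing = subprocess.run(
        ["conntrack", "-L"], capture_output=True, text=True, check=True
    ).stdout
    for ip in ips_needing_cleanup(candidate_ips, listing):
        for flag in ("-s", "-d"):
            # check=False because 'conntrack -D' may exit non-zero when
            # nothing matched for this direction.
            subprocess.run(
                ["conntrack", "-D", flag, ip],
                capture_output=True,
                check=False,
            )
```

With this shape, an IP that has no entries costs a set lookup rather
than a 'conntrack -D' invocation.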

** Affects: neutron
     Importance: High
     Assignee: Brian Haley (brian-haley)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1745468
