yahoo-eng-team team mailing list archive

[Bug 1745468] Re: Conntrack entry removal can take a long time on large deployments

Reviewed:  https://review.openstack.org/537654
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=65a81623fc0377b26d2d5800607f7c3acc08c45a
Submitter: Zuul
Branch:    master

commit 65a81623fc0377b26d2d5800607f7c3acc08c45a
Author: Brian Haley <bhaley@xxxxxxxxxx>
Date:   Wed Jan 24 15:55:56 2018 -0500

    Process conntrack updates in worker threads
    
    With a large number of instances and/or security group rules,
    conntrack updates when ports are removed or rules are changed
    can take a long time to process.  By enqueuing these to a set
    of worker threads, the agent can continue with other work while
    they are processed in the background.
    
    This is a change in behavior in the agent since it could
    program a new set of security group rules before all existing
    conntrack entries are deleted, but since the iptables or OVSfw
    NAT rules will have been removed, it should not pose a
    security issue.
    
    Change-Id: Ibf858c7fdf7a822a30e4a0c4722d70fd272741b6
    Closes-bug: #1745468


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1745468

Title:
  Conntrack entry removal can take a long time on large deployments

Status in neutron:
  Fix Released

Bug description:
  On a large deployment of about 1000 instances, instance deletion
  (neutron port deletion) or security group rule changes can take a
  very long time.  We have seen it take hours in some cases.

  While changing to netlink-lib for the IP Conntrack manager
  (https://review.openstack.org/#/c/470912/) will help, it could still
  lead to long delays at higher instance counts.  Also, that change
  might not be easily back-portable to older releases.  Doing the
  conntrack entry deletion in a thread, which has been proposed before,
  could help alleviate this by letting the caller (the OVS agent) get
  back to other work sooner.
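
As a rough illustration of the threaded approach, here is a minimal sketch using only the Python standard library. It is not the code from the neutron change: the function name start_conntrack_workers, the delete_entry callable, and the default of four workers are all hypothetical, chosen only to show the caller handing work to a queue and returning immediately.

```python
import logging
import queue
import threading

LOG = logging.getLogger(__name__)


def start_conntrack_workers(delete_entry, num_workers=4):
    """Start worker threads that drain a queue of pending deletions.

    delete_entry is a hypothetical callable standing in for whatever
    actually removes one conntrack entry; it is an assumption of this
    sketch, not a neutron API.
    """
    work = queue.Queue()

    def worker():
        while True:
            item = work.get()
            try:
                if item is None:  # sentinel: stop this worker
                    return
                delete_entry(item)
            except Exception:
                # Log and keep going so one failure does not kill the worker.
                LOG.exception("conntrack deletion failed for %r", item)
            finally:
                # Runs for every item, including the sentinel.
                work.task_done()

    threads = [
        threading.Thread(target=worker, daemon=True)
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    return work, threads
```

The caller only pays for work.put(item) and can move on; work.join() is available when it does need to wait for the backlog to drain, and putting one None per worker shuts the pool down.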

  Also, while the netlink-lib change above only issues calls for
  entries it finds, the current code does not: it can call
  'conntrack -D' with arguments that match no entries.  If we first
  checked the table for the given IPs, it might reduce the time
  cleanup takes.
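
The "check the table first" idea could look something like the sketch below. It assumes the conntrack table has already been dumped to text (for example from a listing command) and that each line carries src= and dst= fields; the function name ips_with_entries and the sample line format are assumptions for illustration, not taken from neutron.

```python
import re

# Matches the address in "src=..." and "dst=..." fields of a listing line.
_ADDR_RE = re.compile(r"\b(?:src|dst)=(\S+)")


def ips_with_entries(conntrack_listing, candidate_ips):
    """Return the candidate IPs that appear in the dumped table.

    conntrack_listing is the table dump as one string; only IPs that
    occur as a whole src= or dst= value are kept, in input order.
    """
    present = set()
    for line in conntrack_listing.splitlines():
        present.update(_ADDR_RE.findall(line))
    return [ip for ip in candidate_ips if ip in present]
```

Only the IPs returned here would then need a delete call, so addresses with no tracked connections cost one set lookup instead of one external command.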

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1745468/+subscriptions
