yahoo-eng-team team mailing list archive

[Bug 1302272] [NEW] neutron iptables manager is slow modifying a large amount of rules

 

Public bug reported:

Sudhakar Gariganti noticed that with a very large number of iptables
rules, _modify_rules() was taking so long to complete (140 seconds)
that VMs couldn't be reliably booted, because the rules weren't put in
place before the initial DHCP requests timed out.  With a small change
the update can be done much more quickly, which also allows each node
to support a larger set of iptables rules.

I've included a snippet from the related bug for reference,
https://bugs.launchpad.net/neutron/+bug/1253993

"We have done significant testing with this patch and want to share few
results from our experiments.

We were basically trying to see how many VMs we can scale to with the OVS agent in use. With default security groups (which have a remote security group), beyond 250-300 VMs, VMs were not able to get DHCP IPs. We had 16 CNs, with VMs uniformly distributed across them. The VM image had a wait period of 120 secs to receive the DHCP response.
By the time we had around 18-19 VMs on each CN (there were around 6K iptables rules), each RPC loop was taking close to 140 seconds (if there was any update). The reason VMs were not getting IPs was that the iptables rules required for the VM to send out the DHCP request were not in place before the 120 sec wait period expired. Upon further investigation we discovered that the 'for loop searching iptable rules' in the _modify_rules method of iptables_manager.py was eating a big chunk of the overall time spent.

After this patch, close to 680 VMs were able to get IPs. The number of
iptables rules at this point was close to 20K, with around 40 VMs per
CN.

To summarize, we were able to increase the processing capability of a
compute node from 6K iptables rules to 20K iptables rules, which helped
more VMs get a DHCP IP within the 120 sec wait period. You can imagine
the situation when the wait time is less than 120 secs."
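
The quoted comment identifies the loop that searches the rule list in
_modify_rules() as the hot spot, but this report does not show the
change itself. As a rough illustration only, and not the actual Neutron
patch, the sketch below compares a lookup that rescans a list for every
rule with one that builds a set once. The function names and the rule
strings are hypothetical and chosen just for this example.

```python
import time


def make_rules(count):
    """Build `count` distinct, synthetic iptables-style rule strings."""
    return [
        "-A sg-chain -s 10.%d.%d.%d/32 -j RETURN"
        % (i // 65536, (i // 256) % 256, i % 256)
        for i in range(count)
    ]


def stale_rules_linear(current_rules, wanted_rules):
    """Return current rules that are no longer wanted.

    `rule not in wanted_rules` scans the whole list for every rule,
    so the total work grows roughly with len(current) * len(wanted).
    """
    return [rule for rule in current_rules if rule not in wanted_rules]


def stale_rules_indexed(current_rules, wanted_rules):
    """Same result, but the wanted rules are indexed in a set first.

    Each membership test is then constant time on average, so the
    total work grows roughly linearly with the number of rules.
    """
    wanted = set(wanted_rules)
    return [rule for rule in current_rules if rule not in wanted]


if __name__ == "__main__":
    # Rule counts are illustrative; 6000 mirrors the 6K figure above.
    for count in (2000, 6000):
        current = make_rules(count)
        # Rebuild the list so the strings are equal but distinct objects,
        # then drop the first 10% so some current rules become stale.
        wanted = make_rules(count)[count // 10:]
        for func in (stale_rules_linear, stale_rules_indexed):
            start = time.perf_counter()
            stale = func(current, wanted)
            elapsed = time.perf_counter() - start
            print("%-20s %5d rules: %d stale, %.3f s"
                  % (func.__name__, count, len(stale), elapsed))
```

Running the script prints the time each variant takes. The list-based
version should slow down much faster as the rule count grows, which is
consistent with the kind of scaling problem described above, although
the real code path in the agent does more than this comparison.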

** Affects: neutron
     Importance: Undecided
     Assignee: Brian Haley (brian-haley)
         Status: In Progress

** Changed in: neutron
     Assignee: (unassigned) => Brian Haley (brian-haley)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1302272

Title:
  neutron iptables manager is slow modifying a large amount of rules

Status in OpenStack Neutron (virtual network service):
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1302272/+subscriptions

