yahoo-eng-team team mailing list archive

[Bug 1688573] [NEW] neutron-fwaas iptables driver performs awful

 

Public bug reported:

Details
=======
From Liberty onwards the neutron iptables_manager has used difflib.ndiff to check for changes to iptables rules.  This function does far more work than is needed and its run time grows at least quadratically with the number of rules, so once there are enough rules the l3-agent eats 100% CPU and suffers RPC timeouts, effectively taking the agent down.

There are a couple of elements in play here.  First, the generation of
iptables rules in neutron-fwaas is wrong in a way that guarantees every
line diffed will be different.  My first patch addresses this by
ensuring that fields are correctly ordered, sub-fields within lines are
correctly ordered, and source and destination prefixes are normalized.
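
To illustrate the kind of normalization meant here, a minimal sketch
follows.  It is not the actual patch: FIELD_ORDER, normalize_prefix and
format_rule are hypothetical names, and the option order shown is an
assumption rather than what the driver really emits.  The idea is only
that a generated rule should be rendered the same way every time, so
that an unchanged rule compares equal.

```python
import ipaddress

# Hypothetical canonical option order; the real driver may use another.
FIELD_ORDER = ["-p", "-s", "-d", "--sport", "--dport", "-j"]


def normalize_prefix(prefix):
    """Return the prefix in canonical network form.

    For example 10.0.0.1/24 becomes 10.0.0.0/24, and a bare host
    address gains an explicit /32.
    """
    return str(ipaddress.ip_network(prefix, strict=False))


def format_rule(chain, options):
    """Render a rule with options in a fixed order and normalized prefixes."""
    parts = ["-A", chain]
    for flag in FIELD_ORDER:
        if flag not in options:
            continue
        value = options[flag]
        if flag in ("-s", "-d"):
            value = normalize_prefix(value)
        parts += [flag, str(value)]
    return " ".join(parts)
```

With this, two dicts that describe the same rule but list their options
in a different order, or spell the source prefix differently, render to
the same string.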

As an experiment I increased the number of firewall rules from 1 to 1024
(in a 2^N progression) to measure the damage in devstack.  The time
complexity was roughly 0.0045 x N^2.  At 512 rules (~120s) we started
getting timeouts, and the l3 agent was left spinning constantly, broken
and unable to apply the configuration.
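
The measurement above was taken against a running devstack.  For anyone
wanting a rough feel for the scaling without one, a standalone sketch
along these lines may help.  The rule text is synthetic and make_rules
and time_ndiff are hypothetical helpers, so absolute timings will not
match the figures quoted here.

```python
import difflib
import time


def make_rules(n, tag):
    """Build n synthetic rule lines; `tag` makes two sets differ on every line."""
    return [
        "-A chain-%s -s 10.%d.%d.0/24 -j ACCEPT" % (tag, i // 256, i % 256)
        for i in range(n)
    ]


def time_ndiff(n):
    """Time difflib.ndiff over two n-line rule sets with no line in common."""
    old = make_rules(n, "old")
    new = make_rules(n, "new")
    start = time.perf_counter()
    # ndiff returns a generator; consume it so the diff is actually computed.
    delta = list(difflib.ndiff(old, new))
    return time.perf_counter() - start, len(delta)
```

Calling time_ndiff for n = 1, 2, 4, ... and printing the elapsed time
shows how quickly the cost grows; start small, since large values can
take a long time.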

Applying the fixes to the formatting reduced this to 0.004521s for 512
entries.  Looking at the code for ndiff, I suspect this will equate to
roughly 65536 firewall rules before the system keels over again.

The second issue is with ndiff itself.  If we end up in this situation
again, where all lines are different, it correctly discovers this in
O(N^2) time; however, it also tries to diff pairs of lines that look
alike.  This is superfluous to the algorithm in neutron and causes a
huge performance penalty.  Given we've been throwing away the whole
ruleset and reinstalling it each time for 2 years, you may as well
replace it with an O(N) list compare ;D  The other option would be to
parse iptables output into an internal representation and compare
those, so the comparison is not at the whimsical mercy of a 3rd party.
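
As a sketch of what an O(N) list compare could look like (this is an
illustration, not code from neutron; rules_changed and rule_delta are
hypothetical names):

```python
def rules_changed(current_rules, new_rules):
    """Return True if the two rule lists differ in content or order.

    List equality walks both lists once, so this is a single O(N) pass.
    """
    return list(current_rules) != list(new_rules)


def rule_delta(current_rules, new_rules):
    """Return (removed, added) rules using set difference.

    This ignores ordering and duplicates, so it is only suitable for
    reporting what changed, not for deciding whether order changed.
    """
    current, new = set(current_rules), set(new_rules)
    return sorted(current - new), sorted(new - current)
```

rules_changed is enough to decide whether the ruleset needs to be
reinstalled; rule_delta is optional and only there for logging.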

Version
=======
Liberty -> Present 

Severity
========
High - we have customers with over 1500 rules, and having them able to DoS our L3 network service is not great

** Affects: neutron
     Importance: Undecided
     Assignee: Simon Murray (simon-murray-q)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1688573
