yahoo-eng-team team mailing list archive

[Bug 1644535] [NEW] iptables: fail to start ovs/linuxbridge agents on missing sysctl knobs

 

Public bug reported:

https://review.openstack.org/398817
Dear bug triager. This bug was created since a commit was marked with DOCIMPACT.
Your project "openstack/neutron" is set up so that we directly report the documentation bugs against it. If this needs changing, the docimpact-group option needs to be added for the project. You can ask the OpenStack infra team (#openstack-infra on freenode) for help if you need to.

commit 4371a4f5cdc6559955af9158c4c28851e77914da
Author: Ihar Hrachyshka <ihrachys@xxxxxxxxxx>
Date:   Thu Sep 15 21:48:10 2016 +0000

    iptables: fail to start ovs/linuxbridge agents on missing sysctl knobs
    
    For new kernels (3.18+), bridge module is split into two pieces: bridge
    and br_netfilter. The latter provides firewall support for bridged
    traffic, as well as the following sysctl knobs:
    
    * net.bridge.bridge-nf-call-arptables
    * net.bridge.bridge-nf-call-ip6tables
    * net.bridge.bridge-nf-call-iptables
    
    Before kernel 3.18, any brctl command was loading the 'bridge' module
    with the knobs, so at the moment where we reached iptables setup, they
    were always available.
    
    With new 3.18+ kernels, brctl still loads 'bridge' module, but not
    br_netfilter. So bridge existence no longer guarantees the knobs'
    presence. If we reach _enable_netfilter_for_bridges before the new
    module is loaded, then the code will fail, triggering agent resync. It
    will also fail to enable bridge firewalling on systems where it's
    disabled by default (examples of those systems are most if not all Red
    Hat/Fedora based systems), making security groups completely
    ineffective.
    
    Systems that don't override default settings for those knobs would work
    fine except for this exception in the log file and agent resync. This is
    because the first attempt to add an iptables rule using 'physdev' module
    (-m physdev) will trigger the kernel module loading. In theory, we could
    silently swallow missing knobs, and still operate correctly. But on
    second thought, it's quite fragile to rely on that implicit module
    loading. In the case where we can't detect whether firewall is enabled,
    it's better to fail than hope for the best.
    
    An alternative to the proposed path could be trying
    to fix broken deployment, meaning we would need to load the missing
    kernel module on agent startup. It's not even clear whether we can
    assume the operation would be available to us. Even with that, adding a
    rootwrap filter to allow loading code in the kernel sounds quite scary.
    If we followed that path, we would also hit the issue of
    distinguishing between a built-in kernel module and a modular one,
    a complexity that is probably beyond what Neutron should fix.
    
    The patch introduces a sanity check that would fail on missing
    configuration knobs.
    
    DocImpact: document the new deployment requirement in operations guide
    UpgradeImpact: deployers relying on agents fixing wrong sysctl defaults
                   will need to make sure bridge firewalling is enabled.
                   Also, the kernel module providing sysctl knobs must be
                   loaded before starting the agent, otherwise it will fail
                   to start.
    
    Changes made to this backport:
       neutron/agent/linux/iptables_firewall.py
           - removed deprecation warning when setting sysctl values to 1 as
             they are planned to be removed in Ocata
       neutron/cmd/sanity/checks.py
           - Re-implemented the flow to check only for presence of sysctl
             options instead of checking the values. Kernel options are set
             in runtime thus the values don't matter.
    
    Depends-On: Id6bfd9595f0772a63d1096ef83ebbb6cd630fafd
    Change-Id: I9137ea017624ac92a05f73863b77f9ee4681bbe7
    Related-Bug: #1622914
    (cherry picked from commit e83a44b96a8e3cd81b7cc684ac90486b283a3507)
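
The presence-only check described in the backport notes above can be
sketched roughly as follows. This is an illustration, not the code from
neutron/cmd/sanity/checks.py: the helper names knob_path and
missing_bridge_knobs are hypothetical, and the sketch assumes the knobs
are exposed as files under /proc/sys, with the dots in each sysctl name
mapped to path separators.

```python
import os

# Knob names are taken from the commit message above.
BRIDGE_NF_KNOBS = (
    "net.bridge.bridge-nf-call-arptables",
    "net.bridge.bridge-nf-call-ip6tables",
    "net.bridge.bridge-nf-call-iptables",
)


def knob_path(knob, proc_root="/proc/sys"):
    """Map a dotted sysctl name to its file under proc_root.

    Hypothetical helper; assumes the usual /proc/sys layout.
    """
    return os.path.join(proc_root, *knob.split("."))


def missing_bridge_knobs(proc_root="/proc/sys"):
    """Return the knobs whose files are absent under proc_root.

    Only presence is checked, not the values, mirroring the backport
    note that values are set at runtime. Hypothetical helper.
    """
    return [
        knob for knob in BRIDGE_NF_KNOBS
        if not os.path.exists(knob_path(knob, proc_root))
    ]
```

A deployment script could call missing_bridge_knobs() before starting the
agent and refuse to continue if the returned list is not empty, which
usually means the kernel module providing the knobs has not been loaded.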

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: doc neutron

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1644535

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1644535/+subscriptions