yahoo-eng-team team mailing list archive

[Bug 2112648] [NEW] [OVS Firewall] Remote-group member flows not restored after port rebind with same IP/MAC during migration to OVS

 

Public bug reported:

While migrating from ML2/linuxbridge to ML2/openvswitch using the native
OVS firewall driver (OpenStack Caracal 2024.1), I observed that
remote-group-based security group rules stop functioning after a Virtual
Machine in the source security group is redeployed while its existing
port(s) are kept and reused (the port retains its IP and MAC address).

Remote-group match flows are not restored on the target Virtual Machines.

It looks to me like this occurs because the security_groups_member_updated()
RPC is not triggered during port recreation if:

* The port keeps the same IP/MAC.
* The associated security group memberships remain unchanged.
* The port status appears as DOWN (from the plugin’s view), while the binding status is ACTIVE.

As a result, the OVS agent does not update the remote-group address
buckets, and the receiving VM does not permit traffic from the recreated
sender, breaking connectivity for security group rules that use
remote-group-id.

Steps to Reproduce:
* Configure security group rules with remote-group-id references.
* Deploy (source) VM A and (target) VM B, for example on the same host, with SG rules allowing mutual access using remote-group references.
* Delete and recreate VM A, but persist the existing port(s).
* Observe that VM B does not receive traffic from VM A.
* Dump flows with ovs-ofctl on br-int: the source IP of VM A is missing from the conjunction-based rules on VM B’s side.

Restarting neutron-openvswitch-agent restores the expected flows.
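
The flow check from the last reproduction step can be scripted. The following is a minimal sketch, assuming the remote-group flows match the sender's IPv4 address with an `nw_src=` field and carry a `conjunction(...)` action, as the report describes. The helper names `conjunction_flows_for_ip` and `dump_br_int_flows` are hypothetical and not part of Neutron.

```python
import re
import subprocess


def conjunction_flows_for_ip(flow_dump, ip):
    """Return flow lines that match `ip` as IPv4 source and use conjunction()."""
    # Accept nw_src=<ip> optionally followed by a mask, but reject longer
    # addresses (e.g. 10.0.0.50 when searching for 10.0.0.5).
    src_match = re.compile(r"nw_src=" + re.escape(ip) + r"(?![\d.])")
    return [
        line.strip()
        for line in flow_dump.splitlines()
        if "conjunction(" in line and src_match.search(line)
    ]


def dump_br_int_flows(bridge="br-int"):
    """Run `ovs-ofctl dump-flows <bridge>` and return its text output."""
    result = subprocess.run(
        ["ovs-ofctl", "dump-flows", bridge],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

On the host running VM B, an empty result from `conjunction_flows_for_ip(dump_br_int_flows(), "<VM A address>")` would correspond to the missing flows described above.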

Proposed Fix/Workaround:

```
diff --git a/neutron/agent/securitygroups_rpc.py b/neutron/agent/securitygroups_rpc.py
index b1ae020446..980a5bc575 100644
--- a/neutron/agent/securitygroups_rpc.py
+++ b/neutron/agent/securitygroups_rpc.py
@@ -293,6 +293,18 @@ class SecurityGroupAgentRpc(object):
             LOG.debug("Preparing device filters for %d new devices",
                       len(new_devices))
             self.prepare_devices_filter(new_devices)
+
+            rebound_sg_ids = set()
+            for dev_id in new_devices:
+                port = self.firewall.ports.get(dev_id)
+                LOG.debug("Dump port = %s", port)
+                if port and port.get('status') == 'DOWN' and any(b.get('status') == 'ACTIVE' for b in port.get('bindings', [])):
+                    rebound_sg_ids.update(port.get('security_groups', []))
+            if rebound_sg_ids:
+                LOG.debug("Refreshing SGs: %s", rebound_sg_ids)
+                # triggers firewall.update_security_group_members()
+                self.security_groups_member_updated(rebound_sg_ids)
+
         if updated_devices:
             self.firewall.security_group_updated('sg_member', [],
                                                  updated_devices)

```
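
The selection logic in the patch can be exercised on its own. Below is a minimal sketch, assuming ports are plain dicts carrying `status`, `bindings`, and `security_groups` keys in the way the diff reads them. Whether the agent-side port objects actually expose `bindings` is an assumption taken from the patch, and `collect_rebound_sg_ids` is a hypothetical helper name.

```python
def collect_rebound_sg_ids(ports):
    """Collect security group IDs of ports that look rebound.

    A port counts as rebound when its own status is DOWN while at least one
    of its bindings is ACTIVE, mirroring the condition in the patch above.
    """
    rebound_sg_ids = set()
    for port in ports:
        if not port:
            continue  # device not known to the firewall driver
        has_active_binding = any(
            binding.get("status") == "ACTIVE"
            for binding in port.get("bindings", [])
        )
        if port.get("status") == "DOWN" and has_active_binding:
            rebound_sg_ids.update(port.get("security_groups", []))
    return rebound_sg_ids


# Illustrative data only: one rebound port, one healthy port, one unknown device.
ports = [
    {"status": "DOWN", "bindings": [{"status": "ACTIVE"}],
     "security_groups": ["sg-a"]},
    {"status": "ACTIVE", "bindings": [{"status": "ACTIVE"}],
     "security_groups": ["sg-b"]},
    None,
]
print(collect_rebound_sg_ids(ports))  # prints {'sg-a'}
```

Only the first port contributes its security group, so a member refresh would be limited to groups that contain a rebound port.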

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2112648
