yahoo-eng-team team mailing list archive
Message #95981
[Bug 2112648] [NEW] [OVS Firewall] Remote-group member flows not restored after port rebind with same IP/MAC during migration to OVS
Public bug reported:
While migrating from ML2/linuxbridge to ML2/openvswitch using the native
OVS firewall driver (OpenStack Caracal 2024.1), I observed that
remote-group-based security group rules stop working after the source
(sender) virtual machine is redeployed while reusing its existing port(s),
i.e. the port retains its IP and MAC address.
The remote-group match flows are not restored on the target virtual
machines.
This appears to happen because the security_group_member_updated() RPC
notification is not triggered during port recreation when:
* The port keeps the same IP/MAC.
* The associated security group memberships remain unchanged.
* The port status appears as DOWN (from the plugin’s view), while the binding status is ACTIVE.
As a result, the OVS agent does not update the remote-group address
buckets, the receiving VM does not permit traffic from the recreated
sender, and connectivity breaks for security group rules that use
remote-group-id.
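To make the failure mode concrete, here is a small, self-contained toy model
(illustrative only; ToyAgent and its fields are made up and are not neutron's
actual classes or API). When the sender's port goes away its address leaves
the remote-group bucket, and without a member-updated notification on
recreation the address is never re-added:
```
# Illustrative toy model only; ToyAgent and its fields are hypothetical
# and do not mirror neutron's real implementation.
class ToyAgent:
    def __init__(self):
        # remote security group id -> member IPs currently allowed on peers
        self.sg_members = {"sg-app": {"10.0.0.5"}}  # VM A's IP is present

    def port_removed(self, sg_id, ip):
        # VM A is deleted: its address is dropped from the remote-group bucket.
        self.sg_members[sg_id].discard(ip)

    def security_groups_member_updated(self, sg_id, member_ips):
        # Runs only if the plugin emits a member-updated notification.
        self.sg_members[sg_id] = set(member_ips)

    def allows(self, sg_id, src_ip):
        # A remote-group rule only matches sources in the current member set.
        return src_ip in self.sg_members.get(sg_id, set())


agent = ToyAgent()
agent.port_removed("sg-app", "10.0.0.5")      # VM A deleted
# VM A is recreated on the same port (same IP/MAC, same SGs); the plugin
# sees no membership change, so no notification is sent and
# security_groups_member_updated() is never called here.
print(agent.allows("sg-app", "10.0.0.5"))     # False -> VM B drops VM A's traffic
```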
Steps to Reproduce:
* Configure security group rules with remote-group-id references.
* Deploy (source) VM A and (target) VM B, e.g. on the same host, with SG rules allowing mutual access via remote-group references.
* Delete and recreate VM A, reusing the existing port(s).
* Observe that VM B does not receive traffic from VM A.
* Dump flows with ovs-ofctl on br-int; the source IP of VM A is missing from the conjunction-based rules on VM B's side (see the check sketched below).
Restarting neutron-openvswitch-agent restores expected flows.
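A quick way to script the flow check from the reproduction steps; this is a
rough sketch that assumes the standard ovs-ofctl CLI is available on the
compute host, and the bridge name and IP are placeholders to adapt:
```
import subprocess


def ip_in_member_flows(bridge: str, src_ip: str) -> bool:
    """Return True if src_ip shows up in a conjunction() flow on the bridge.

    With the native OVS firewall, remote-group member addresses typically
    appear in conjunction-based flows, so a simple text scan of the flow
    dump is enough to spot a missing member entry.
    """
    out = subprocess.run(
        ["ovs-ofctl", "-O", "OpenFlow13", "dump-flows", bridge],
        capture_output=True, text=True, check=True,
    ).stdout
    return any(src_ip in line and "conjunction" in line
               for line in out.splitlines())


# Placeholder values: VM A's fixed IP, checked on br-int.
print(ip_in_member_flows("br-int", "10.0.0.5"))
```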
Proposed Fix/Workaround:
```
diff --git a/neutron/agent/securitygroups_rpc.py b/neutron/agent/securitygroups_rpc.py
index b1ae020446..980a5bc575 100644
--- a/neutron/agent/securitygroups_rpc.py
+++ b/neutron/agent/securitygroups_rpc.py
@@ -293,6 +293,18 @@ class SecurityGroupAgentRpc(object):
             LOG.debug("Preparing device filters for %d new devices",
                       len(new_devices))
             self.prepare_devices_filter(new_devices)
+
+        rebound_sg_ids = set()
+        for dev_id in new_devices:
+            port = self.firewall.ports.get(dev_id)
+            LOG.debug("Dump port = %s", port)
+            if port and port.get('status') == 'DOWN' and any(b.get('status') == 'ACTIVE' for b in port.get('bindings', [])):
+                rebound_sg_ids.update(port.get('security_groups', []))
+        if rebound_sg_ids:
+            LOG.debug("Refreshing SGs: %s", rebound_sg_ids)
+            # triggers firewall.update_security_group_members()
+            self.security_groups_member_updated(rebound_sg_ids)
+
         if updated_devices:
             self.firewall.security_group_updated('sg_member', [],
                                                  updated_devices)
```
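For review purposes, the rebind-detection condition the patch keys on can be
exercised in isolation. The sketch below uses made-up port dictionaries whose
field names mirror the ones referenced above, not necessarily the exact
structure the firewall driver caches:
```
def rebound_security_groups(ports):
    """Collect SG ids of ports that look rebound: status DOWN, binding ACTIVE."""
    sg_ids = set()
    for port in ports:
        is_down = port.get('status') == 'DOWN'
        has_active_binding = any(b.get('status') == 'ACTIVE'
                                 for b in port.get('bindings', []))
        if is_down and has_active_binding:
            sg_ids.update(port.get('security_groups', []))
    return sg_ids


# Hypothetical data resembling VM A's recreated port.
ports = [{
    'status': 'DOWN',
    'bindings': [{'status': 'ACTIVE'}],
    'security_groups': ['sg-app'],
}]
print(rebound_security_groups(ports))  # {'sg-app'}
```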
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2112648
Title:
[OVS Firewall] Remote-group member flows not restored after port
rebind with same IP/MAC during migration to OVS
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2112648/+subscriptions