← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 2019449] Re: Neutron server log file full of tracebacks in neutron-tempest-plugin-ovn job

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/883582
Committed: https://opendev.org/openstack/neutron/commit/ebc0658d5566ce527cb1104968d247db10edf3db
Submitter: "Zuul (22348)"
Branch:    master

commit ebc0658d5566ce527cb1104968d247db10edf3db
Author: Brian Haley <haleyb.dev@xxxxxxxxx>
Date:   Fri May 19 14:45:53 2023 +0000

    Revert "Delete sg rule which remote is the deleted sg"
    
    This reverts commit 63584957203ec9f5ba165177978213c3909f81f0.
    
    Reason for revert: This is generating a lot of
    "SecurityGroupNotFound" errors in neutron-server.log in
    the tempest-integrated-networking job.
    
    Closes-Bug: #2019449
    Related-Bug: #2008712
    Change-Id: I077fe87435f61bd29d5c1efc979c2adebca26181


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2019449

Title:
  Neutron server log file full of tracebacks in neutron-tempest-plugin-
  ovn job

Status in neutron:
  Fix Released

Bug description:
  Since we merged
  https://review.opendev.org/c/openstack/neutron/+/876716, the Neutron
  server log in the neutron-tempest-plugin-ovn job is full of two types
  of tracebacks. The first type is

  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager [None req-f256cbd3-ea20-49e8-a9a0-0299fc7f34f3 tempest-SecGroupRulesQuotaTest-217177324 tempest-SecGroupRulesQuotaTest-217177324-project-member] Error during notification for neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver.OVNMechanismDriver._process_sg_rule_after_del_notification-316503 security_group_rule, after_delete: KeyError: 'rule'
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager Traceback (most recent call last):
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/callbacks/manager.py", line 181, in _notify_loop
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     callback(resource, event, trigger, payload=payload)
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", line 463, in _process_sg_rule_after_del_notification
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     sg_rule = payload.metadata['rule']
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager KeyError: 'rule'
  May 02 11:29:20.790754 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager 

  This happens because:

  1) The patch mentioned above now publishes AFTER_DELETE events for
  security group rules that include the rule deleted:
  https://review.opendev.org/c/openstack/neutron/+/876716/5/neutron/db/securitygroups_db.py#293.
  The same patch added an event handler that expects to find the deleted
  security group rule in the payload:
  https://review.opendev.org/c/openstack/neutron/+/876716/5/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py#459

  2) However, previous to this patch, we were already publishing
  AFTER_DELETE events for security group rules that don't include the
  security group rule:
  https://github.com/openstack/neutron/blob/47d4ec4e99d5aae62656c88206eb6a77f70d4a8b/neutron/db/securitygroups_db.py#L901-L908.
  As a consequence, the new handler added by
  https://review.opendev.org/c/openstack/neutron/+/876716 produces the
  KeyError exception shown above when trying to access the security
  group rule in the payload when processing the preexisting event.

  
  The second type of traceback is:

  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager [None req-4f41bf6e-d3c6-42a8-a987-b062a9a6b375 admin admin] Error during notification for neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver.OVNMechanismDriver._process_sg_rule_after_del_notification-316503 security_group_rule, after_delete: neutron.extensions.securitygroup.SecurityGroupNotFound: Security group ae8107d1-0399-4491-8535-4aee2c9de5d0 does not exist
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager Traceback (most recent call last):
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/callbacks/manager.py", line 181, in _notify_loop
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     callback(resource, event, trigger, payload=payload)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", line 464, in _process_sg_rule_after_del_notification
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     self._ovn_client.delete_security_group_rule(context, sg_rule)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py", line 2413, in delete_security_group_rule
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     self._process_security_group_rule(rule, is_add_acl=False)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py", line 2401, in _process_security_group_rule
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     ovn_acl.update_acls_for_security_group(
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/common/ovn/acl.py", line 294, in update_acls_for_security_group
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     sg = plugin.get_security_group(admin_context, security_group_id)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/db/api.py", line 226, in wrapped
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     return f_with_retry(*args, **kwargs,
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/db/api.py", line 142, in wrapped
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     setattr(e, '_RETRY_EXCEEDED', True)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 227, in __exit__
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     self.force_reraise()
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     raise self.value
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/db/api.py", line 138, in wrapped
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     return f(*args, **kwargs)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_db/api.py", line 154, in wrapper
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     ectxt.value = e.inner_exc
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 227, in __exit__
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     self.force_reraise()
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     raise self.value
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_db/api.py", line 142, in wrapper
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     return f(*args, **kwargs)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/db/api.py", line 190, in wrapped
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     context_reference.session.rollback()
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 227, in __exit__
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     self.force_reraise()
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     raise self.value
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/usr/local/lib/python3.8/dist-packages/neutron_lib/db/api.py", line 184, in wrapped
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     return f(*dup_args, **dup_kwargs)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/db/securitygroups_db.py", line 209, in get_security_group
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     ret = self._make_security_group_dict(self._get_security_group(
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager   File "/opt/stack/neutron/neutron/db/securitygroups_db.py", line 226, in _get_security_group
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager     raise ext_sg.SecurityGroupNotFound(id=id)
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager neutron.extensions.securitygroup.SecurityGroupNotFound: Security group ae8107d1-0399-4491-8535-4aee2c9de5d0 does not exist
  May 02 11:29:23.889104 np0033912622 neutron-server[100796]: ERROR neutron_lib.callbacks.manager

  This happens because the default security groups that we create for
  every project include two rules that point to the security group they
  belong to as the remote security group. As a consequence, when the
  event handler added by
  https://review.opendev.org/c/openstack/neutron/+/876716 tries to
  delete these type of rules, it doesn't find their security group,
  because it was deleted right before the event being handled was
  published:
  https://review.opendev.org/c/openstack/neutron/+/876716/5/neutron/db/securitygroups_db.py#281.

  For now, I have pushed a test patch that fixes both tracebacks:
  https://review.opendev.org/c/openstack/neutron/+/883047. I will report
  the result of zuul in a follow up comment.

  However, in revisiting
  https://review.opendev.org/c/openstack/neutron/+/876716 and the
  associated bug https://bugs.launchpad.net/neutron/+bug/2008712, a few
  questions came to my mind:

  1) In https://bugs.launchpad.net/neutron/+bug/2008712, it is reported
  that the bug affected the OVS mechanism driver and not OVN's. Why did
  we end up fixing OVN's? In fact, PS1 attempted to fix the bug for OVS
  here
  https://review.opendev.org/c/openstack/neutron/+/876716/1/neutron/api/rpc/handlers/securitygroups_rpc.py.
  Why was this approach abandoned?

  2) It seems to me that in
  https://review.opendev.org/c/openstack/neutron/+/876716 we violate the
  semantics of the AFTER_DELETE event for security group rules. We
  delete a security group and yet we publish a security group rule
  deletion event. Even more, the event carries in its payload the
  deleted rule, when this type of event normally doesn't carry the
  deleted resource, because it was already deleted. In fact,
  https://review.opendev.org/c/openstack/neutron/+/876716/1/neutron/api/rpc/handlers/securitygroups_rpc.py
  tried to fix the bug processing a security group AFTER_DELETE event,
  which seems logically consistent to me. Again, why did we abandon this
  approach?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2019449/+subscriptions



References