yahoo-eng-team team mailing list archive

[Bug 1944201] Re: neutron-openvswitch-agent crashes on start with firewall config of br-int

 

Reopening. This still happens, even with os-ken 2.2.0. I also noticed
that it can happen not only during agent initialization but also later.
See, e.g.:

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2b1/812658/2/check/neutron-ovs-tempest-multinode-full/2b1d77b/controller/logs/screen-q-agt.txt

or

https://e4aad2fd4ad948b107a3-f332441e20465e1f05e3f334ceb928b5.ssl.cf2.rackcdn.com/812658/2/check/neutron-ovs-tempest-slow/2806209/compute1/logs/screen-q-agt.txt
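
To check whether a given screen-q-agt.txt contains the same failure
outside of agent startup, the log can be scanned for the markers that
appear in the excerpt in the bug description. The following is only a
minimal sketch: the function name and the regular expressions are
illustrative assumptions, not something neutron or os-ken provides.

```python
import re

# Markers copied from the log excerpt in the bug description. The
# patterns themselves are assumptions made for this sketch.
UNKNOWN_DPID = re.compile(r"unknown dpid (\d+)")
INVALID_DATAPATH = re.compile(r"Datapath Invalid (\d+)")
TERMINATED = "agent terminated!"


def scan_agent_log(lines):
    """Return (line_number, kind, dpid) tuples for lines hitting a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = UNKNOWN_DPID.search(line)
        if match:
            hits.append((number, "unknown-dpid", int(match.group(1))))
            continue
        match = INVALID_DATAPATH.search(line)
        if match:
            # The same error text appears both in the ofswitch error and
            # in the final "agent terminated!" line; tell them apart.
            kind = "terminated" if TERMINATED in line else "invalid-datapath"
            hits.append((number, kind, int(match.group(1))))
    return hits
```

Passing an open file object for a downloaded log as `lines` works,
since iterating a text file yields its lines. Comparing the line
numbers of the hits with the agent's startup messages shows whether
the error occurred during initialization or later.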

** Changed in: neutron
       Status: Fix Released => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1944201

Title:
  neutron-openvswitch-agent crashes on start with firewall config of br-int

Status in neutron:
  Confirmed

Bug description:
  In upstream CI, Ironic jobs have been failing because the networking
  is never stood up by neutron. Investigation led us to find the
  neutron-openvswitch-agent in a failed state, having exited due to a
  RuntimeError just a few seconds after the service was started.

  neutron-openvswitch-agent[78787]: DEBUG neutron.agent.securitygroups_rpc [None req-b18a79b7-7258-44f0-9a69-fa92a490bc26 None None] Init firewall settings (driver=openvswitch) {{(pid=78787) init_firewall /opt/stack/neutron/neutron/agent/securitygroups_rpc.py:118}}
  neutron-openvswitch-agent[78787]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): DbAddCommand(table=Bridge, record=br-int, column=protocols, values=('OpenFlow10', 'OpenFlow11', 'OpenFlow12', 'OpenFlow13', 'OpenFlow14')) {{(pid=78787) do_commit /usr/local/lib/python3.8/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90}}
  neutron-openvswitch-agent[78787]: ERROR OfctlService [-] unknown dpid 90695823979334
  neutron-openvswitch-agent[78787]: ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [None req-b18a79b7-7258-44f0-9a69-fa92a490bc26 None None] ofctl request version=None,msg_type=None,msg_len=None,xid=None,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=71,type=1) error Datapath Invalid 90695823979334: os_ken.app.ofctl.exception.InvalidDatapath: Datapath Invalid 90695823979334
  neutron-openvswitch-agent[78787]: ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [None req-b18a79b7-7258-44f0-9a69-fa92a490bc26 None None] ofctl request version=None,msg_type=None,msg_len=None,xid=None,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=71,type=1) error Datapath Invalid 90695823979334 agent terminated!: RuntimeError: ofctl request version=None,msg_type=None,msg_len=None,xid=None,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=71,type=1) error Datapath Invalid 90695823979334
  systemd[1]: devstack@q-agt.service: Main process exited, code=exited, status=1/FAILURE
  systemd[1]: devstack@q-agt.service: Failed with result 'exit-code'.
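
The numeric values in the logged request are easier to compare against
Open vSwitch output once decoded. The sketch below assumes the usual
conventions: OVS stores a bridge's datapath_id as a 16-digit hex
string, and 4294967295 (0xffffffff) is the OpenFlow 1.3 wildcard for
both OFPP_ANY and OFPG_ANY. The helper name is hypothetical.

```python
# Values copied from the log excerpt above.
DPID = 90695823979334
OUT_PORT = 4294967295
OUT_GROUP = 4294967295

# OpenFlow 1.3 wildcard constants ("any port" / "any group").
OFPP_ANY = 0xFFFFFFFF
OFPG_ANY = 0xFFFFFFFF


def dpid_to_hex(dpid):
    """Format a decimal datapath ID as a zero-padded 16-digit hex string."""
    return format(dpid, "016x")


# The flow stats request is not restricted to a port or group; it only
# selects a table (table_id=71 in the log).
is_unfiltered = OUT_PORT == OFPP_ANY and OUT_GROUP == OFPG_ANY
dpid_hex = dpid_to_hex(DPID)
```

The resulting `dpid_hex` can be compared with the datapath_id that
`ovs-vsctl get Bridge br-int datapath_id` reports, to confirm that the
"unknown dpid" in the error really is br-int.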

  Originally, this was thought to be related to
  https://bugs.launchpad.net/neutron/+bug/1817022. However, this occurs
  at service startup on a relatively lightly loaded machine, where the
  only activity at that time is neutron starting. Also, because the
  agent is just starting, the connections have not existed long enough
  for inactivity (idle) triggers to fire.

  Investigation allowed us to identify the general path of what is
  occurring, but we do not understand why it occurs, at least in the
  Ironic community.

  init_firewall() invocation: https://github.com/openstack/neutron/blob/79445f12be3a9ca892672fe0e016336ef60877a2/neutron/agent/securitygroups_rpc.py#L70
  Firewall class launch: https://github.com/openstack/neutron/blob/79445f12be3a9ca892672fe0e016336ef60877a2/neutron/agent/securitygroups_rpc.py#L121

  The default firewall driver then sends us into the openvswitch
  firewall code:

  https://github.com/openstack/neutron/blob/79445f12be3a9ca892672fe0e016336ef60877a2/neutron/agent/linux/openvswitch_firewall/firewall.py#L548
  https://github.com/openstack/neutron/blob/79445f12be3a9ca892672fe0e016336ef60877a2/neutron/agent/linux/openvswitch_firewall/firewall.py#L628

  This eventually ends up in
  https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py#L91
  where a RuntimeError is raised and the service exits.
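
The propagation described above can be modelled in a few lines. This
is a simplified, self-contained sketch and not neutron's or os-ken's
actual code: `InvalidDatapath` here is a local stand-in for
os_ken.app.ofctl.exception.InvalidDatapath, and `send_msg`,
`dump_flows`, and `known_dpids` are hypothetical names. It only shows
how an "unknown datapath" error from the ofctl layer turns into the
RuntimeError seen in the log; it does not explain why the datapath is
unknown.

```python
class InvalidDatapath(Exception):
    """Local stand-in for os_ken.app.ofctl.exception.InvalidDatapath."""

    def __init__(self, dpid):
        super().__init__("Datapath Invalid %s" % dpid)


def send_msg(known_dpids, dpid, request):
    """Hypothetical ofctl layer: reject requests for unknown datapaths."""
    if dpid not in known_dpids:
        raise InvalidDatapath(dpid)
    return []  # a real reply would carry flow stats


def dump_flows(known_dpids, dpid, request):
    """Wrap the ofctl error in a RuntimeError, mirroring the logged text."""
    try:
        return send_msg(known_dpids, dpid, request)
    except InvalidDatapath as exc:
        message = "ofctl request %s error %s" % (request, exc)
        raise RuntimeError(message) from exc
```

Calling `dump_flows(set(), 90695823979334, "OFPFlowStatsRequest(...)")`
in this model raises a RuntimeError whose message has the same shape
as the one in the log. In the agent, nothing on the firewall
initialization path handles that exception, so the process exits.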

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1944201/+subscriptions


