← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1821753] [NEW] openvswitch agent ofctl request errors: 'timed out' and 'Datapath Invalid'

 

Public bug reported:

Release: Queens, ovsdb_interface=native, of_request_timeout = 30

With number of OVS ports growing on the node following errors start to
occur (starting at ~1200 ports):

ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-db47426c-1719-43dd-8ecf-4fb4bdcbc316 - - - - -] ofctl request version=None,msg_type=None,msg_len=None,xid=None,OFPFlowMod(buffer_id=4294967295,command=0,cookie=5881109557449606263L,cookie_mask=0,flags=0,hard_timeout=0,idle_timeout=0,instructions=[OFPInstructionActions(actions=[OFPActionPopVlan(len=8,type=18), OFPActionSetField(tunnel_id=725), OFPActionOutput(len=16,max_len=0,port=1793,type=0), OFPActionOutput(len=16,max_len=0,port=2,type=0)],type=4)],match=OFPMatch(oxm_fields={'vlan_vid': 4175}),out_group=0,out_port=0,priority=1,table_id=22) error Datapath Invalid 64183592930369: InvalidDatapath: Datapath Invalid
 or 
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-632b8ede-1234-4682-afe0-3aefb615b121 - - - - -] ofctl request version=0x4,msg_type=0xe,msg_len=0x78,xid=0x73c67c07,OFPFlow
Mod(buffer_id=4294967295,command=0,cookie=5881109557449606263L,cookie_mask=0,flags=0,hard_timeout=0,idle_timeout=0,instructions=[OFPInstructionActions(actions=[OFPActionPopVlan(len=8,type=18), OFPActionSetField(tunnel_id=666), OFPActionOu
tput(len=16,max_len=0,port=2,type=0)],len=48,type=4)],match=OFPMatch(oxm_fields={'eth_dst': 'fa:16:3e:4a:79:ce', 'vlan_vid': 6107}),out_group=0,out_port=0,priority=2,table_id=20) timed out: Timeout: 30 seconds

with corresponding errors is ovs-vswitchd logs:

|rconn|ERR|br-tun<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
|rconn|ERR|br-floating<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting


Setting inactivity_probe to a greater value helps:

#ovs-vsctl set controller br-int inactivity_probe=30000
#ovs-vsctl set controller br-tun inactivity_probe=30000
#ovs-vsctl set controller br-floating inactivity_probe=30000

Should neutron allow setting inactivity_probe for controllers?
Should it correspond to of_request_timeout value?

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1821753

Title:
  openvswitch agent ofctl request errors: 'timed out' and 'Datapath
  Invalid'

Status in neutron:
  New

Bug description:
  Release: Queens, ovsdb_interface=native, of_request_timeout = 30

  With number of OVS ports growing on the node following errors start to
  occur (starting at ~1200 ports):

  ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-db47426c-1719-43dd-8ecf-4fb4bdcbc316 - - - - -] ofctl request version=None,msg_type=None,msg_len=None,xid=None,OFPFlowMod(buffer_id=4294967295,command=0,cookie=5881109557449606263L,cookie_mask=0,flags=0,hard_timeout=0,idle_timeout=0,instructions=[OFPInstructionActions(actions=[OFPActionPopVlan(len=8,type=18), OFPActionSetField(tunnel_id=725), OFPActionOutput(len=16,max_len=0,port=1793,type=0), OFPActionOutput(len=16,max_len=0,port=2,type=0)],type=4)],match=OFPMatch(oxm_fields={'vlan_vid': 4175}),out_group=0,out_port=0,priority=1,table_id=22) error Datapath Invalid 64183592930369: InvalidDatapath: Datapath Invalid
   or 
  ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-632b8ede-1234-4682-afe0-3aefb615b121 - - - - -] ofctl request version=0x4,msg_type=0xe,msg_len=0x78,xid=0x73c67c07,OFPFlow
  Mod(buffer_id=4294967295,command=0,cookie=5881109557449606263L,cookie_mask=0,flags=0,hard_timeout=0,idle_timeout=0,instructions=[OFPInstructionActions(actions=[OFPActionPopVlan(len=8,type=18), OFPActionSetField(tunnel_id=666), OFPActionOu
  tput(len=16,max_len=0,port=2,type=0)],len=48,type=4)],match=OFPMatch(oxm_fields={'eth_dst': 'fa:16:3e:4a:79:ce', 'vlan_vid': 6107}),out_group=0,out_port=0,priority=2,table_id=20) timed out: Timeout: 30 seconds

  with corresponding errors is ovs-vswitchd logs:

  |rconn|ERR|br-tun<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
  |rconn|ERR|br-floating<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
  |rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting

  
  Setting inactivity_probe to a greater value helps:

  #ovs-vsctl set controller br-int inactivity_probe=30000
  #ovs-vsctl set controller br-tun inactivity_probe=30000
  #ovs-vsctl set controller br-floating inactivity_probe=30000

  Should neutron allow setting inactivity_probe for controllers?
  Should it correspond to of_request_timeout value?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1821753/+subscriptions