yahoo-eng-team team mailing list archive

[Bug 1663458] [NEW] brutal stop of ovs-agent doesn't kill ryu controller

 

Public bug reported:

It seems that when we kill neutron-ovs-agent and start it again, the ryu
controller fails to start because the previous instance (running under
eventlet) is still alive and holding the OpenFlow listen socket.

(... ovs agent is failing to start and is brutally killed)

Trying to start the process 5 minutes later:
INFO neutron.common.config [-] /usr/bin/neutron-openvswitch-agent version 10.0.0.0rc2.dev33
INFO ryu.base.app_manager [-] loading app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp
INFO ryu.base.app_manager [-] loading app ryu.app.ofctl.service
INFO ryu.base.app_manager [-] loading app ryu.controller.ofp_handler
INFO ryu.base.app_manager [-] instantiating app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp of OVSNeutronAgentRyuApp
INFO ryu.base.app_manager [-] instantiating app ryu.controller.ofp_handler of OFPHandler
INFO ryu.base.app_manager [-] instantiating app ryu.app.ofctl.service of OfctlService
ERROR ryu.lib.hub [-] hub: uncaught exception: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ryu/controller/controller.py", line 97, in __call__
    self.ofp_ssl_listen_port)
  File "/usr/lib/python2.7/site-packages/ryu/controller/controller.py", line 120, in server_loop
    datapath_connection_factory)
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 117, in __init__
    self.server = eventlet.listen(listen_info)
  File "/usr/lib/python2.7/site-packages/eventlet/convenience.py", line 43, in listen
    sock.bind(addr)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 98] Address already in use
INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected
INFO neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_bridge [-] Bridge br-int has datapath-ID 0000badb62a6184f
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [-] Switch connection timeout
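
The bind failure above can be reproduced outside the agent. A minimal
sketch, assuming the native OpenFlow driver listens on the default
of_listen_address/of_listen_port (127.0.0.1:6633 in a stock [ovs] config;
the actual address is not shown in the log):

import errno
import eventlet

listen_addr = ('127.0.0.1', 6633)  # assumed defaults, adjust to your config

# Stands in for the stale ryu controller left behind by the old agent.
stale = eventlet.listen(listen_addr)
try:
    # What the restarted agent attempts; this is where it blows up.
    eventlet.listen(listen_addr)
except (OSError, IOError) as exc:  # socket.error on py2, OSError on py3
    if exc.errno == errno.EADDRINUSE:
        print('Address already in use: a previous listener is still alive')
    else:
        raise
finally:
    stale.close()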

I haven't yet figured out how the previous instance of the ovs agent was
killed (my theory is that Puppet killed it, but I don't have the killing
code yet; I'll update the bug ASAP).
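
As a quick check for that theory, a small diagnostic sketch (not neutron
code) that tells whether something still holds the OpenFlow listen port
before restarting the agent; host and port are assumed defaults:

import errno
import socket

def port_is_free(host='127.0.0.1', port=6633):  # assumed defaults
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise
    finally:
        sock.close()

if not port_is_free():
    print('of_listen_port is still held: the previous agent/ryu controller '
          'was not fully stopped')

If the port is indeed held, "ss -ltnp" or "lsof -i :6633" shows which pid
owns it, which should confirm whether the old agent process survived the
kill.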

** Affects: neutron
     Importance: Undecided
     Assignee: Ihar Hrachyshka (ihar-hrachyshka)
         Status: New

** Affects: tripleo
     Importance: Critical
     Assignee: Emilien Macchi (emilienm)
         Status: Triaged


** Tags: needs-attention ovs

** Also affects: tripleo
   Importance: Undecided
       Status: New

** Changed in: tripleo
       Status: New => Triaged

** Changed in: tripleo
     Assignee: (unassigned) => Emilien Macchi (emilienm)

** Changed in: tripleo
    Milestone: None => ocata-rc1

** Changed in: tripleo
   Importance: Undecided => Critical

** Tags added: alert ci

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1663458

Title:
  brutal stop of ovs-agent doesn't kill ryu controller

Status in neutron:
  New
Status in tripleo:
  Triaged

Bug description:
  It seems that when we kill neutron-ovs-agent and start it again, the
  ryu controller fails to start because the previous instance (running
  under eventlet) is still alive and holding the OpenFlow listen socket.

  (... ovs agent is failing to start and is brutally killed)

  Trying to start the process 5 minutes later:
  INFO neutron.common.config [-] /usr/bin/neutron-openvswitch-agent version 10.0.0.0rc2.dev33
  INFO ryu.base.app_manager [-] loading app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp
  INFO ryu.base.app_manager [-] loading app ryu.app.ofctl.service
  INFO ryu.base.app_manager [-] loading app ryu.controller.ofp_handler
  INFO ryu.base.app_manager [-] instantiating app neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp of OVSNeutronAgentRyuApp
  INFO ryu.base.app_manager [-] instantiating app ryu.controller.ofp_handler of OFPHandler
  INFO ryu.base.app_manager [-] instantiating app ryu.app.ofctl.service of OfctlService
  ERROR ryu.lib.hub [-] hub: uncaught exception: Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch
      return func(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/ryu/controller/controller.py", line 97, in __call__
      self.ofp_ssl_listen_port)
    File "/usr/lib/python2.7/site-packages/ryu/controller/controller.py", line 120, in server_loop
      datapath_connection_factory)
    File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 117, in __init__
      self.server = eventlet.listen(listen_info)
    File "/usr/lib/python2.7/site-packages/eventlet/convenience.py", line 43, in listen
      sock.bind(addr)
    File "/usr/lib64/python2.7/socket.py", line 224, in meth
      return getattr(self._sock,name)(*args)
  error: [Errno 98] Address already in use
  INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
  INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected
  INFO neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_bridge [-] Bridge br-int has datapath-ID 0000badb62a6184f
  ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [-] Switch connection timeout

  I haven't yet figured out how the previous instance of the ovs agent
  was killed (my theory is that Puppet killed it, but I don't have the
  killing code yet; I'll update the bug ASAP).

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1663458/+subscriptions

