
yahoo-eng-team team mailing list archive

[Bug 1513144] [NEW] Allow admin to mark agents down

 

Public bug reported:

Cloud administrators have external monitoring systems watching
different types of resources in their cloud infrastructures. A cloud
infrastructure comprises not only an OpenStack instance but also other
components not managed by, and possibly not visible to, OpenStack, such
as SDN controllers, physical network elements, etc.

External systems may detect a fault on one or more infrastructure
resources that may subsequently affect services provided by OpenStack.
From a network perspective, an example of a fault is openvswitch
crashing on a compute node.

When using the reference implementation (ovs + neutron-l2-agent),
neutron-l2-agent will continue reporting its state to the Neutron server
as alive (there is a heartbeat; the service is up), although there is an
internal error because the virtual bridge (br-int) is unreachable.
Using tools external to OpenStack that monitor openvswitch, the
administrator knows something is wrong and, as a fault management
action, may want to explicitly set the agent state to down.

Such an action requires a new API exposed by Neutron that allows admins
to set (true/false) the aliveness state of Neutron agents.
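
To make the request concrete, here is a minimal sketch of what such a
call could look like from the admin side. Everything in it is an
assumption for illustration: Neutron does not currently expose a
writable aliveness attribute, and the "alive" field in the request body,
the helper name build_mark_agent_request, the controller URL and the
agent ID placeholder are all hypothetical. It only builds the request
and does not send it.

```python
import json


def build_mark_agent_request(base_url, agent_id, alive):
    """Build (method, url, body) for a HYPOTHETICAL agent aliveness update.

    Assumption: the proposed API would accept a boolean "alive" field on
    the agent resource. This is not an existing Neutron API.
    """
    if not isinstance(alive, bool):
        raise TypeError("alive must be True or False")
    # Hypothetical endpoint layout, modelled on the agent resource path.
    url = "{}/v2.0/agents/{}".format(base_url.rstrip("/"), agent_id)
    body = json.dumps({"agent": {"alive": alive}})
    return "PUT", url, body


# Example: an external monitor has seen openvswitch crash on a compute
# node and wants the corresponding agent marked down.
method, url, body = build_mark_agent_request(
    "http://controller:9696", "AGENT_ID", False)
print(method, url, body)
```

A real implementation would also need admin-only policy enforcement and
a defined interaction with the heartbeat, so that the next agent report
does not silently flip the state back to alive.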

This feature request is in line with the work proposed to Nova [1] and
implemented in Liberty. The same is currently being proposed for
Cinder [2].

[1] https://blueprints.launchpad.net/nova/+spec/mark-host-down
[2] https://blueprints.launchpad.net/cinder/+spec/mark-services-down

** Affects: neutron
     Importance: Undecided
     Assignee: Carlos Goncalves (cgoncalves)
         Status: New


** Tags: rfe

** Changed in: neutron
     Assignee: (unassigned) => Carlos Goncalves (cgoncalves)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1513144

Title:
  Allow admin to mark agents down

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1513144/+subscriptions

