nagios-charmers team mailing list archive

[Bug 1914293] Re: Add support for failover nagios deployment

 

The third option makes much more sense to me. I don't think we should
tie a critical piece of software like Nagios to Juju leadership; my
feeling is that it doesn't offer enough guarantees. If we do decide to
go the leadership route, I'd strongly suggest confirming with the Juju
devs that it's a good idea.

-- 
You received this bug notification because you are a member of Nagios
Charm developers, which is subscribed to Nagios Charm.
https://bugs.launchpad.net/bugs/1914293

Title:
  Add support for failover nagios deployment

Status in Nagios Charm:
  New

Bug description:
  As Nagios is currently not cluster-aware, it is possible to deploy
  multiple nagios units that each monitor the environment and each
  alert on its status through the configured notifications/pagerduty
  modules.

  I would like to see an option that detects when multiple units of
  nagios are deployed and enables the external (pagerduty)
  notifications only on the Juju elected leader, leaving them disabled
  on the non-leader unit. The units should also have some way to
  monitor each other, so that if one cannot reach the notification APIs
  the other alerts on that and takes over its notifications. This is
  essentially to reduce double-incident alerting when running two
  nagios monitors.
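
  A minimal sketch of the leader-gated notification idea above, not
  the charm's actual code. It assumes the caller already knows whether
  the unit is the leader (for example from Juju's is-leader hook tool)
  and toggles the enable_notifications directive in the text of
  nagios.cfg. The function names and the leader_can_notify flag are
  hypothetical.

```python
import re

# Matches an existing "enable_notifications=..." line in nagios.cfg text.
_DIRECTIVE = re.compile(r"^[ \t]*enable_notifications[ \t]*=.*$", re.MULTILINE)


def should_notify(is_leader: bool, leader_can_notify: bool) -> bool:
    """Decide whether this unit should send external notifications.

    The leader always notifies. A non-leader takes over only when the
    leader is reported as unable to reach the notification APIs
    (leader_can_notify is a hypothetical health flag).
    """
    return is_leader or not leader_can_notify


def set_enable_notifications(cfg_text: str, enabled: bool) -> str:
    """Return cfg_text with enable_notifications set to 1 or 0."""
    line = "enable_notifications=" + ("1" if enabled else "0")
    if _DIRECTIVE.search(cfg_text):
        # Rewrite the directive in place, leaving other lines untouched.
        return _DIRECTIVE.sub(line, cfg_text)
    # Directive not present: append it on its own line.
    sep = "" if cfg_text == "" or cfg_text.endswith("\n") else "\n"
    return cfg_text + sep + line + "\n"
```

  A hook handler could call should_notify() on leader-elected and on
  peer relation changes, write the result back with
  set_enable_notifications(), and then reload nagios.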

  Another consideration is to have the non-leader unit sit idle with
  nagios not running, and on leader-elected, start nagios if the unit
  is the new leader.  This approach would reduce the overhead of
  hitting all of the monitors on nrpe endpoints twice.

  A third option may be to tie into hacluster and configure it to
  manage the active nagios instance and a VIP for the nagios web
  endpoint.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nagios/+bug/1914293/+subscriptions

