
nagios-charmers team mailing list archive

[Bug 1842039] Re: some checks coming from the nrpe charm relation are ignored

 

Moreover:

$ juju config nrpe-kubernetes-worker-gpu export_nagios_definitions=true
$ juju run --unit nrpe-kubernetes-worker-gpu/2 -- ls -l /var/lib/nagios/export
total 20
-rw-r--r-- 1 root root 275 Aug 30 09:45 host__juju-k8s-kubernetes-worker-gpu-6.cfg
-rw-r--r-- 1 root root 496 Aug 30 08:40 service__juju-k8s-kubernetes-worker-gpu-6_check_docker.cfg
-rw-r--r-- 1 root root 528 Aug 30 08:40 service__juju-k8s-kubernetes-worker-gpu-6_check_snap.kube-proxy.daemon.cfg
-rw-r--r-- 1 root root 522 Aug 30 08:40 service__juju-k8s-kubernetes-worker-gpu-6_check_snap.kubelet.daemon.cfg
-rw-r--r-- 1 root root 481 Aug 29 14:34 service__juju-kubernetes-worker-gpu-6_check_flannel.cfg
$ juju run --unit nrpe-kubernetes-worker-gpu/2 -- cat /var/lib/nagios/export/* 
#---------------------------------------------------
# This file is Juju managed
#--------------------------------------------------

define host {
    address     10.1.2.3
    host_name   juju-k8s-kubernetes-worker-gpu-6
    use         server
    hostgroups  machines, 
}
#---------------------------------------------------
# This file is Juju managed
#---------------------------------------------------
define service {
    use                             active-service
    host_name                       juju-k8s-kubernetes-worker-gpu-6
    service_description             juju-k8s-kubernetes-worker-gpu-6[docker] process check {kubernetes-worker-gpu/6}
    check_command                   check_nrpe!check_docker
    servicegroups                   juju-k8s
}

#---------------------------------------------------
# This file is Juju managed
#---------------------------------------------------
define service {
    use                             active-service
    host_name                       juju-k8s-kubernetes-worker-gpu-6
    service_description             juju-k8s-kubernetes-worker-gpu-6[snap.kube-proxy.daemon] process check {kubernetes-worker-gpu/6}
    check_command                   check_nrpe!check_snap.kube-proxy.daemon
    servicegroups                   juju-k8s
}

#---------------------------------------------------
# This file is Juju managed
#---------------------------------------------------
define service {
    use                             active-service
    host_name                       juju-k8s-kubernetes-worker-gpu-6
    service_description             juju-k8s-kubernetes-worker-gpu-6[snap.kubelet.daemon] process check {kubernetes-worker-gpu/6}
    check_command                   check_nrpe!check_snap.kubelet.daemon
    servicegroups                   juju-k8s
}

#---------------------------------------------------
# This file is Juju managed
#---------------------------------------------------
define service {
    use                             active-service
    host_name                       juju-kubernetes-worker-gpu-6
    service_description             juju-kubernetes-worker-gpu-6[flannel] process check {juju:flannel-gpu/3}
    check_command                   check_nrpe!check_flannel
    servicegroups                   juju
}
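
The exported definitions above can be cross-checked mechanically. The following is a minimal sketch, not part of either charm: it parses `define ... { ... }` blocks and lists the services whose `host_name` has no matching `define host` in the same export. The helper names (`parse_objects`, `unknown_host_services`) are hypothetical, and the sample text is an abbreviated copy of the output above. The report does not establish whether such a mismatch is what causes checks to be ignored.

```python
import re

# Abbreviated copy of the exported definitions (banners and some directives omitted).
EXPORT = """\
define host {
    address     10.1.2.3
    host_name   juju-k8s-kubernetes-worker-gpu-6
    use         server
}
define service {
    host_name                       juju-k8s-kubernetes-worker-gpu-6
    service_description             juju-k8s-kubernetes-worker-gpu-6[docker] process check {kubernetes-worker-gpu/6}
    check_command                   check_nrpe!check_docker
}
define service {
    host_name                       juju-k8s-kubernetes-worker-gpu-6
    check_command                   check_nrpe!check_snap.kube-proxy.daemon
}
define service {
    host_name                       juju-k8s-kubernetes-worker-gpu-6
    check_command                   check_nrpe!check_snap.kubelet.daemon
}
define service {
    host_name                       juju-kubernetes-worker-gpu-6
    service_description             juju-kubernetes-worker-gpu-6[flannel] process check {juju:flannel-gpu/3}
    check_command                   check_nrpe!check_flannel
}
"""

# The closing brace must start a line, so a "}" inside service_description
# does not end the block early.
OBJECT_RE = re.compile(r"^define\s+(\w+)\s*\{(.*?)^\}", re.MULTILINE | re.DOTALL)


def parse_objects(text):
    """Return a list of (object_type, {directive: value}) tuples."""
    objects = []
    for match in OBJECT_RE.finditer(text):
        kind, body = match.group(1), match.group(2)
        fields = {}
        for line in body.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            fields[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((kind, fields))
    return objects


def unknown_host_services(text):
    """List check_command values of services whose host_name is not defined."""
    objects = parse_objects(text)
    hosts = {f["host_name"] for kind, f in objects if kind == "host"}
    return [
        f.get("check_command", "")
        for kind, f in objects
        if kind == "service" and f.get("host_name") not in hosts
    ]


print(unknown_host_services(EXPORT))
```

With the data above, only the flannel service is flagged: it uses `juju-kubernetes-worker-gpu-6`, while the host definition and the other services use `juju-k8s-kubernetes-worker-gpu-6`.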


** Also affects: nrpe-charm
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Nagios
Charm developers, which is subscribed to Nagios Charm.
https://bugs.launchpad.net/bugs/1842039

Title:
  some checks coming from the nrpe charm relation are ignored

Status in Nagios Charm:
  New
Status in NRPE Charm:
  New

Bug description:
  In a kubernetes model, we have some kubernetes-worker charms
  (kubernetes-worker-gpu) with a relation to an nrpe charm
  (nrpe-kubernetes-worker-gpu). The nrpe charm has a relation to a
  nagios charm (nagios-server-k8s).

  If we list the NRPE checks on the unit, it reports several:

  $ juju run-action --wait nrpe-kubernetes-worker-gpu/2 list-nrpe-checks
  unit-nrpe-kubernetes-worker-gpu-2:
    id: a806dd9c-dd05-49af-8635-aef467123586
    results:
      checks:
        check-arp-cache: /usr/local/lib/nagios/plugins/check_arp_cache.py -w 60 -c 80
        check-conntrack: /usr/local/lib/nagios/plugins/check_conntrack.sh -w 80 -c 90
        check-disk-root: '/usr/lib/nagios/plugins/check_disk -u GB -w 25% -c 20% -K
          5% -p / '
        check-docker: /usr/local/lib/nagios/plugins/check_systemd.py docker
        check-flannel: /usr/local/lib/nagios/plugins/check_systemd.py flannel
        check-load: /usr/lib/nagios/plugins/check_load -w 320,160,80 -c 640,320,160
        check-mem: /usr/local/lib/nagios/plugins/check_mem.pl -C -h -u -w 85 -c 90
        check-snap:
          kube-proxy:
            daemon: /usr/local/lib/nagios/plugins/check_systemd.py snap.kube-proxy.daemon
          kubelet:
            daemon: /usr/local/lib/nagios/plugins/check_systemd.py snap.kubelet.daemon
      timestamp: Fri Aug 30 10:33:47 CEST 2019
    status: completed
    timing:
      completed: 2019-08-30 10:33:48 +0200 CEST
      enqueued: 2019-08-30 10:33:46 +0200 CEST
      started: 2019-08-30 10:33:47 +0200 CEST
    unit: nrpe-kubernetes-worker-gpu/2

  But not all of them show up in Nagios; check-flannel and both
  check-snap checks are missing:

  $ juju run --unit nagios-server-k8s/6 -- grep -r "service.*kubernetes-worker-gpu" /etc/nagios3/conf.d/
  /etc/nagios3/conf.d/charm.cfg:	 service_description            juju-k8s-kubernetes-worker-gpu-6-check_arp_cache
  /etc/nagios3/conf.d/charm.cfg:	 service_description            juju-k8s-kubernetes-worker-gpu-6-check_mem
  /etc/nagios3/conf.d/charm.cfg:	 service_description            juju-k8s-kubernetes-worker-gpu-6-check_conntrack
  /etc/nagios3/conf.d/charm.cfg:	 service_description            juju-k8s-kubernetes-worker-gpu-6-check_load
  /etc/nagios3/conf.d/charm.cfg:	 service_description            juju-k8s-kubernetes-worker-gpu-6-docker
  /etc/nagios3/conf.d/charm.cfg:	 service_description            juju-k8s-kubernetes-worker-gpu-6-check_disk_root
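
The comparison between the two listings above can be sketched as a small script. This is only an illustration under stated assumptions: the check names are copied from the `list-nrpe-checks` output (with the nested `check-snap` keys flattened back into dotted names, which is an assumption about how the YAML was rendered), the service descriptions are copied from the `grep` output, and the `normalize` rule (dashes to underscores, optional `check_` prefix dropped) is a hypothetical matching rule, not the charm's own logic.

```python
HOST = "juju-k8s-kubernetes-worker-gpu-6"

# Check names as reported by the list-nrpe-checks action.
NRPE_CHECKS = [
    "check-arp-cache",
    "check-conntrack",
    "check-disk-root",
    "check-docker",
    "check-flannel",
    "check-load",
    "check-mem",
    "check-snap.kube-proxy.daemon",  # assumed flattening of the nested YAML keys
    "check-snap.kubelet.daemon",     # assumed flattening of the nested YAML keys
]

# service_description values found in /etc/nagios3/conf.d/charm.cfg.
NAGIOS_DESCRIPTIONS = [
    "juju-k8s-kubernetes-worker-gpu-6-check_arp_cache",
    "juju-k8s-kubernetes-worker-gpu-6-check_mem",
    "juju-k8s-kubernetes-worker-gpu-6-check_conntrack",
    "juju-k8s-kubernetes-worker-gpu-6-check_load",
    "juju-k8s-kubernetes-worker-gpu-6-docker",
    "juju-k8s-kubernetes-worker-gpu-6-check_disk_root",
]


def normalize(name):
    """Hypothetical matching rule: unify separators and drop a check_ prefix."""
    name = name.replace("-", "_")
    if name.startswith("check_"):
        name = name[len("check_"):]
    return name


def missing_checks(nrpe_checks, descriptions, host):
    """Return the NRPE checks that have no matching Nagios service description."""
    prefix = host + "-"
    seen = {normalize(d[len(prefix):]) for d in descriptions if d.startswith(prefix)}
    return sorted(c for c in nrpe_checks if normalize(c) not in seen)


for check in missing_checks(NRPE_CHECKS, NAGIOS_DESCRIPTIONS, HOST):
    print(check)
```

Under these assumptions the script reports `check-flannel`, `check-snap.kube-proxy.daemon` and `check-snap.kubelet.daemon` as absent from the Nagios configuration.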

To manage notifications about this bug go to:
https://bugs.launchpad.net/nagios-charm/+bug/1842039/+subscriptions

