Monitoring with Nagios
Created by Emily Backes, last modified by Greg Noe on Jul 23, 2015
Monitoring a service is one of the more important parts of keeping it running reliably. Nagios is one of the leading open-source options (with commercial support available) for extensible monitoring of networks, hosts, and services. OpenLDAP exposes a variety of features that can be monitored to help diagnose trouble proactively.
Several OpenLDAP monitoring scripts for Nagios exist in the wild, but when we last checked they did not cover important cases such as multi-master replication. Symas plans to develop example monitoring tools for Nagios to be bundled with our product.
Documentation
The Nagios Manual is online.
Features that should be monitored as a Nagios service
- Server listening - RootDSE query
- Database available - Suffix query/queries of content DBs
- Replication current - Analysis of contextCSN state between servers
- Monitor health - Connection count, etc.
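As an illustration of the "Replication current" item above, here is a minimal Python sketch of the analysis step only. It assumes the contextCSN values have already been read from each server (for example with a base-scope search of the suffix entry requesting the contextCSN attribute) and that they follow the usual timestamp#count#serverID#mod layout. The function names and the warn/crit thresholds are hypothetical and not part of any existing plugin; treating a missing server ID as UNKNOWN is likewise just one possible choice.

```python
from datetime import datetime, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_csn(csn):
    """Split a contextCSN value into (server ID, UTC timestamp).

    Assumed shape: 20150723120000.000000Z#000000#001#000000
    (timestamp # change count # server ID in hex # modification number).
    """
    stamp, _count, sid, _mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    return int(sid, 16), when.replace(tzinfo=timezone.utc)


def replication_lag(csns_a, csns_b):
    """Worst per-server-ID difference in seconds, or None if the ID sets differ."""
    a = dict(parse_csn(c) for c in csns_a)
    b = dict(parse_csn(c) for c in csns_b)
    if a.keys() != b.keys():
        return None
    # With multi-master replication either side may be ahead, so use abs().
    return max((abs((a[sid] - b[sid]).total_seconds()) for sid in a), default=0.0)


def check(csns_a, csns_b, warn=60, crit=300):
    """Return (exit code, status line); thresholds are illustrative defaults."""
    lag = replication_lag(csns_a, csns_b)
    if lag is None:
        return UNKNOWN, "UNKNOWN - contextCSN server IDs differ between servers"
    if lag >= crit:
        status, label = CRITICAL, "CRITICAL"
    elif lag >= warn:
        status, label = WARNING, "WARNING"
    else:
        status, label = OK, "OK"
    text = f"{label} - replication lag {lag:.0f}s | lag={lag:.0f}s;{warn};{crit}"
    return status, text
```

A real plugin would print the status line and exit with the returned code. Brief differences are normal on a busy directory, so the thresholds would need tuning per deployment.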
Features that should be implemented as an NRPE plugin
- BDB health checks
- MDB health checks
- Log watching, e.g. for:
  - Authentication failure patterns (see fail2ban)
  - Unindexed and slow searches
  - Problems with back-ldap proxy targets
  - Hardware failure events noticed by slapd
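For the log-watching item above, the following Python sketch shows how an NRPE-side check for unindexed searches might be structured. It assumes slapd is logging messages of the form "<= mdb_equality_candidates: (attr) not indexed" (or the bdb_ equivalent), and that the caller supplies the log lines, for example the portion of the log written since the previous run. The function names and thresholds are hypothetical.

```python
import re
from collections import Counter

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Assumed message shape, e.g. "<= mdb_equality_candidates: (uid) not indexed".
NOT_INDEXED = re.compile(
    r"<= (?:bdb|mdb)_\w+_candidates: \((?P<attr>[^)]+)\) not indexed"
)


def unindexed_attributes(lines):
    """Count 'not indexed' messages per attribute in the given log lines."""
    counts = Counter()
    for line in lines:
        match = NOT_INDEXED.search(line)
        if match:
            counts[match.group("attr")] += 1
    return counts


def check(lines, warn=1, crit=50):
    """Return (exit code, status line); thresholds are illustrative defaults."""
    counts = unindexed_attributes(lines)
    total = sum(counts.values())
    if total >= crit:
        status, label = CRITICAL, "CRITICAL"
    elif total >= warn:
        status, label = WARNING, "WARNING"
    else:
        status, label = OK, "OK"
    detail = ", ".join(f"{a}={n}" for a, n in counts.most_common()) or "none"
    text = (f"{label} - {total} unindexed filter lookups ({detail})"
            f" | unindexed={total};{warn};{crit}")
    return status, text
```

Reporting the per-attribute counts in the status line points the operator at the index that is missing, rather than only signalling that something is slow.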
Features covered by existing plugins
- Disk space
- Memory usage (might be better implemented by us, judging by past tickets)
- Similar OS-level details