Introduction

Shinken supports freshness checking on the results of host and service checks. The purpose of freshness checking is to ensure that host and service check results are being provided passively by external applications on a regular basis.

Freshness checking is useful when you want to ensure that passive checks are being received as frequently as you want. This can be very useful in distributed and failover monitoring environments.

How Does Freshness Checking Work?

 

Shinken periodically checks the freshness of the results for all hosts and services that have freshness checking enabled.

If the results are found to be stale, Shinken runs an active check of the host or service. This active check is executed even if active checks are disabled on a program-wide, host-specific, or check-specific basis.

For example, if you have a freshness threshold of 60 for one of your checks, Shinken will consider that check to be stale if its last check result is older than 60 seconds.
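The staleness rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not Shinken's actual implementation; the function name `is_stale` and its parameters are assumptions made for this example.

```python
import time


def is_stale(last_check_time, freshness_threshold, now=None):
    """Return True if the last result is older than the threshold.

    last_check_time and now are Unix timestamps in seconds;
    freshness_threshold is a duration in seconds.
    Hypothetical helper, not part of Shinken.
    """
    if now is None:
        now = time.time()
    return (now - last_check_time) > freshness_threshold


# With a threshold of 60 seconds:
now = 1_000_000
print(is_stale(now - 61, 60, now))  # result is 61 seconds old
print(is_stale(now - 30, 60, now))  # result is 30 seconds old
```

The first call reports the result as stale and the second does not, since only the first result is older than 60 seconds.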

Enabling Freshness Checking

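To enable freshness checking for a host or service, turn on its check freshness option, set a freshness threshold (in seconds), and make sure its check command is one that should run when the results are stale.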

If you do not specify a host- or check-specific freshness threshold value (or you set it to zero), Shinken will automatically calculate a threshold based on how often you monitor that particular host or service. It is recommended that you explicitly specify a freshness threshold rather than let Shinken pick one for you.

Example

An example of a check that might require freshness checking is one that reports the status of your nightly backup jobs. Perhaps you have an external script that submits the results of the backup job to Shinken once the backup is completed. In this case, all of the results for the service are provided by an external application using passive checks. To ensure that the status of the backup job gets reported every day, you may want to enable freshness checking for the check. If the external script doesn't submit the results of the backup job, you can have Shinken fake a critical result by doing something like this.

Here's what the definition for the check looks like:

Property | Value | Note
Description | Backup Job |
Active checks enabled | False | Active checks are NOT enabled
Passive checks enabled | True | Passive checks are enabled (this is how results are reported)
Check freshness | True |
Freshness threshold | 93600 | 26-hour threshold, since backups may not always finish at the same time
Check command | no-backup-report | This command is run only if the service results are "stale"
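The threshold value in the table is expressed in seconds. As a quick check of the arithmetic, the following sketch converts 26 hours to seconds; the variable name is chosen only for this example.

```python
from datetime import timedelta

# 26 hours expressed in seconds, as used for the freshness threshold
threshold_seconds = int(timedelta(hours=26).total_seconds())
print(threshold_seconds)  # 93600
```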


Notice that active checks are disabled. This is because the results for the check are provided only by an external application using passive checks. Freshness checking is enabled and the freshness threshold has been set to 26 hours. This is a bit longer than 24 hours because backup jobs sometimes run late from day to day (depending on how much data there is to back up, how much network traffic is present, etc.). The no-backup-report command is executed only if the results of the service are determined to be stale. The definition of the no-backup-report command might look like this:

Property | Value
Name | no-backup-report
Command Line | /var/lib/shinken/libexec/check_dummy 2 "CRITICAL: Results of backup job were not reported!"


If Shinken detects that the service results are stale, it will run the no-backup-report command as an active check. This causes the check_dummy plugin to be executed, which returns a critical state to Shinken. The check will then go into a Critical state (if it isn't already there) and someone will probably get notified of the problem.
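The two sides of this example can be sketched in Python. The first function is a hypothetical stand-in for a dummy plugin: it hands back the state it was given, using the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The second formats the kind of passive result the backup script might submit, assuming the Nagios-style PROCESS_SERVICE_CHECK_RESULT external command format. The function names and the host name backup-server are assumptions made for this illustration, and how the line is delivered to Shinken depends on your setup.

```python
# Standard plugin exit codes and their state names
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def dummy_check(state, text):
    """Hypothetical stand-in for a dummy plugin.

    Returns (exit_code, output) unchanged, so the caller decides
    which state is reported.
    """
    if state not in STATES:
        raise ValueError("state must be one of 0, 1, 2, 3")
    return state, text


def format_passive_result(host, service, return_code, output, timestamp):
    """Build a passive service check result line.

    Assumes the Nagios-style external command format:
    [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        timestamp, host, service, return_code, output)


# What the stale-result fallback would report
code, output = dummy_check(
    2, "CRITICAL: Results of backup job were not reported!")
print(STATES[code], "-", output)

# What the backup script might submit on success
# ("backup-server" is a made-up host name)
print(format_passive_result(
    "backup-server", "Backup Job", 0, "OK: backup finished", 1700000000))
```

As long as the backup script submits a line like the second one at least once every 26 hours, the fallback command is never run.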