...

The following things occur when hosts or services experience SOFT state changes:

  • The SOFT state is logged.

...

  • Event handlers are executed to handle the SOFT state.

SOFT states are only logged if you have enabled the :ref:`log_service_retries <configuration/configmain-advanced#log_service_retries>` or :ref:`log_host_retries <configuration/configmain-advanced#log_host_retries>` options in your main configuration file.

The only important thing that really happens during a SOFT state is the execution of event handlers. Event handlers are particularly useful if you want to proactively fix a problem before it turns into a HARD state. The :ref:`$HOSTSTATETYPE$ <$HOSTSTATETYPE$>` or :ref:`$SERVICESTATETYPE$ <$SERVICESTATETYPE$>` macros will have a value of "SOFT" when event handlers are executed, which allows your event handler scripts to know when they should take corrective action. More information on event handlers can be found :ref:`here <advanced/eventhandlers>`.
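
As a rough illustration of how a handler script might use these macros, the sketch below assumes a hypothetical event handler command that passes ``$SERVICESTATE$``, ``$SERVICESTATETYPE$`` and ``$SERVICEATTEMPT$`` as three command-line arguments, in that order. The function name ``choose_action``, the action labels, and the attempt threshold are invented for this example and are not part of the product.

```python
#!/usr/bin/env python3
"""Hypothetical event handler sketch; names and thresholds are assumptions."""
import sys


def choose_action(state: str, state_type: str, attempt: int) -> str:
    """Pick an action label from the macro values passed to the handler."""
    if state == "OK":
        return "none"      # recovery or steady OK state: nothing to fix
    if state_type == "SOFT":
        # The problem is still being rechecked. Wait for one recheck to rule
        # out a transient blip, then try a fix before the state turns HARD.
        return "restart" if attempt >= 2 else "wait"
    return "none"          # HARD problem: leave it to the notified contacts


if __name__ == "__main__" and len(sys.argv) == 4:
    # Assumed argument order: state, state type, attempt number.
    print(choose_action(sys.argv[1], sys.argv[2], int(sys.argv[3])))
```

A real handler would replace the returned label with the actual corrective command, for example restarting the affected daemon.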

...


Hard states occur for hosts and services in the following situations:

  • When a host or service check results in a non-UP or non-OK state and it has been (re)checked the number of times specified by the max_check_attempts option in the host or service definition. This is a hard error state.

...

  • When a host or service transitions from one hard error state to another error state (e.g. WARNING to CRITICAL).

...

  • When a service check results in a non-OK state and its corresponding host is either DOWN or UNREACHABLE.

...

  • When a host or service recovers from a hard error state. This is considered to be a hard recovery.

...

  • When a passive host check is received. Passive host checks are treated as HARD ...

The following things occur when hosts or services experience HARD state changes:

  • The HARD state is logged.

...

  • Event handlers are executed to handle the HARD state.

...

  • Contacts are notified of the host or ... check problem or recovery.

The :ref:`$HOSTSTATETYPE$ <$HOSTSTATETYPE$>` or :ref:`$SERVICESTATETYPE$ <$SERVICESTATETYPE$>` macros will have a value of "HARD" when event handlers are executed, which allows your event handler scripts to know when they should take corrective action. More information on event handlers can be found :ref:`here <advanced/eventhandlers>`.

Example


Here's an example of how state types are determined, when state changes occur, and when event handlers are executed and notifications are sent out. The table below shows consecutive checks of a service over time. The service has a max_check_attempts value of 3.

==== ======= ======== ========== ============ ==========================================
Time Check # State    State Type State Change Notes
==== ======= ======== ========== ============ ==========================================
0    1       OK       HARD       No           Initial state of the service.
1    1       CRITICAL SOFT       Yes          First detection of a non-OK state. Event handlers execute.
2    2       WARNING  SOFT       Yes          Check continues to be in a non-OK state. Event handlers execute.
3    3       CRITICAL HARD       Yes          Max check attempts has been reached, so the check goes into a HARD state. Event handlers execute and a problem notification is sent out. Check number is reset to 1 immediately after this happens.
4    1       WARNING  HARD       Yes          Check changes to a HARD WARNING state. Event handlers execute and a problem notification is sent out.
5    1       WARNING  HARD       No           Check stabilizes in a HARD problem state. Depending on what the notification interval for the service is, another notification might be sent out.
6    1       OK       HARD       Yes          Check experiences a HARD recovery. Event handlers execute and a recovery notification is sent out.
7    1       OK       HARD       No           Check is still OK.
8    1       UNKNOWN  SOFT       Yes          Check is detected as changing to a SOFT non-OK state. Event handlers execute.
9    2       OK       SOFT       Yes          Check experiences a SOFT recovery. Event handlers execute, but notifications are not sent, as this wasn't a "real" problem. State type is set to HARD and check number is reset to 1 immediately after this happens.
10   1       OK       HARD       No           Check stabilizes in an OK state.
==== ======= ======== ========== ============ ==========================================
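
The walkthrough above can be replayed with a small simulation. This is only a sketch of the rules as this page describes them, not the monitoring engine's actual implementation. It assumes the host stays UP, ignores passive checks and re-notification intervals, and treats a "state change" as any difference in state or state type from the stored status. The names ``Row`` and ``simulate`` are made up for the example.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    time: int
    attempt: int
    state: str
    state_type: str
    changed: bool
    notify: bool


def simulate(results: List[str], max_check_attempts: int = 3) -> List[Row]:
    """Replay service check results; the host is assumed to stay UP."""
    state, state_type, next_attempt = "OK", "HARD", 1  # stored status
    rows = []
    for time, new_state in enumerate(results):
        if new_state == "OK":
            if state != "OK" and state_type == "SOFT":
                # SOFT recovery: the problem never became HARD.
                attempt, new_type = next_attempt, "SOFT"
            else:
                # Already OK, or recovering from a HARD problem.
                attempt, new_type = 1, "HARD"
            next_attempt = 1
        elif state != "OK" and state_type == "HARD":
            # One HARD problem state to another (or the same) problem state.
            attempt, new_type = 1, "HARD"
            next_attempt = 1
        else:
            # New or still-SOFT problem: count attempts up to the limit.
            attempt = next_attempt
            new_type = "HARD" if attempt >= max_check_attempts else "SOFT"
            next_attempt = 1 if new_type == "HARD" else attempt + 1
        changed = (new_state, new_type) != (state, state_type)
        # Simplification: notify only on HARD state changes.
        notify = changed and new_type == "HARD"
        rows.append(Row(time, attempt, new_state, new_type, changed, notify))
        # After a SOFT recovery the stored state type goes back to HARD.
        state = new_state
        state_type = "HARD" if new_state == "OK" else new_type
    return rows


if __name__ == "__main__":
    checks = ["OK", "CRITICAL", "WARNING", "CRITICAL", "WARNING", "WARNING",
              "OK", "OK", "UNKNOWN", "OK", "OK"]
    for row in simulate(checks):
        print(row.time, row.attempt, row.state, row.state_type,
              "Yes" if row.changed else "No")
```

Feeding it the eleven states from the table yields the same check numbers, state types, and state-change flags, with notifications at times 3, 4, and 6.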

...