Explanation
The SLA module computes the SLA (Service Level Agreement) values of the monitored elements and stores them in the MongoDB database defined in the configuration file below. This file also lets you change how SLAs are calculated (for example, counting a Warning as a positive SLA period, or excluding maintenance periods from the calculation).
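To make the effect of these calculation options concrete, here is a minimal Python sketch. It is not the module's actual code: the function `sla_percent`, the status names and the `(status, seconds, in_downtime)` tuples are assumptions made for illustration. Only the option names and their values (`warning_counts_as_ok`, `unknown_period`, `no_data_period`, `downtime_period`) come from the configuration file.

```python
def sla_percent(periods, warning_counts_as_ok=False, unknown_period="include",
                no_data_period="include", downtime_period="include"):
    """Return the SLA percentage for a list of (status, seconds, in_downtime).

    Illustrative only. Each period has one of three effects:
    True (counted as UP), False (counted as DOWN), None (not counted,
    which shortens the period of study).
    """
    by_option = {"include": False, "exclude": None, "ok": True}
    up = 0.0
    counted = 0.0
    for status, seconds, in_downtime in periods:
        if in_downtime and downtime_period != "include":
            # "include" means only the status matters, so it is handled below.
            effect = {"exclude": None, "ok": True, "critical": False}[downtime_period]
        elif status == "ok":
            effect = True
        elif status == "warning":
            effect = bool(warning_counts_as_ok)
        elif status == "critical":
            effect = False
        elif status == "unknown":
            effect = by_option[unknown_period]
        elif status in ("missing_data", "shinken_inactive"):
            effect = by_option[no_data_period]
        else:
            raise ValueError("unexpected status: %r" % (status,))
        if effect is None:
            continue
        counted += seconds
        if effect:
            up += seconds
    return 100.0 * up / counted if counted else None


# Hypothetical day: 80 s OK, 10 s Warning, 10 s Unknown, no downtime.
day = [("ok", 80, False), ("warning", 10, False), ("unknown", 10, False)]
print(sla_percent(day))
print(sla_percent(day, warning_counts_as_ok=True))
print(sla_percent(day, unknown_period="exclude"))
```

With the default options, Warning and Unknown both count against the SLA. Excluding a status shrinks the denominator instead of adding to the UP time, which is why "exclude" and "ok" give different results.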
Configuration
The module configuration file is located at /etc/shinken/modules/sla.cfg:
| Code Block | ||
|---|---|---|
| ||
#===============================================================================
# sla
#===============================================================================
# Daemons that can load this module:
# - broker (to save sla information into a mongodb database)
# Modules that can load this module:
# - WebUI (to display sla data to the users)
# This module compute and save SLA values into a mongodb database
#===============================================================================
define module {
# Shinken Enterprise. Lines added by import core. Do not remove it, it's used by Shinken Enterprise to update your objects if you re-import them.
_SE_UUID core-module-d05cd3505adb11e5884b080027f08538
_SE_UUID_HASH 05d3d1d1cce1f5e03b43936aad25e68f
# End of Shinken Enterprise part
#======== Module identity =========
# Module name. Must be unique
module_name sla
# Module type (to load module code). Do not edit.
module_type sla
#======== Database connection =========
# mongodb uri definition for connecting to the mongodb database. You can find the mongodb uri
# syntax at https://docs.mongodb.com/manual/reference/connection-string/
uri mongodb://localhost/?w=1&fsync=false
# If you want to securize your mongodb connection you can enable the ssh use_ssh_tunnel that will
# allow all mongodb to be encrypted & authentificated with SSH
# Should use a SSH tunnel (Default 0=False)
# use_ssh_tunnel 0
# If the SSH connection goes wrong, then retry use_ssh_retry_failure time before_shinken_inactive
# Default: 1
# use_ssh_retry_failure 1
# SSH user/keyfile in order to connect to the mongodb server.
# Default: shinken
# ssh_user shinken
# Default: ~shinken/.ssh/id_rsa
# ssh_keyfile ~shinken/.ssh/id_rsa
# Timeout in order to establish a connection, in seconds
# Default: 10
# mongo_timeout 10
# Which database is used to store sla data
database shinken
#======== Module options =========
# Raw SLA can be kept during X days. In case of issue, these data will be used to re-perform SLA computation.
# The drawback of this feature is that it takes more disk space.
#keep_raw_sla_day 7 ;optional, defaults to 7
# Duration in day to keep SLA info,
# Default value is -1. It mean SLA are keep forever, in this case to mongo database will grow endlessly.
# Minimal value is 7 day
#nb_stored_days -1
# SLA are computed on a daily basis. SLA of the current day are always recomputed after a configuration change. SLA from days before are by default not recomputed.
# If 1, old SLA will be recomputed with current settings.
# If 0, old SLA will not be recalculated [default]
# recompute_old_sla 0
#======== SLA calculation ========
# Some status can impact positively (counted as OK/UP), negatively (counted as CRITICAL/DOWN) or not impact the SLA
# (is not counted, meaning the period of study is reduced by the period that is not counted).
# This configuration aims at giving Shinken administrators a way to configure how the SLA are calculated.
# If 1, Warning counts as UP
# If 0, Warning counts as DOWN [default]
# warning_counts_as_ok 0
# == Unknown periods ==
# - include: Only status is considered. "Unknown" status is counted negatively in the SLA. [default]
# - exclude: Unknown are not counted from SLA considered period
# - ok: Unknown are considered as UP periods
# unknown_period include
# == No_data periods ("Missing data" and "Shinken inactive" status) ==
# - include: Only status is considered. "Missing data" and "Shinken inactive" status are counted negatively in the SLA. [default]
# - exclude: No_data are not counted from SLA considered period
# - ok: No_data are considered as UP periods
# no_data_period include
# == Downtime periods ==
# - include: Only status is considered. [default]
# - exclude: Downtimes are not counted from SLA considered period
# - ok: Downtimes are considered as UP periods
# - critical: Downtimes are considered as DOWN periods
# downtime_period include
#======== INTERNAL options =========
#INTERNAL : DO NOT EDIT FOLLOWING PARAMETRE WITHOUT YOUR DEDICATED SUPPORT
# == time of inactivation of the broker before considering that shinken is inactive (in sec) ==
#time_before_shinken_inactive 30
# == maximum number of elements archived in one bulk pass ==
#size_chunk_to_archive 10 000
# == time between two chunk to archive ==
#time_between_two_chunks 0.1
# == default value of the interval check (in minutes) ==
#default_check_interval 5
# == delay before the creation of missing data period (in check intervale) ==
#margin_create_new_range 1.5
# == max delay before creating missing data period (in minutes) ==
#margin_create_new_range_max 10
# Explanatory example of the property margin_create_new_range
# For an element with a check interval at 1min and margin_create_new_range at 1.5 which equals 1min30s of time delay.
# If the interval check is at 1h the delay would be at 1h30 but the delay is limited by margin_create_new_range_max which limits the delay to 10min.
#
# An OK status is given by the scheduler at 12h30
# A new OK status is given by the scheduler at 12h40
# The scheduler should have given a new status at 12h31 but it gave it at 12h40 which is 9min of time delay.
# So that 9min > 1min30s a missing data period is created.
} |
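The explanatory example at the end of the configuration file can be written as a short calculation. The Python sketch below is only a reading of those comments, not the module's implementation: the function names are hypothetical, and it assumes the delay is the check interval multiplied by margin_create_new_range, capped by margin_create_new_range_max, and compared with the lateness of the new status.

```python
def missing_data_delay_minutes(check_interval_min, margin=1.5, margin_max_min=10.0):
    """Tolerated delay (in minutes) before a missing data period is created.

    Assumption: delay = check interval * margin_create_new_range,
    limited by margin_create_new_range_max.
    """
    return min(check_interval_min * margin, margin_max_min)


def creates_missing_data(previous_status_min, new_status_min, check_interval_min):
    """True when the new status is later than the tolerated delay.

    Times are minutes since midnight. The lateness is measured from the
    moment the next status was expected (previous status + check interval),
    as in the example from the configuration comments.
    """
    expected = previous_status_min + check_interval_min
    lateness = new_status_min - expected
    return lateness > missing_data_delay_minutes(check_interval_min)


# Check interval of 1 min: tolerated delay of 1 min 30 s.
print(missing_data_delay_minutes(1))
# Check interval of 1 h: 1 h 30 would be tolerated, but the cap limits it to 10 min.
print(missing_data_delay_minutes(60))
# Statuses at 12h30 and 12h40 with a 1 min interval: 9 min late, above 1 min 30 s.
print(creates_missing_data(12 * 60 + 30, 12 * 60 + 40, 1))
```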
Configuring access to the MongoDB database
This is configured in the SLA module configuration file.
SLA data is stored in the Mongo database local to the Broker.
To connect to the Mongo server used to store SLA data, two methods are available:
- Direct connection: the default, but not secured.
- SSH tunnel: Shinken connects to the Mongo server through an SSH tunnel for better security.
Direct connection to the Mongo server
By default, the SLA module connects directly to the Mongo server to read and write SLA data.
In the SLA module configuration, the connection is direct when the "use_ssh_tunnel" parameter is set to 0.
/etc/shinken/modules/retention-mongodb.cfg
This connection method is easy to configure on the Shinken side. However, it requires opening the Mongo database to the outside world, which exposes it to security issues.
Securing the Mongo database is of course still possible (see "Sécurisation des connexions aux bases MongoDB") but much more complex to set up. The SSH connection method is therefore preferable for practical and security reasons.
Connecting to the Mongo server over SSH
For security reasons, the SLA module can also connect to the Mongo server through an SSH tunnel.
- In the Mongo server configuration (/etc/mongod.conf), make sure the "bind_ip" parameter is set so that Mongo listens only on the local interface (127.0.0.1).
- From the server hosting the Broker, make sure the SSH public keys of the user running the daemon ("shinken" by default) are authorized on the server hosting Mongo:
  - Log in on the Shinken server as the user running the daemon
  - Generate the SSH key pair if needed (for example with ssh-keygen)
  - Copy the public key to the Mongo server (for example with ssh-copy-id, for a remote account named user_distant)
If a single server hosts both the Broker daemon and the MongoDB database (as in a standard installation), you must also apply these commands so that the shinken user can connect to the server itself over SSH without being prompted.
Edit the SLA module configuration:
- the "use_ssh_tunnel" parameter must be set to 1
- the "use_ssh_retry_failure" parameter sets the number of additional attempts made when the SSH tunnel cannot be established
- the "ssh_user" parameter must be set to the user used to connect to the Mongo server (user_distant in the previous example)
- the "ssh_keyfile" parameter must point to the private SSH key on the Shinken server (by default ~shinken/.ssh/id_rsa)
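The four SSH parameters above can be checked before restarting the Broker. The Python sketch below is a hypothetical helper, not part of Shinken: `read_module_settings` and `ssh_settings` are illustrative names, and the parsing (one "key value" pair per line, "#" for comment lines, ";" for trailing comments) is an assumption based on the file shown earlier. The default values are the ones documented in that file.

```python
# Defaults taken from the comments of sla.cfg.
SSH_DEFAULTS = {
    "use_ssh_tunnel": "0",
    "use_ssh_retry_failure": "1",
    "ssh_user": "shinken",
    "ssh_keyfile": "~shinken/.ssh/id_rsa",
}


def read_module_settings(cfg_text):
    """Collect the uncommented "key value" lines of a module definition."""
    settings = {}
    for raw_line in cfg_text.splitlines():
        # Assumption: ";" starts a trailing comment, as in ";optional".
        line = raw_line.split(";", 1)[0].strip()
        if not line or line.startswith("#") or line.startswith("define") or line == "}":
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def ssh_settings(cfg_text):
    """Return the effective SSH settings, falling back to the defaults."""
    settings = read_module_settings(cfg_text)
    resolved = {key: settings.get(key, default) for key, default in SSH_DEFAULTS.items()}
    resolved["use_ssh_tunnel"] = resolved["use_ssh_tunnel"] == "1"
    resolved["use_ssh_retry_failure"] = int(resolved["use_ssh_retry_failure"])
    return resolved


# Hypothetical module definition with the SSH tunnel enabled.
example = """
define module {
    module_name            sla
    module_type            sla
    uri                    mongodb://localhost/?w=1&fsync=false
    use_ssh_tunnel         1
    use_ssh_retry_failure  2
    ssh_user               user_distant
    # ssh_keyfile          ~shinken/.ssh/id_rsa
}
"""
print(ssh_settings(example))
```

In this example the key file line is commented out, so the sketch falls back to the documented default path.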