Description

The MongoDB retention logs of the Scheduler are grouped by category so that the different types of log can be told apart:

  • Module management
  • Database connection
  • Save
  • Load
  • Deletion of obsolete retention data
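To sort lines into these categories automatically, the bracketed fields can be split with a regular expression. The following is a minimal sketch, not part of the module: the field layout is inferred from the sample lines on this page, and the names `LOG_LINE`, `RetentionLog` and `parse_line` are hypothetical. It only covers lines that carry three bracketed fields (scheduler, module, section) before the message.

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from the samples on this page:
# [timestamp] LEVEL : [ scheduler ] [ module ] [ section ] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s*:\s*"
    r"\[\s*(?P<scheduler>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<module>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<section>[^\]]+?)\s*\]\s*"
    r"(?P<message>.*)$"
)


class RetentionLog(NamedTuple):
    """One parsed log line (hypothetical helper type)."""
    timestamp: str
    level: str
    scheduler: str
    module: str
    section: str
    message: str


def parse_line(line: str) -> Optional[RetentionLog]:
    """Return the parsed fields, or None if the line has another layout."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return RetentionLog(**match.groupdict())
```

The `section` field can then be compared against values such as SAVE GLOBAL, LOAD RETENTION or DELETE OLD RETENTION to group the lines.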

Module management

On receiving the SIGUSR1 signal, the module dumps its memory; on any other signal, the module shuts down.

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ MANAGE SIGNAL ] The worker with the pid XXXX received a signal XX
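The rule described above (SIGUSR1 triggers a dump, anything else stops the module) can be sketched as a small dispatch function. This is only an illustration under stated assumptions: `action_for_signal` and the returned labels are hypothetical names, not the module's code, and `signal.SIGUSR1` exists only on Unix-like systems.

```python
import signal


def action_for_signal(signum: int) -> str:
    """Map a received signal number to the behaviour described on this page."""
    if signum == signal.SIGUSR1:
        # SIGUSR1: the module dumps its memory and keeps running.
        return "dump_memory"
    # Any other handled signal: the module shuts down.
    return "shutdown"
```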

Critical stop

When the master process stops unexpectedly:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER:XXXXX ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit.
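One common way for a worker to notice that its master has died is to compare its current parent PID with the PID recorded at start-up. The sketch below assumes that technique; the page does not say how the module actually detects it, and `master_is_dead` and `exit_message` are hypothetical names.

```python
import os


def master_is_dead(master_pid: int) -> bool:
    """Return True when the current parent is no longer the recorded master.

    On Unix, a worker whose parent exits is re-parented, so getppid() changes.
    """
    return os.getppid() != master_pid


def exit_message(worker_pid: int, master_pid: int) -> str:
    """Build the message text shown in the log line above."""
    return (
        "I am a worker with pid: %d and my master process %d is dead, I exit."
        % (worker_pid, master_pid)
    )
```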

Memory dump request

The dump succeeded

Python 2.6
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx
Python 2.7
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support)

The dump failed

Python 2.6
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] MEMORY DUMP: FAIL check if guppy lib is installed
Python 2.7
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed

Database connection

In the following logs, the SOUS-SECTION (sub-section) placeholder takes one of the following values:

  • LOAD RETENTION
  • DELETE OLD RETENTION
  • SAVE WORKER XXXXX

Normal connection

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s

The connection fails

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying
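The sequence of warnings followed by a final error suggests a bounded retry loop. Here is a minimal sketch of such a loop that produces the same message texts; it is an assumption about the general pattern, not the module's implementation. `connect_with_retries`, its parameters and the `log` callback are hypothetical, and a real version would catch the driver's specific connection exception rather than `Exception`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T],
    max_tries: int,
    wait_seconds: float = 0.0,
    log: Callable[[str], None] = print,
) -> Optional[T]:
    """Call connect() up to max_tries times; return None if every try fails."""
    for attempt in range(1, max_tries + 1):
        try:
            return connect()
        except Exception:  # illustration only: narrow this in real code
            if attempt < max_tries:
                unit = "time" if attempt == 1 else "times"
                log("WARNING: Mongo connection failed %d/%d %s, we will try again"
                    % (attempt, max_tries, unit))
                time.sleep(wait_seconds)
            else:
                log("ERROR  : Mongo connection failed %d/%d times, we stop trying"
                    % (attempt, max_tries))
    return None
```

Passing the connection function as a callable keeps the sketch independent of any particular MongoDB driver.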

The connection was lost or does not exist

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection

This is followed by the logs of a normal connection.

The connection could not be established

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection

Module configuration error

If several mongo URLs are specified:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later

Saving the retention

For saving the retention, three types of log section exist:

Section | Description
SAVE GLOBAL | The overall save process.
SAVE WORKERS | A sub-process of SAVE GLOBAL that manages the queue of the save workers.
SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's information to the database. The number of workers can be set in the module's parameters (see the page Module MongodbRetention).

SAVE GLOBAL

The SAVE GLOBAL logs give information about the overall operation of the module or about its configuration.

Code Block
Example
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] SUCCESS Retention data was saved into mongodb. Total time X.XXs

Errors

Errors that occur while saving the retention are also recorded in the logs, in this form:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later
Examples
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb. Total time 22.20s. I disable it and set it to restart it later
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later

SAVE WORKERS

The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.

Code Block
Example
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Starting worker X with pid XXXXX. Try: X/X
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] The worker X did SUCCESS (after X try)


Preparing the data to save took a long time:

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration


Errors prevent the save from completing properly:

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] some workers did fail to exit or encountered an error. The retention save can be incomplete

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Too many tries failed

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the XXXXX worker process as there is not enough memory

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers

SAVE WORKER X

The SAVE WORKER X logs give, for the worker with identifier X, statistics on the saves it performed: number of elements, result, and execution time.

Code Block
Example
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Retention data saved into mongodb in X.XXX seconds

Errors

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it.
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Failed connection with the following message : ERROR MESSAGE
Loss of connection to the database
Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [1/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [Y/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [X/X]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] After X tries, worker could not connect to mongo :[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON"
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] Worker has an error: [ ERROR MESSAGE ]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON"

Loading the retention

The logs provide information about the loading of the retention, making it possible to follow its progress and the state of the connection to Mongo.


Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X    hosts/clusters  from the retention [ in scheduler hosts/clusters : without retention=X    / total=1    ]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X    hosts retention data).
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] [ X.XXXs ] We took X    checks          from the retention [ in scheduler checks         : without retention=XX   / total=XX   ]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] No checks       are needed for retention load (scheduler-master already have all X    checks retention data).
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs]  Total number of elements load from mongo database: X    ( scheduler have a total of XX   elements )
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully.
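To follow the load programmatically, the counters in the "We took ..." lines can be extracted with a regular expression. This is a sketch built only from the message text shown above; `LOAD_STATS` and `parse_load_stats` are hypothetical names, and the dictionary keys simply mirror the labels in the log without interpreting them further.

```python
import re
from typing import Dict, Optional, Union

# Matches the counters in a "We took ..." LOAD RETENTION message.
LOAD_STATS = re.compile(
    r"We took\s+(?P<taken>\d+)\s+(?P<kind>\S+)\s+from the retention\s+\["
    r".*?without retention=(?P<without_retention>\d+)\s*/\s*"
    r"total=(?P<total>\d+)\s*\]"
)


def parse_load_stats(message: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the element kind and its three counters, or None if absent."""
    match = LOAD_STATS.search(message)
    if match is None:
        return None
    return {
        "kind": match.group("kind"),
        "taken": int(match.group("taken")),
        "without_retention": int(match.group("without_retention")),
        "total": int(match.group("total")),
    }
```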

Errors

Errors that occur while loading the retention are also recorded in the logs, in this form:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying host entries: ERROR MESSAGE. Module exiting.

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying checks entries: ERROR MESSAGE. Module exiting.

Deleting old retention data

The deletion logs show the number of objects deleted (sorted by hosts and checks) as well as the date from which the retention is kept.

Code Block
Example with objects to delete
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting XXX hosts from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - XXX  - hosts deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting XXX services from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - XXX  - services deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting XXXX entries = X.XXXs
Code Block
Example with no objects to delete
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] There is no data to delete
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting 0 entries = X.XXXs
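The first line of each example pairs a UTC date with a number of days. A plausible reading, assumed here and not confirmed by this page, is that the cutoff is the current UTC time minus the configured number of days. The sketch below shows that arithmetic; `retention_cutoff` and `cutoff_message` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now: datetime, keep_days: int) -> datetime:
    """Assumed rule: data saved before (now - keep_days) is obsolete."""
    return now - timedelta(days=keep_days)


def cutoff_message(now: datetime, keep_days: int) -> str:
    """Format the cutoff in the same shape as the log message above."""
    cutoff = retention_cutoff(now.astimezone(timezone.utc), keep_days)
    return (
        "We will delete all retention data that were saved before the "
        "%s UTC (%d days)" % (cutoff.strftime("%Y-%m-%d %H:%M"), keep_days)
    )
```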

Error: loss of connection to the database

If an error occurs during a database operation, the following logs appear:

Code Block
Example
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [1/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [2/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [3/3]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] After 3 tries, we couldn't connect to mongo
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have an error:[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] "Exception Python"