Description

The Scheduler's MongoDB retention logs are grouped by category so that the different log types can be told apart:

  • Save
  • Load
  • Deletion of obsolete retention entries

Module management

On receiving the SIGUSR1 signal, the module dumps its memory. On any other signal, the module shuts down.

[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] The worker with the pid XXXX received a signal XX
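
The signal rule described above can be sketched as follows. This is a minimal illustration, not the module's actual code: `action_for_signal` and the strings it returns are hypothetical names, and `signal.SIGUSR1` is only available on Unix-like systems.

```python
import signal


def action_for_signal(signum):
    """Return the action described above for a received signal number.

    Hypothetical helper: SIGUSR1 triggers a memory dump, any other
    signal makes the module shut down.
    """
    if signum == signal.SIGUSR1:
        return "dump"
    return "shutdown"
```

For example, `action_for_signal(signal.SIGTERM)` returns `"shutdown"`.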


Critical shutdown


[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit.


Memory dump request

The dump was performed

Python 2.6


[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx


Python 2.7


[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support)


The dump failed

Python 2.6


[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] MEMORY DUMP: FAIL check if guppy lib is installed


Python 2.7


[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed


Database connection

Normal connection


[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s


The connection fails


[2021-04-21 10:24:49] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[2021-04-21 10:24:49] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying
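
The sequence of messages above (warnings for the first failures, then an error on the last try) can be reproduced with a small retry loop. This is only a sketch: `connect_with_retries`, `connect` and `log` are hypothetical names, and the sketch makes no assumption about any delay between tries, which the logs do not show.

```python
def connect_with_retries(connect, max_tries, log):
    """Call connect() up to max_tries times and report each failure.

    connect: hypothetical callable that returns a connection or raises.
    log: hypothetical callable taking (level, message).
    Returns the connection, or None once every try has failed.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return connect()
        except Exception:
            unit = "time" if attempt == 1 else "times"
            if attempt < max_tries:
                # Not the last try yet: warn and loop again.
                log("WARNING", "Mongo connection failed %d/%d %s, we will try again"
                    % (attempt, max_tries, unit))
            else:
                # Last try failed: report the error and give up.
                log("ERROR", "Mongo connection failed %d/%d %s, we stop trying"
                    % (attempt, max_tries, unit))
    return None
```

Passing a list's `append` wrapped in a two-argument function as `log` makes it easy to inspect which messages a given failure scenario would produce.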


The connection was lost or does not exist


[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection

followed by the logs of a normal connection

The connection could not be established


[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection


Module configuration error

If several Mongo URLs are specified

[2021-04-20 13:52:26] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later


Saving the retention

Three types of logs exist for the retention save:

Section          Description
SAVE GLOBAL      The overall save process.
SAVE WORKERS     A sub-process of SAVE GLOBAL that manages the queue of the save workers.
SAVE WORKER X    A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's information to the database. The number of workers can be set in the module parameters (see Rétention en base de donnée centralisée par royaume ( Module MongodbRetention )).


SAVE GLOBAL

The SAVE GLOBAL logs give information about the overall operation of the module or about its configuration.

[2019-07-10 14:34:39] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL      ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[2019-07-10 14:34:39] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL      ] SUCCESS Retention data was saved into mongodb. Total time X.XXs
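
Every log line on this page follows the same layout: a timestamp, a level, a series of bracketed fields (scheduler name, module name, section), then the message. The sketch below shows one way to split a line into those parts. It is an illustration based only on the sample lines shown here. `parse_log_line`, `LINE_RE` and `FIELD_RE` are hypothetical names, not part of the product.

```python
import re

# Timestamp in brackets, level, colon, then everything else.
LINE_RE = re.compile(r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z]+)\s*:\s*(?P<rest>.*)$")
# One leading bracketed field such as "[ SAVE GLOBAL      ]".
FIELD_RE = re.compile(r"\[\s*([^\]]*?)\s*\]\s*")


def parse_log_line(line):
    """Split a log line into timestamp, level, bracketed fields and message.

    Returns None when the line does not start with "[timestamp] LEVEL :".
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    rest = match.group("rest")
    fields = []
    while True:
        # Consume bracketed fields one by one from the start of the text.
        field = FIELD_RE.match(rest)
        if field is None:
            break
        fields.append(field.group(1))
        rest = rest[field.end():]
    return {
        "timestamp": match.group("timestamp"),
        "level": match.group("level"),
        "fields": fields,
        "message": rest,
    }
```

With the SUCCESS line above, the `fields` list would hold the scheduler name, `MongodbRetention` and `SAVE GLOBAL`, which makes it possible to filter lines by section.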


Errors

Errors that occur during the retention save are also written to the logs, in this form:

[2021-04-20 11:26:57] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later


Examples


[2021-04-20 11:26:57] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL      ] FAILED Retention data could not be saved in mongodb. Total time 22.20s. I disable it and set it to restart it later
[2021-04-20 11:26:57] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL      ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later


SAVE WORKERS

The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.

[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] Starting worker X with pid XXXXX. Try: X/X
[2019-07-10 14:34:54] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] The worker X did SUCCESS (after X try)


[2019-07-10 14:34:44] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration


[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] some workers did fail to exit or encountered an error. The retention save can be incomplete.
[2019-07-10 14:34:44] ERROR  : Too many tries failed
[2019-07-10 14:34:44] ERROR  : Cannot start the XXXXX worker process as there is not enough memory
[2019-07-10 14:34:44] ERROR  : [SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers


SAVE WORKER X

The SAVE WORKER X logs give, for the worker with identifier X, statistics on the saves it performed: number of elements, result and execution time.

[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0          ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0          ]  Retention data saved into mongodb in X.XXX seconds


Errors


[2019-07-10 14:34:44] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X          ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it.


[2021-04-20 12:06:15] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X    ] Failed connection with the following message : ERROR MESSAGE


Loading the retention

These logs give information about the retention load, so that you can follow its progress and the state of the connection to Mongo.


[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X    hosts/clusters  from the retention [ in scheduler hosts/clusters : without retention=X    / total=1    ]
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X    hosts retention data).
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] [ X.XXXs ] We took X    checks          from the retention [ in scheduler checks         : without retention=XX   / total=XX   ]
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] No checks       are needed for retention load (scheduler already have all X    checks retention data).
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs]  Total number of elements load from mongo database: X    ( scheduler have a total of XX   elements )
[2019-07-10 14:35:37] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully.


Errors

Errors that occur during the retention load are also written to the logs, in this form:

[2019-07-10 16:19:10] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION   ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS
[2019-07-10 16:19:10] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION   ] error querying host entries: ERROR MESSAGE. Module exiting.
[2019-07-10 16:19:10] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION   ] error querying checks entries: ERROR MESSAGE. Module exiting.


Deleting old retention data

The deletion logs show the number of objects deleted (broken down by hosts and checks) and the date from which retention data is kept.

[2019-07-10 15:54:53] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[2019-07-10 15:54:53] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting XXX hosts from old retention [XXXX by XXXX]
[2019-07-10 15:54:53] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - XXX  - hosts deleted in X.XXXs
[2019-07-10 15:54:53] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting XXX services from old retention [XXXX by XXXX]
[2019-07-10 15:54:53] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - XXX  - services deleted in X.XXXs
[2019-07-10 15:54:53] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting XXXX entries = X.XXXs


[2019-07-10 14:35:13] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[2019-07-10 14:35:16] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] There is no data to delete
[2019-07-10 14:35:16] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting 0 entries = X.XXXs
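
The first DELETE OLD RETENTION line gives a cutoff date and a number of days. The sketch below shows how such a cutoff could be derived, assuming it is simply the current UTC time minus the number of days kept. That assumption, and the names `retention_cutoff` and `format_cutoff`, are illustrative and not taken from the module.

```python
from datetime import datetime, timedelta


def retention_cutoff(now_utc, keep_days):
    """Return the UTC date before which retention data would be deleted.

    Assumption: the cutoff is 'now' minus the number of days kept.
    """
    return now_utc - timedelta(days=keep_days)


def format_cutoff(cutoff, keep_days):
    """Format the cutoff like the first DELETE OLD RETENTION log line."""
    return ("We will delete all retention data that were saved before the %s UTC (%d days)"
            % (cutoff.strftime("%Y-%m-%d %H:%M"), keep_days))
```

Calling `retention_cutoff(datetime(2019, 7, 10, 15, 54), 7)` gives 3 July 2019 at 15:54, which `format_cutoff` renders in the same date format as the log line.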


Errors while deleting old data or while saving

If an error is encountered during deletion, it is reported in the logs, as in these examples:

[2019-07-10 16:19:10] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have been disconnected of mongo. Will retry [1/3]
[2019-07-10 16:19:10] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have been disconnected of mongo. Will retry [2/3]
[2019-07-10 16:19:10] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have been disconnected of mongo. Will retry [3/3]
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] After 3 tries, we couldn't connect to mongo


[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have an error:[ERROR MESSAGE]
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] (Traceback stack)
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] ...