Description
The Scheduler's MongoDB retention logs are grouped by category so that the different types of log can be told apart:
- Module management
- Database connection
- Save
- Load
- Deletion of obsolete retention entries
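The log lines on this page share a common layout: a timestamp, a level, then bracketed fields for the scheduler name, the emitting component and, in most cases, a section. As a minimal sketch, assuming that layout holds for every line, a parser could look like the following. The name `parse_retention_log` and the field names are hypothetical and are not part of the module.

```python
import re

# Assumed layout, inferred from the examples on this page:
# [timestamp] LEVEL : [ scheduler ] [ component ] [ section ] message
# The section field is optional (MODULES-MANAGER lines do not have one).
LOG_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z]+)\s*:\s*"
    r"\[\s*(?P<scheduler>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<component>[^\]]+?)\s*\]\s*"
    r"(?:\[\s*(?P<section>[^\]]+?)\s*\]\s*)?"
    r"(?P<message>.*)$"
)


def parse_retention_log(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None
```

Filtering on the `section` field then separates, for example, `LOAD RETENTION` lines from `DELETE OLD RETENTION` lines.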
Module management
On receiving the SIGUSR1 signal, the module dumps its memory; on any other signal, it shuts down.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MANAGE SIGNAL ] The worker with the pid XXXX received a signal XX |
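The rule above amounts to a small dispatch on the signal number. The sketch below is illustrative only: `action_for_signal`, `handle_signal` and the returned labels are hypothetical, and the module's actual handler is not shown on this page.

```python
import signal


def action_for_signal(signum):
    """Map a received signal to the behaviour described above."""
    if signum == signal.SIGUSR1:
        return "dump_memory"
    return "shutdown"


def handle_signal(signum, frame):
    # Hypothetical handler: a real one would perform the dump or the shutdown here.
    print("received signal %d -> %s" % (signum, action_for_signal(signum)))


# Registration, shown for two signals only (SIGUSR1 does not exist on Windows):
# signal.signal(signal.SIGUSR1, handle_signal)
# signal.signal(signal.SIGTERM, handle_signal)
```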
Critical stop
When the master process stops unexpectedly:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER:XXXXX ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit. |
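A worker can notice that its master is gone by probing the master's pid. One common POSIX technique, shown here as a sketch and not necessarily what the module does, is to send signal 0 with `os.kill`, which checks that the process exists without delivering anything. `process_is_alive` is a hypothetical helper name.

```python
import os


def process_is_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

This check cannot tell a reused pid from the original process, so it is only a hint that the master is still running.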
Memory dump request
The dump succeeded
Python 2.6
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx |
Python 2.7
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support) |
The dump failed
Python 2.6
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] MEMORY DUMP: FAIL check if guppy lib is installed |
Python 2.7
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed |
Database connection
In the logs below, the keyword SOUS-SECTION stands for one of the following values:
- LOAD RETENTION
- DELETE OLD RETENTION
- SAVE WORKER XXXXX
Normal connection
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s
|
The connection fails
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying |
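The sequence above (a warning for each failed attempt, then an error on the last one) matches a bounded retry loop. The following is a minimal sketch of such a loop, not the module's code: `connect_with_retries` is a hypothetical name, `connect` is any callable supplied by the caller, and the delay between attempts is left out because this page does not document it.

```python
def connect_with_retries(connect, max_tries):
    """Call connect() up to max_tries times; return (connection, log_lines)."""
    logs = []
    for attempt in range(1, max_tries + 1):
        try:
            return connect(), logs
        except Exception:
            unit = "time" if attempt == 1 else "times"
            if attempt < max_tries:
                logs.append("WARNING: Mongo connection failed %d/%d %s, we will try again"
                            % (attempt, max_tries, unit))
            else:
                logs.append("ERROR : Mongo connection failed %d/%d %s, we stop trying"
                            % (attempt, max_tries, unit))
    # Every attempt failed.
    return None, logs
```

With `max_tries=3` and a `connect` that always raises, the returned list holds two warnings followed by one error, as in the log excerpt.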
The connection was lost or does not exist
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection |
followed by the logs of a normal connection.
The connection could not be established
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection |
Module configuration error
If several mongo URLs are specified:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later |
Saving the retention
Three log sections exist for saving the retention:
| Section | Description |
|---|---|
| SAVE GLOBAL | The overall save process. |
| SAVE WORKERS | A sub-process of SAVE GLOBAL that manages the queue of the save workers. |
| SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's data to the database. The number of workers is configurable in the module parameters (see the page Module MongodbRetention ( Rétention en base de donnée centralisée par royaume )). |
SAVE GLOBAL
The SAVE GLOBAL logs give information about the overall operation of the module or about its configuration.
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] SUCCESS Retention data was saved into mongodb. Total time X.XXs |
Errors
Errors raised while saving the retention are also written to the logs, in this form:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later
|
Examples
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb. Total time 22.20s. I disable it and set it to restart it later |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later |
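Both the success line and the error lines of SAVE GLOBAL carry a `Total time` value, which makes the save duration easy to track over time. A minimal sketch, assuming the value is always a decimal number followed by `s` as in the examples above; `total_time_seconds` is a hypothetical helper name.

```python
import re

# Matches e.g. "Total time 22.20s" and captures the number.
TOTAL_TIME_RE = re.compile(r"Total time (\d+(?:\.\d+)?)s")


def total_time_seconds(line):
    """Return the 'Total time' value of a log line as a float, or None if absent."""
    match = TOTAL_TIME_RE.search(line)
    return float(match.group(1)) if match else None
```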
SAVE WORKERS
The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Starting worker X with pid XXXXX. Try: X/X
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] The worker X did SUCCESS (after X try) |
Preparing the data to save took a long time:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration |
Errors prevented the save from completing properly:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] some workers did fail to exit or encountered an error. The retention save can be incomplete
|
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Too many tries failed |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the XXXXX worker process as there is not enough memory |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers |
SAVE WORKER X
The SAVE WORKER X logs give, for the worker with identifier X, statistics on the saves it performed: number of elements, result, and execution time.
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Retention data saved into mongodb in X.XXX seconds |
Errors
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it. |
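The warning above describes a worker that is restarted when it does not exit within a time limit, up to a maximum number of tries. The sketch below illustrates that pattern with an external command run through `subprocess`; this is an assumption made for the sake of a runnable example, since the module manages its own worker processes. `run_worker_with_restarts` is a hypothetical name.

```python
import subprocess
import sys


def run_worker_with_restarts(command, timeout_s, max_tries):
    """Run command, restarting it when it does not exit on time.

    Return the number of the try that succeeded, or None if every try failed.
    """
    for attempt in range(1, max_tries + 1):
        try:
            # subprocess.run kills the child and waits for it before
            # re-raising TimeoutExpired.
            result = subprocess.run(command, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print("worker (try:%d) did not exit on time (%s s), restarting it"
                  % (attempt, timeout_s))
            continue
        if result.returncode == 0:
            return attempt
        # A non-zero exit code also counts as a failed try.
    return None
```

For example, `run_worker_with_restarts([sys.executable, "-c", "pass"], 30, 3)` runs a trivial Python child process that exits immediately.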
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Failed connection with the following message : ERROR MESSAGE |
Loss of connection to the database
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [1/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [Y/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [X/X]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] After X tries, worker could not connect to mongo :[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] Worker has an error: [ ERROR MESSAGE ]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
Loading the retention
The logs provide information about the retention load, making it possible to follow its progress and the state of the connection to Mongo.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X hosts/clusters from the retention [ in scheduler hosts/clusters : without retention=X / total=1 ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X hosts retention data).
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] [ X.XXXs ] We took X checks from the retention [ in scheduler checks : without retention=XX / total=XX ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] No checks are needed for retention load (scheduler-master already have all X checks retention data).
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs] Total number of elements load from mongo database: X ( scheduler have a total of XX elements )
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully. |
Errors
Errors raised while loading the retention are also written to the logs, in this form:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS
|
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying host entries: ERROR MESSAGE. Module exiting. |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying checks entries: ERROR MESSAGE. Module exiting. |
Deleting old retention data
The deletion logs show the number of deleted objects (broken down into hosts and checks) and the date from which retention data is kept.
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - Deleting XXX hosts from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX - hosts deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - Deleting XXX services from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX - services deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting XXXX entries = X.XXXs
|
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] There is no data to delete
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting 0 entries = X.XXXs
|
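The first deletion log states a cutoff date and a retention period in days, and the `[XXXX by XXXX]` suffix suggests that entries are removed in batches. The sketch below illustrates both ideas under those assumptions; `retention_cutoff`, `format_cutoff` and `batches` are hypothetical names, and the module may compute these values differently.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now, keep_days):
    """Return the date before which saved retention data is considered old."""
    return now - timedelta(days=keep_days)


def format_cutoff(cutoff):
    """Format the cutoff the way the first log line displays it."""
    return cutoff.strftime("%Y-%m-%d %H:%M UTC")


def batches(items, batch_size):
    """Yield successive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Usage with a fixed reference time, to keep the example reproducible.
now = datetime(2019, 7, 10, 14, 34, tzinfo=timezone.utc)
cutoff = retention_cutoff(now, 7)
```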
Error: loss of connection to the database
If an error occurs during a database operation, the following logs appear:
| Code Block | ||
|---|---|---|
| ||
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [1/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [2/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [3/3]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] After 3 tries, we couldn't connect to mongo
|
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have an error:[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] "Exception Python" |