Description

The Scheduler's MongoDB retention logs are grouped by category so that the different types of log can be told apart:

  • Module management
  • Database connection
  • Save
  • Load
  • Deletion of obsolete retention entries
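
Every sample line on this page appears to share one layout: a timestamp, a level, the scheduler name, the module name, a section, and then the message. As a minimal sketch, assuming that layout holds for the lines you want to inspect, a regular expression can split a line into those fields. The pattern and the `parse_log_line` helper are illustrative names inferred from the samples, not part of the module.

```python
import re

# Hypothetical pattern inferred from the sample lines on this page:
# [timestamp] LEVEL : [ scheduler ] [ module ] [ section ] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s*:\s*"
    r"\[\s*(?P<scheduler>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<module>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<section>[^\]]+?)\s*\]\s*"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None
```

Grouping parsed lines by their `section` field then gives one bucket per category listed above. Continuation lines, such as the body of a memory dump, do not match and return `None`.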

Module management

On receiving the SIGUSR1 signal, the module dumps its memory; on any other signal, the module shuts down.

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ MANAGE SIGNAL ] The worker with the pid XXXX received a signal XX
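
The rule above (SIGUSR1 triggers a memory dump, any other signal stops the module) can be sketched as a small dispatch function. This is an illustration of the described behaviour only, not the module's code: `action_for_signal` and the two action names are hypothetical, and `signal.SIGUSR1` is assumed to be available (POSIX systems).

```python
import signal


def action_for_signal(signum):
    """Map a received signal number to the behaviour described above."""
    if signum == signal.SIGUSR1:
        # SIGUSR1 is the only signal that asks for a memory dump.
        return "dump_memory"
    # Any other signal makes the module shut down.
    return "shutdown"
```

A real handler registered with `signal.signal` would call the dump or shutdown routine instead of returning a string.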

Critical stop

When the master (controlling) process stops unexpectedly:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER:XXXXX ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit.

Memory dump request

The dump succeeded

Python 2.6
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx

Python 2.7
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support)

The dump failed

Python 2.6
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if guppy lib is installed

Python 2.7
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed

Database connection

In the logs below, the SOUS-SECTION placeholder (the sub-section) can take one of the following values:

  • LOAD RETENTION
  • DELETE OLD RETENTION
  • SAVE WORKER XXXXX

Normal connection

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s

The connection fails

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying
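
The failure logs above suggest a bounded retry loop: a warning after each failed attempt, then an error once the last attempt has failed. A minimal sketch of such a loop follows. It is not the module's implementation: `connect_with_retries`, its `connect` and `log` callables, and catching a bare `Exception` are assumptions made for illustration, and a real client would catch its driver's connection errors instead.

```python
def connect_with_retries(connect, max_tries, log):
    """Call connect() up to max_tries times; return its result, or None if all fail."""
    for attempt in range(1, max_tries + 1):
        try:
            return connect()
        except Exception:  # assumption: any exception counts as a failed attempt
            if attempt < max_tries:
                unit = "time" if attempt == 1 else "times"
                log("WARNING", "Mongo connection failed %d/%d %s, we will try again"
                    % (attempt, max_tries, unit))
            else:
                log("ERROR", "Mongo connection failed %d/%d times, we stop trying"
                    % (attempt, max_tries))
    return None
```

The `log` callable receives a level and a message, so the sketch can be wired to any logger without changing the loop.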

The connection was lost or does not exist

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection

This line is followed by the normal-connection logs.

The connection could not be established

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection

Module configuration error

If several mongo URLs are specified:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later

Saving to retention

Three types of logs exist for the retention save:

Section | Description
SAVE GLOBAL | The overall save process.
SAVE WORKERS | A sub-process of SAVE GLOBAL that manages the queue of the save workers.
SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's information to the database. The number of workers can be set in the module's parameters (see the page Rétention en base de donnée centralisée par royaume ( Module MongodbRetention )).
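
Since each SAVE WORKER X saves only part of the Scheduler's information, the elements have to be divided among the configured number of workers. This page does not say how the module divides them, so the round-robin split below is purely an assumption chosen for illustration, and `split_among_workers` is a hypothetical helper.

```python
def split_among_workers(items, worker_count):
    """Distribute items across worker_count lists, round-robin (assumed strategy)."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Worker i receives items i, i + worker_count, i + 2 * worker_count, ...
    return [items[index::worker_count] for index in range(worker_count)]
```

With this strategy, the shares differ in size by at most one element, whatever the number of workers.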

SAVE GLOBAL

The SAVE GLOBAL logs give information about the overall operation of the module or about its configuration.

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL            ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL      ] SUCCESS Retention data was saved into mongodb. Total time X.XXs

Errors

Errors raised while saving the retention, for example when the mongo database is unreachable, are logged in this form:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL      ] FAILED Retention data could not be saved in mongodb because ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later

Examples

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb. Total time 22.20s. I disable it and set it to restart it later
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later

SAVE WORKERS

The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Starting worker X with pid XXXXX. Try: X/X
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] The worker X did SUCCESS (after X try)

Preparing the data to save took a long time:

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration

Errors prevent the save from completing properly:

If an error occurs, each worker tries to start again, up to the maximum number of attempts defined in the module's configuration file. If the retention is still not saved after these attempts, the module fails and the Scheduler stops.

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] some workers did fail to exit or encountered an error. The retention save can be incomplete
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Too many tries failed
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the XXXXX worker process as there is not enough memory
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers

SAVE WORKER X

For the worker with identifier X, the SAVE WORKER X logs give statistics on the saves it performed: the number of elements, the result, and the execution time.

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ]  Retention data saved into mongodb in X.XXX seconds

Errors

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it.
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Failed connection with the following message : ERROR MESSAGE
Loss of connection to the database
Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [1/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [Y/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [X/X]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] After X tries, worker could not connect to mongo :[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON"
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] Worker has an error: [ ERROR MESSAGE ]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON"

Loading the retention

The logs provide information about the retention load, so that you can follow its progress and the state of the connection to Mongo.


Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X    hosts/clusters  from the retention [ in scheduler hosts/clusters : without retention=X    / total=1    ]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X    hosts retention data).
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] [ X.XXXs ] We took X    checks          from the retention [ in scheduler checks         : without retention=XX   / total=XX   ]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] No checks       are needed for retention load (scheduler already have all X    checks retention data).
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs]  Total number of elements load from mongo database: X    ( scheduler have a total of XX   elements )
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully.

Errors

Errors that occur while loading the retention are also recorded in the logs, in this form:

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying host entries: ERROR MESSAGE. Module exiting.

Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying checks entries: ERROR MESSAGE. Module exiting.

Deleting old retention data

The deletion logs show the number of objects deleted (split into hosts and checks) and the date from which retention data is kept.

Example with objects to delete:

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting XXX hosts from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - XXX  - hosts deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting XXX services from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - XXX  - services deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting XXXX entries = X.XXXs

Example without objects to delete:

Code Block
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] There is no data to delete
[YYYY-MM-DD HH:MM:SS] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting 0 entries = X.XXXs
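
The first deletion log line states a cutoff date in UTC and a number of days. A plausible reading, offered here only as an assumption, is that the cutoff is the current time minus the configured retention period. The sketch below computes and formats such a cutoff; `retention_cutoff` and `format_cutoff_message` are hypothetical helpers, and `now` is expected to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now, keep_days):
    """Return the UTC instant before which saved retention data would be deleted."""
    return now.astimezone(timezone.utc) - timedelta(days=keep_days)


def format_cutoff_message(now, keep_days):
    """Build a message shaped like the first DELETE OLD RETENTION log line."""
    cutoff = retention_cutoff(now, keep_days)
    return ("We will delete all retention data that were saved before the %s UTC (%d days)"
            % (cutoff.strftime("%Y-%m-%d %H:%M"), keep_days))
```

Converting to UTC before subtracting keeps the reported date independent of the server's local time zone.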

Error: loss of connection to the database

If an error occurs during a database operation, the following logs appear:

Code Block
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We have been disconnected of mongo. Will retry [1/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We have been disconnected of mongo. Will retry [2/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We have been disconnected of mongo. Will retry [3/3]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION   ] After 3 tries, we couldn't connect to mongo.
Code Block
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We have an error:[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION   ] "Exception Python"