The Scheduler's MongoDB retention logs are grouped by category so that the different log types can be told apart:
- Module management
- Database connection
- Save
- Load
- Deletion of obsolete retention entries
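Every sample on this page follows the same shape: a bracketed timestamp, a level, then a series of bracketed fields (scheduler name, module, section) before the message. A minimal sketch of splitting such a line, assuming that shape holds; `parse_log_line` and the regular expressions are illustrative names and are not part of the module:

```python
import re

# Leading "[timestamp] LEVEL :" part, as it appears in the log samples on this page.
HEADER = re.compile(r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z]+)\s*:\s*(?P<rest>.*)$")
# One bracketed field such as "[ SCHEDULERNAME ]" at the start of the remaining text.
TAG = re.compile(r"\[\s*([^\]]*?)\s*\]\s*")


def parse_log_line(line):
    """Split one log line into timestamp, level, leading bracketed tags and message.

    Returns None when the line does not start with a "[timestamp] LEVEL :" header.
    """
    header = HEADER.match(line.strip())
    if header is None:
        return None
    rest = header.group("rest")
    tags = []
    while True:
        tag = TAG.match(rest)
        if tag is None:
            break
        tags.append(tag.group(1))
        rest = rest[tag.end():]
    return {
        "timestamp": header.group("timestamp"),
        "level": header.group("level"),
        "tags": tags,  # e.g. scheduler name, module name, section
        "message": rest,
    }
```

The third tag is usually the category (for example `SAVE WORKER 0` or `LOAD RETENTION`), which makes it easy to filter a log file by category.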
Module management
On receiving the SIGUSR1 signal, the module dumps its memory; on any other signal, the module shuts down.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MANAGE SIGNAL ] The worker with the pid XXXX received a signal XX |
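The signal behaviour described above can be sketched as follows. This is an illustration only: `action_for_signal` and `request_memory_dump` are hypothetical helper names, not functions of the module, and the worker pid is assumed to be the one printed in the logs.

```python
import os
import signal


def action_for_signal(signum):
    """Return the behaviour described on this page for a received signal."""
    # SIGUSR1 triggers a memory dump; any other signal stops the module.
    return "memory dump" if signum == signal.SIGUSR1 else "shutdown"


def request_memory_dump(worker_pid):
    """Send SIGUSR1 to a worker process (Unix only).

    worker_pid is assumed to be the pid shown in the worker's log lines.
    """
    os.kill(worker_pid, signal.SIGUSR1)
```

Calling `request_memory_dump` with the pid taken from a `[ WORKER pid=XXXX ]` log line should lead to the memory dump logs listed further down this section.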
When the master process stops unexpectedly:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER:XXXXX ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit. |
Memory dump request
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support) |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] MEMORY DUMP: FAIL check if guppy lib is installed |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed |
Database connection
In the following logs, the SOUS-SECTION (sub-section) keyword can take one of the following values:
- LOAD RETENTION
- DELETE OLD RETENTION
- SAVE WORKER XXXXX
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s
|
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying |
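When monitoring these connection logs, it can help to extract the attempt counter and detect the point where the module gives up. A minimal sketch based only on the message wording shown above; `parse_connection_failure` is a hypothetical helper name:

```python
import re

# Matches "Mongo connection failed 2/3 times, we will try again" and the
# final "... we stop trying" variant shown in the samples above.
FAILURE = re.compile(
    r"Mongo connection failed (\d+)/(\d+) times?, (we will try again|we stop trying)"
)


def parse_connection_failure(message):
    """Return attempt counters for a connection failure message, or None."""
    match = FAILURE.search(message)
    if match is None:
        return None
    return {
        "attempt": int(match.group(1)),
        "max_attempts": int(match.group(2)),
        # True only for the ERROR line that ends the retry sequence.
        "gave_up": match.group(3) == "we stop trying",
    }
```

An alert would typically be raised only when `gave_up` is true, since the WARNING lines are followed by another attempt.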
The connection was lost or does not exist
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection |
followed by the normal connection logs.
The connection could not be established
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection |
Module configuration error
If several Mongo URLs are specified
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later |
For the retention save, three types of logs exist:
| Section | Description |
|---|
| SAVE GLOBAL | The overall save process |
| SAVE WORKERS | A sub-process of SAVE GLOBAL that manages the queue of save workers |
| SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's data to the database. The number of workers can be set in the module parameters (see the page Module MongodbRetention ( Rétention en base de donnée centralisée par royaume )) |
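The three section names in the table can be told apart mechanically. A small sketch, assuming the section text is exactly what appears between the brackets in the logs; `classify_save_section` is a hypothetical helper name:

```python
import re

# "SAVE WORKER 3" -> worker number 3
WORKER = re.compile(r"^SAVE WORKER (\d+)$")


def classify_save_section(section):
    """Map a save section name to (kind, worker_id).

    kind is "global", "workers", "worker", or None for any other section.
    """
    section = " ".join(section.split())  # normalise inner and outer whitespace
    if section == "SAVE GLOBAL":
        return ("global", None)
    if section == "SAVE WORKERS":
        return ("workers", None)
    match = WORKER.match(section)
    if match:
        return ("worker", int(match.group(1)))
    return (None, None)
```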
The SAVE GLOBAL logs give information about the overall operation of the module or its configuration.
| Code Block |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] SUCCESS Retention data was saved into mongodb. Total time X.XXs |
Errors during the retention save are also logged, in this form:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later
|
| Code Block |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb. Total time 22.20s. I disable it and set it to restart it later |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later |
The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.
| Code Block |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Starting worker X with pid XXXXX. Try: X/X
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] The worker X did SUCCESS (after X try) |
Preparing the data to save took a long time:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration |
Errors prevent the save from completing correctly:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] some workers did fail to exit or encountered an error. The retention save can be incomplete
|
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Too many tries failed |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the XXXXX worker process as there is not enough memory |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers |
The SAVE WORKER X logs give, for the worker with identifier X, statistics on the saves it performed: number of elements, result, and execution time.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Retention data saved into mongodb in X.XXX seconds |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it. |
During a retention save, the Mongo database is unreachable:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Failed connection with the following message: |
Loss of connection to the database
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [1/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [Y/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [X/X]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] After X tries, worker could not connect to mongo :[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] Worker has an error: [ ERROR MESSAGE ]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
If an error occurs, each worker tries to start again, up to the maximum number of attempts defined in the module's configuration file. If the retention is still not saved after these attempts, the module fails and the Scheduler stops.
When the SSH tunnel is enabled:
| Code Block |
|---|
[2021-04-20 11:43:09] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 time, we will try again
[2021-04-20 11:43:10] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 times, we stop trying
[2021-04-20 13:24:20] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Worker has an error: [ Mongo connection failure : localhost:34925 ===(ssh tunnel)===> 192.168.1.132:22 ===(mongodb)===> 192.168.1.132:27017 ] |
Without the SSH tunnel:
| Code Block |
|---|
[2021-04-20 11:43:09] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 time, we will try again
[2021-04-20 11:43:10] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 times, we stop trying
[2021-04-20 12:06:15] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Worker has an error: [ Mongo connection failure to mongodb://192.168.1.132/?safe=false ] |
Retention loading
The logs provide information about the retention load, making it possible to follow its progress and the state of the connection to Mongo.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X hosts/clusters from the retention [ in scheduler hosts/clusters : without retention=X / total=1 ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X hosts retention data).
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] [ X.XXXs ] We took X checks from the retention [ in scheduler checks : without retention=XX / total=XX ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] No checks are needed for retention load (scheduler already have all X checks retention data).
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] Total number of elements load from mongo database: X ( scheduler have a total of XX elements )
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully. |
Errors during retention loading are also logged, in this form:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying host entries: ERROR MESSAGE. Module exiting. |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying checks entries: ERROR MESSAGE. Module exiting. |
Deletion of old retention data
The deletion logs show the number of objects deleted (broken down by hosts and checks) and the date from which retention data is kept.
| Code Block |
|---|
| title | Example with objects to delete |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - Deleting XXX hosts from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX - hosts deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - Deleting XXX services from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX - services deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting XXXX entries = X.XXXs
|
| Code Block |
|---|
| title | Example without objects to delete |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] There is no data to delete
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting 0 entries = X.XXXs
|
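The cutoff printed in the first deletion log line reads as "current UTC time minus X days". The sketch below works under that assumption, which is inferred from the log wording and not confirmed by the module's code; `retention_cutoff` and `describe_cutoff` are hypothetical names:

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now_utc, keep_days):
    """Return the instant before which retention data would be deleted.

    Assumption: the cutoff is simply now_utc minus keep_days days.
    """
    return now_utc - timedelta(days=keep_days)


def describe_cutoff(now_utc, keep_days):
    """Format the cutoff the way the deletion log line presents it."""
    cutoff = retention_cutoff(now_utc, keep_days)
    return "%s UTC (%d days)" % (cutoff.strftime("%Y-%m-%d %H:%M"), keep_days)


# Example values chosen for illustration only.
example_now = datetime(2019, 7, 10, 13, 54, tzinfo=timezone.utc)
print(describe_cutoff(example_now, 3))
```

Comparing this computed value with the date in the log is a quick way to check that the configured retention duration is the one actually applied.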
Error: loss of connection to the database
If an error occurs during a database operation, the following logs appear:
| Code Block |
|---|
|
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [1/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [2/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [3/3]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] After 3 tries, we couldn't connect to mongo.
|
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have an error:[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] "Exception Python"
|