Description

The Scheduler's MongoDB retention logs are grouped by category so that the different types of log can be told apart:

  • Saving
  • Loading
  • Deletion of obsolete retention entries
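
As an illustration of how these categories can be used when reading a log file, here is a minimal Python sketch that splits a line into its fields and groups it by category. The regular expression is inferred only from the sample lines shown on this page, and the names `parse_line`, `category` and `count_by_category` are hypothetical helpers, not part of the product. Lines that do not follow the three-bracket layout (for example the MODULES-MANAGER lines) are skipped.

```python
import re
from collections import Counter

# Layout inferred from the samples on this page (an assumption):
# [timestamp] LEVEL : [ scheduler ] [ module ] [ section ] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s*:\s*"
    r"\[\s*(?P<scheduler>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<module>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<section>[^\]]+?)\s*\]\s*"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of a log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def category(section):
    """Map a section name (e.g. 'SAVE WORKER 0') to one of the three categories."""
    if section.startswith("SAVE"):
        return "save"
    if section.startswith("LOAD"):
        return "load"
    if section.startswith("DELETE"):
        return "delete"
    return "other"


def count_by_category(lines):
    """Count the parsable lines per category."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[category(record["section"])] += 1
    return counts
```

For example, `parse_line` applied to a SAVE WORKER 0 line yields the section `SAVE WORKER 0`, which `category` maps to `save`.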

Module management


Code Block
[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] The worker with the pid XXXX received a signal XX



Critical stop


Code Block
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit.


Memory dump request

The dump succeeded

Python 2.6


Code Block
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx


Python 2.7


Code Block
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support)


The dump failed

Python 2.6


Code Block
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] MEMORY DUMP: FAIL check if guppy lib is installed


Python 2.7


Code Block
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed


Database connection

Normal connection


Code Block
[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s


The connection fails


Code Block
[2021-04-21 10:24:49] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[2021-04-21 10:24:49] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[2021-04-21 10:24:49] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying
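
The sequence of messages above suggests a bounded retry loop. The following Python sketch shows one way such a loop could produce that sequence; it is an illustration only, not the module's code. The function `connect_with_retries`, its `connect` callable and the `max_tries` parameter are hypothetical names, and the exact wording emitted by the real module on the last attempt may differ.

```python
import logging

logger = logging.getLogger("MongodbRetention")


def connect_with_retries(connect, max_tries):
    """Call connect() up to max_tries times; return its result, or None on failure.

    A WARNING is logged after each failed attempt that will be retried,
    and an ERROR after the last one.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return connect()
        except Exception:
            unit = "time" if attempt == 1 else "times"
            if attempt < max_tries:
                logger.warning(
                    "Mongo connection failed %d/%d %s, we will try again",
                    attempt, max_tries, unit,
                )
            else:
                logger.error(
                    "Mongo connection failed %d/%d %s, we stop trying",
                    attempt, max_tries, unit,
                )
    return None
```

Returning `None` instead of raising keeps the caller in charge of what happens once every attempt has failed.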


The connection was lost or does not exist


Code Block
[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection

This line is followed by the normal connection logs.

The connection could not be established


Code Block
[2021-04-21 10:24:49] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection


Module configuration error

If several mongo URLs are specified:

Code Block
[2021-04-20 13:52:26] ERROR  : [ scheduler-master ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later


Saving the retention

For saving the retention, three types of logs exist:

Section | Description
SAVE GLOBAL | The overall save process.
SAVE WORKERS | A sub-process of SAVE GLOBAL that manages the queue of the save workers.
SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's information to the database. The number of workers can be set in the module parameters (see Rétention en base de donnée centralisée par royaume ( Module MongodbRetention )).
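
To make the idea of "each worker saves part of the information" concrete, here is a small Python sketch that splits a list of elements among a given number of workers. The round-robin strategy and the name `split_among_workers` are assumptions made for illustration; this page does not describe how the module actually distributes elements.

```python
def split_among_workers(items, worker_count):
    """Split items into worker_count lists, round-robin (illustrative strategy).

    Worker X receives items X, X + worker_count, X + 2 * worker_count, ...
    """
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    return [items[index::worker_count] for index in range(worker_count)]
```

With 7 elements and 3 workers, worker 0 receives 3 elements and workers 1 and 2 receive 2 each.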


SAVE GLOBAL

SAVE GLOBAL logs give information about the overall operation of the module or about its configuration.

Code Block (Example)
[2019-07-10 14:34:39] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL            ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[2019-07-10 14:34:39] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL      ] SUCCESS Retention data was saved into mongodb. Total time X.XXs


Errors

Errors that occur while saving the retention are also recorded in the logs, in this form:

Code Block
[2021-04-20 11:26:57] ERROR  : [ scheduler-master ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL      ] FAILED Retention data could not be saved in mongodb. Total time 22.20s. I disable it and set it to restart it later

When the mongo database is unreachable while the retention is being saved:

Code Block
[2021-04-20 11:26:57] ERROR  : [ scheduler-master ] [ MODULES-MANAGER  ] The instance MongodbRetention raised an error: [ SAVE GLOBAL      ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later


SAVE WORKERS

SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.

Code Block (Example)
[2019-07-10 14:34:44] INFO   : [scheduler-master] [ MongodbRetention ] [ SAVE WORKERS                ] Starting worker 0 with pid 14746. Try: 1/3
[2019-07-10 14:34:54] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] The worker X did SUCCESS (after X try)


Code Block
[2019-07-10 14:34:44] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration


Code Block
[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] some workers did fail to exit or encountered an error. The retention save can be incomplete.
[2019-07-10 14:34:44] ERROR  : Too many tries failed
[2019-07-10 14:34:44] ERROR  : Cannot start the XXXXX worker process as there is not enough memory
[2019-07-10 14:34:44] ERROR  : [SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS                ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers


SAVE WORKER X

For the worker with identifier X, SAVE WORKER X logs give statistics on the saves it has performed: the number of elements, the result and the execution time.

Code Block (Example)
[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0          ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[2019-07-10 14:34:44] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0          ] Retention data saved into mongodb in X.XXX seconds
[2019-07-10 14:34:44] INFO   : [scheduler-master] [ MongodbRetention ] [ SAVE WORKER 0          ] Will save 249 hosts and 249 checks
[2019-07-10 14:34:54] INFO   : [scheduler-master] [ MongodbRetention ] [ SAVE WORKER 0          ] SUCCESS did saved 249 hosts and 249 checks retention data into mongodb in 10.46s


Errors


Code Block
[2019-07-10 14:34:44] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X          ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it.


If an error occurs, each worker tries to start again, up to the maximum number of attempts defined in the module's configuration file. If the retention has still not been saved after these attempts, the module fails and the Scheduler stops.
When the SSH tunnel is enabled:

Code Block (Example of errors)
[2021-04-20 11:43:09] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0    ] Mongo connection failed 1/1 time, we will try again
[2021-04-20 11:43:10] ERROR  : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0    ] Mongo connection failed 1/1 times, we stop trying
[2021-04-20 13:24:20] ERROR  : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0    ] Worker has an error: [ Mongo connection failure : localhost:34925 ===(ssh tunnel)===> 192.168.1.132:22 ===(mongodb)===> 192.168.1.132:27017 ]
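
When the tunnel is enabled, the last message above describes the path of the connection hop by hop. The Python sketch below extracts those hops from such a message, which can help identify whether the SSH endpoint or the MongoDB endpoint is the one to check. The pattern is inferred from this single sample line, and `tunnel_hops` is a hypothetical helper name.

```python
import re

# Matches fragments such as "===(ssh tunnel)===> 192.168.1.132:22"
# (format inferred from the sample message, an assumption).
HOP = re.compile(r"===\((?P<kind>[^)]+)\)===>\s*(?P<target>\S+)")


def tunnel_hops(message):
    """Return the (kind, target) pairs found in a connection failure message."""
    return [(m.group("kind"), m.group("target")) for m in HOP.finditer(message)]
```

A message without the `===(...)===>` markers, such as the one logged without the SSH tunnel, yields an empty list.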


Without the SSH tunnel:

Code Block (Example of errors)
[2021-04-20 11:43:09] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0    ] Mongo connection failed 1/1 time, we will try again
[2021-04-20 11:43:10] ERROR  : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0    ] Mongo connection failed 1/1 times, we stop trying
[2021-04-20 12:06:15] ERROR  : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0    ] Worker has an error: [ Mongo connection failure to mongodb://192.168.1.132/?safe=false ]


Code Block
[2021-04-20 12:06:15] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0    ] Failed connection with the following message : ERROR MESSAGE


Loading the retention

The logs provide information about the loading of the retention, making it possible to follow its progress and the state of the connection to Mongo.


Code Block
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X    hosts/clusters  from the retention [ in scheduler hosts/clusters : without retention=X    / total=1    ]
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X    hosts retention data).
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] [ X.XXXs ] We took X    checks          from the retention [ in scheduler checks         : without retention=XX   / total=XX   ]
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS           ] No checks are needed for retention load (scheduler already have all X    checks retention data).
[2019-07-10 16:19:10] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] Total number of elements load from mongo database: X ( scheduler have a total of XX   elements )
[2019-07-10 14:35:37] INFO   : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully.


Errors

Errors that occur while loading the retention are also recorded in the logs, in this form:


Code Block
[2019-07-10 16:19:10] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION   ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS
[2019-07-10 16:19:10] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION   ] error querying host entries: ERROR MESSAGE. Module exiting.
[2019-07-10 16:19:10] ERROR  : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION   ] error querying checks entries: ERROR MESSAGE. Module exiting.


Deletion of old retention data

The deletion logs show the number of objects deleted (grouped by hosts and checks) as well as the date from which retention data is kept.

Code Block (Example with objects to delete)
[2019-07-10 15:54:53] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the 2019-07-07 13:54 UTC (3 days)
[2019-07-10 15:54:53] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting 994 hosts from old retention [1000 by 1000]
[2019-07-10 15:54:53] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - 994  - hosts deleted in 0.188s
[2019-07-10 15:54:53] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - Deleting 994 services from old retention [1000 by 1000]
[2019-07-10 15:54:53] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ]  - 994  - services deleted in 0.091s
[2019-07-10 15:54:53] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting 1988 entries = 0.280s


Code Block (Example without objects to delete)
[2019-07-10 14:35:13] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ] We will delete all retention data that were saved before the 2019-07-07 12:35 UTC (3 days)
[2019-07-10 14:35:16] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ] There is no data to delete
[2019-07-10 14:35:16] INFO   : [scheduler-master] [ MongodbRetention ] [ DELETE OLD RETENTION   ] Total time for deleting 0 entries = 0.019s
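
The deletion logs mention a cutoff date ("saved before the ... UTC (3 days)") and a batch size ("[1000 by 1000]"). The Python sketch below shows how such a cutoff and such batches could be computed. It is an illustration under assumptions: `retention_cutoff` and `batches` are hypothetical helpers, and reading "[1000 by 1000]" as a batch size of 1000 is an interpretation of the log text, not a documented behaviour.

```python
from datetime import datetime, timedelta


def retention_cutoff(now_utc, keep_days):
    """Return the UTC datetime before which retention data is considered old."""
    return now_utc - timedelta(days=keep_days)


def batches(items, size=1000):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For instance, with a current UTC time of 2019-07-10 13:54 and 3 days of retention, the cutoff is 2019-07-07 13:54 UTC, which matches the date shown in the first example.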


Errors while deleting old data or while saving

If an error occurs during deletion, it is reported in the logs, as in these examples:

Code Block (Example)
[2019-07-10 16:19:10] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have been disconnected of mongo. Will retry [1/3]
[2019-07-10 16:19:10] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have been disconnected of mongo. Will retry [2/3]
[2019-07-10 16:19:10] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have been disconnected of mongo. Will retry [3/3]
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] After 3 tries, we couldn't connect to mongo


Code Block
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] We have an error:[ERROR MESSAGE]
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] (Traceback stack)
[2019-07-10 16:19:10] ERROR  : [SCHEDULERNAME] [ MongodbRetention ] [ SOUS-SECTION ] ...