Description
The Scheduler's MongoDB retention logs are grouped by category so that the different log types can be told apart:
- Module management
- Database connection
- Save
- Load
- Deletion of obsolete retention rows
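Every log line on this page follows the same bracketed layout, so it can be split into fields before being sorted by category. The sketch below is only an illustration based on the layouts shown on this page: the pattern `LOG_LINE` and the helper `parse_log_line` are hypothetical names, not part of the module.

```python
import re

# Layout assumed from the examples on this page:
# [YYYY-MM-DD HH:MM:SS] LEVEL : [ SCHEDULER ] [ MODULE ] [ SECTION ] message
# The section bracket is optional because MODULES-MANAGER lines omit it.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s*:\s*"
    r"\[\s*(?P<scheduler>[^\]]+?)\s*\]\s*"
    r"\[\s*(?P<module>[^\]]+?)\s*\]\s*"
    r"(?:\[\s*(?P<section>[^\]]+?)\s*\]\s*)?"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None
```

The `section` field can then be compared with the category names used on this page (for instance `LOAD RETENTION` or `DELETE OLD RETENTION`).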
Module management
On receiving the SIGUSR1 signal, the module dumps its memory; on any other signal, the module shuts down.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MANAGE SIGNAL ] The worker with the pid XXXX received a signal XX |
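The signal rule above (SIGUSR1 triggers a memory dump, anything else stops the module) can be pictured with a small dispatcher. This is a minimal sketch under stated assumptions, not the module's code: `action_for_signal`, `handle_signal` and `install_handlers` are hypothetical names, the handler only prints, and `SIGUSR1` is available on Unix-like systems only.

```python
import signal


def action_for_signal(signum):
    """Map a received signal number to the behaviour described above."""
    if signum == signal.SIGUSR1:
        return "dump_memory"
    return "shutdown"


def handle_signal(signum, frame):
    # Hypothetical handler: a real module would dump its memory or stop here.
    print("received signal %d -> %s" % (signum, action_for_signal(signum)))


def install_handlers():
    # SIGTERM and SIGINT stand in for "any other signal" in this sketch.
    for sig in (signal.SIGUSR1, signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, handle_signal)
```

Keeping the decision in a pure function (`action_for_signal`) makes the rule easy to check without actually sending signals to a process.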
Critical shutdown
When the controlling (master) process stops unexpectedly
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER:XXXXX ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit. |
Memory dump request
The dump succeeded
Python 2.6
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP (to be sent to the support): xxxxxxxx xxxxxxxx xxxxxxxx |
Python 2.7
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support) |
The dump failed
Python 2.6
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] MEMORY DUMP: FAIL check if guppy lib is installed |
Python 2.7
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed |
Database connection
In the logs below, the SOUS-SECTION placeholder (the sub-section) takes one of the following values:
- LOAD RETENTION
- DELETE OLD RETENTION
- SAVE WORKER XXXXX
Normal connection
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We are creating mongo connection [uri=mongodb://192.168.1.120/?safe=false] [database=shinken] [ssh=True]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Connection created in : 0.200s |
The connection fails
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed 1/X time, we will try again
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed Y/X times, we will try again
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Mongo connection failed X/X times, we stop trying |
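The sequence above (a warning after each failed attempt, then an error once the last attempt has failed) matches a bounded retry loop. The following sketch is an assumption about the general shape of such a loop, not the module's implementation: `connect_with_retries` and its `connect` argument are hypothetical, and only the message texts come from the logs shown above.

```python
import logging

logger = logging.getLogger("MongodbRetention")


def connect_with_retries(connect, max_tries):
    """Call connect() up to max_tries times; return its result, or None on failure."""
    for attempt in range(1, max_tries + 1):
        try:
            return connect()
        except Exception:
            unit = "time" if attempt == 1 else "times"
            if attempt < max_tries:
                # More attempts remain: warn and loop again.
                logger.warning("Mongo connection failed %d/%d %s, we will try again",
                               attempt, max_tries, unit)
            else:
                # Last attempt failed: report the error and give up.
                logger.error("Mongo connection failed %d/%d times, we stop trying",
                             attempt, max_tries)
    return None
```

A caller would pass any function that opens the connection and raises on failure, then treat a `None` result as the "Could not create mongo connection" case.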
The connection was lost or does not exist
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] We need to create a mongo connection |
This is followed by the normal connection logs.
The connection could not be established
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SOUS-SECTION ] Could not create mongo connection |
Module configuration error
If several Mongo URLs are specified
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later |
Saving the retention
Three types of logs exist for the retention save:
| Section | Description |
|---|---|
| SAVE GLOBAL | The overall save process |
| SAVE WORKERS | A sub-process of SAVE GLOBAL that manages the queue of the save workers |
| SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the Scheduler's information to the database. The number of workers is configurable in the module's parameters (see the page Rétention en base de donnée centralisée par royaume ( Module MongodbRetention )) |
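When reading a log file, the three section names in the table above can be told apart mechanically. The helper below is a hypothetical sketch (the name `classify_save_section` and the returned labels are assumptions made for illustration); it only relies on the section names listed in the table.

```python
import re

# "SAVE WORKER X", where X is the worker number.
WORKER_SECTION = re.compile(r"^SAVE WORKER (\d+)$")


def classify_save_section(section):
    """Return (kind, worker_id) for a save-related section name."""
    if section == "SAVE GLOBAL":
        return ("global", None)
    if section == "SAVE WORKERS":
        return ("workers", None)
    match = WORKER_SECTION.match(section)
    if match:
        return ("worker", int(match.group(1)))
    # Any other section (LOAD RETENTION, DELETE OLD RETENTION, ...).
    return ("other", None)
```

Grouping lines by the returned worker number gives one timeline per worker, which helps when several workers log at the same time.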
SAVE GLOBAL
The SAVE GLOBAL logs give information about the overall operation of the module or about its configuration.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] Starting to save retention data. [XXX:hosts] [XXX:checks] (Database used = mongodb://HOST/?safe=false, use ssh = False)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] SUCCESS Retention data was saved into mongodb. Total time X.XXs |
Errors
Errors that occur while saving the retention are also logged, in this form:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later |
Examples
While the retention is being saved, the Mongo database is unreachable
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention data could not be saved in mongodb because mongo is unreachable. Total time 2.11s. I disable it and set it to restart it later |
SAVE WORKERS
The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Starting worker X with pid XXXXX. Try: X/X
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] The worker X did SUCCESS (after X try) |
Preparing the data to save took a long time:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ PERF ] [ X.XXXs ] atomization duration |
Errors prevent the save from completing properly:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] some workers did fail to exit or encountered an error. The retention save can be incomplete. |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Too many tries failed |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the XXXXX worker process as there is not enough memory |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the worker X process: XX. Exiting the retention save, killing all currently launched workers |
SAVE WORKER X
For the worker with identifier X, the SAVE WORKER X logs give statistics on the saves it performed: the number of elements, the result and the execution time.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Updating retention with elements: checks [ XXX ] -- hosts [ XX ] in mongodb
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER 0 ] Retention data saved into mongodb in X.XXX seconds |
Errors
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] The worker (pid:XXXX | try:XX) did not exit on time (XX s). We are restarting it. |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Failed connection with the following message : ERROR MESSAGE |
Loss of connection to the database
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [1/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [Y/X]
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] worker has been disconnected of mongo. Will retry [X/X]
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] After X tries, worker could not connect to mongo :[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
On error, each worker tries to start again, up to the maximum number of attempts defined in the module's configuration file. If the retention is still not saved after these attempts, the module fails and the Scheduler stops.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Worker has an error: [ ERROR MESSAGE ]
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
When the SSH tunnel is enabled:
| Code Block |
|---|
[2021-04-20 11:43:09] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 time, we will try again
[2021-04-20 11:43:10] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 times, we stop trying
[2021-04-20 13:24:20] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Worker has an error: [ Mongo connection failure : localhost:34925 ===(ssh tunnel)===> 192.168.1.132:22 ===(mongodb)===> 192.168.1.132:27017 ] |
Without the SSH tunnel:
| Code Block |
|---|
[2021-04-20 11:43:09] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 time, we will try again
[2021-04-20 11:43:10] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Mongo connection failed 1/1 times, we stop trying
[2021-04-20 12:06:15] ERROR : [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] Worker has an error: [ Mongo connection failure to mongodb://192.168.1.132/?safe=false ] |
Loading the retention
The logs provide information about the loading of the retention, making it possible to follow its progress and the state of the connection to Mongo.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] [ X.XXXs ] We took X hosts/clusters from the retention [ in scheduler hosts/clusters : without retention=X / total=1 ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS / CLUSTERS ] No host/cluster are needed for retention load (scheduler already have all X hosts retention data).
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] [ X.XXXs ] We took X checks from the retention [ in scheduler checks : without retention=XX / total=XX ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] No checks are needed for retention load (scheduler already have all X checks retention data).
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs] Total number of elements load from mongo database: X ( scheduler have a total of XX elements )
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ X.XXXs ] SUCCESS Retention data loaded successfully. |
Errors
Errors that occur while loading the retention are also logged, in this form:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention data could not be loaded from mongodb: ERROR MESSAGE DETAILS |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying host entries: ERROR MESSAGE. Module exiting. |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying checks entries: ERROR MESSAGE. Module exiting. |
Deleting old retention data
The deletion logs show the number of objects deleted (broken down by hosts and checks) as well as the date from which the retention is kept.
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - Deleting XXX hosts from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX - hosts deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - Deleting XXX services from old retention [XXXX by XXXX]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX - services deleted in X.XXXs
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting XXXX entries = X.XXXs |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We will delete all retention data that were saved before the XXXX-XX-XX XX:XX UTC (X days)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] There is no data to delete
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting 0 entries = X.XXXs |
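The deletion logs mention a cutoff date ("saved before the ... UTC (X days)") and a "[XXXX by XXXX]" marker. The sketch below shows one plausible reading, under two loudly stated assumptions: the cutoff is the current UTC time minus the retention period in days, and "[XXXX by XXXX]" denotes deletion in fixed-size batches. Neither is confirmed by this page, and `retention_cutoff` and `in_batches` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def retention_cutoff(now_utc, keep_days):
    """Return the UTC instant before which saved retention data counts as old.

    Assumption: the cutoff is simply now minus the retention period.
    """
    return now_utc - timedelta(days=keep_days)


def in_batches(ids, batch_size):
    """Yield successive slices of ids holding at most batch_size items.

    Assumption: this mirrors the "[XXXX by XXXX]" marker in the logs.
    """
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]
```

For instance, `retention_cutoff(datetime(2019, 7, 10, 13, 54), 3)` gives 2019-07-07 13:54, and iterating `in_batches(ids, 1000)` walks a list of identifiers a thousand at a time.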
Error: loss of connection to the database
If an error occurs during a database operation, the following logs appear:
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [1/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [2/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [3/3]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] After 3 tries, we couldn't connect to mongo |
| Code Block |
|---|
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have an error:[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [SCHEDULERNAME] [ MongodbRetention ] [ DELETE OLD RETENTION ] "Exception Python" |