On receiving the SIGUSR1 signal, the module dumps its memory; on any other signal, the module stops:
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MANAGE SIGNAL ] The worker with the pid XXXX received a signal XX |
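As an illustration only, the signal behaviour described above could be sketched as follows. The names `state`, `dump_memory`, `handle_signal` and `install_handlers`, and the choice of SIGTERM and SIGINT as the "other" signals, are assumptions made for this sketch; they are not taken from the module's code.

```python
import signal

# Hypothetical state for this sketch; the module's real internals are not documented here.
state = {"dumps": 0, "stopping": False}


def dump_memory():
    """Placeholder for the memory dump (assumption: the real dump relies on an external library)."""
    state["dumps"] += 1


def handle_signal(signum, frame):
    """Dump memory on SIGUSR1, request a stop on any other handled signal."""
    if signum == signal.SIGUSR1:
        dump_memory()
    else:
        state["stopping"] = True


def install_handlers():
    """Register the handler for SIGUSR1 and for two common termination signals (POSIX only)."""
    for sig in (signal.SIGUSR1, signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, handle_signal)
```

With the handlers installed, `kill -USR1 <pid>` would trigger the dump branch, while a termination signal would set the stop flag.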
When the master process stops unexpectedly:
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER:XXXX ] I am a worker with pid: XXXX and my master process YYYY is dead, I exit. |
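One common way for a worker to notice that its master process has died is to probe the master's pid. The sketch below shows that general technique under the assumption of a POSIX system; `master_is_alive` is a hypothetical helper, and nothing here claims to be how the module actually performs the check.

```python
import os


def master_is_alive(master_pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        # Signal 0 sends nothing: it only checks that the pid exists.
        os.kill(master_pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A worker loop could call this helper periodically and exit when it returns `False`.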
Memory dump request
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP (to be sent to the support):
xxxxxxxx
xxxxxxxx
xxxxxxxx |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) Memory information dumped to file FFFFFFF (to be sent to the support) |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] MEMORY DUMP: FAIL check if guppy lib is installed |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ WORKER pid=XXXX ] (support-only) MEMORY DUMP: FAIL check if meliae lib is installed |
Database connection
For the database connection, four SECTIONS exist:
| Section | Description |
|---|
| LOAD RETENTION | Corresponds to loading the retention |
| DELETE OLD RETENTION | Corresponds to deleting old retention data |
| SAVE WORKER XXXX | Corresponds to the workers used to save the retention |
| RETENTION STATUS | Corresponds to the step that checks the state of the retention before it is loaded |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SECTION ] Try to open a Mongodb connection to [ mongodb://192.168.1.120/?w=1&safe=false ] database [ shinken ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] Connection to mongodb://192.168.1.120/?w=1&fsync=false with a ssh tunnel:
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] - searching a random local port available for the tunnel binding (trying 15978): localhost:15978 =(ssh tunnel)=> bastdev2:22 =(mongodb)=> 192.168.1.120:27017 (search try:1)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] - tunnel creation SUCCESS: localhost:15978 =(ssh tunnel)=> 192.168.1.120:22 =(mongodb)=> 192.168.1.120:27017 (search try:1, ssh pid=22096)
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] - SUCCESS mongo connection is OPENED with the SSH tunnel: localhost:15978 =(ssh tunnel)=> 192.168.1.120:22 =(mongodb)=> 192.168.1.120:27017
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SECTION ] Mongo connection established in X.XXXs |
These logs show:
- The URL used
- The database (which may differ from the default "shinken" used here)
- If an SSH tunnel is used, the ports used to redirect the traffic
- The time taken to connect to the Mongo database
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] Connection to mongo failed, closing the SSH tunnel
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MONGO ] Mongo raised ERROR_MESSAGE on the operation get_connection. Operation failed : 1/5
...
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] Connection to mongo failed, closing the SSH tunnel
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MONGO ] Mongo raised ERROR_MESSAGE on the operation get_connection. Operation failed : 2/5
...
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] Connection to mongo failed, closing the SSH tunnel
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MONGO ] Mongo raised ERROR_MESSAGE on the operation get_connection. Operation failed : 3/5
...
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] Connection to mongo failed, closing the SSH tunnel
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ MONGO ] Mongo raised ERROR_MESSAGE on the operation get_connection. Operation failed : 4/5
...
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ INITIALISATION ] [ MONGO ] [ SSH TUNNEL ] Connection to mongo failed, closing the SSH tunnel
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ MONGO ] Mongo raised ERROR_MESSAGE on the operation get_connection. Operation failed : 5/5. We tried 5 times but it kept failing. |
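The retry sequence in these logs (attempts 1/5 to 4/5 logged at INFO, the fifth at ERROR) follows a bounded-retry pattern. The sketch below shows one way to write such a loop; `call_with_retries` and the logger name are hypothetical, and the message wording is only copied from the log examples, so this should not be read as the module's implementation.

```python
import logging

logger = logging.getLogger("MongodbRetention")  # logger name chosen for this sketch


def call_with_retries(operation, name, max_tries=5):
    """Call operation() up to max_tries times; re-raise the last error if all tries fail."""
    for attempt in range(1, max_tries + 1):
        try:
            return operation()
        except Exception as exc:  # broad on purpose: this is only a sketch
            msg = "Mongo raised %s on the operation %s. Operation failed : %d/%d" % (
                exc, name, attempt, max_tries)
            if attempt < max_tries:
                # Intermediate failures are reported at INFO level.
                logger.info(msg)
            else:
                # The last failure is reported at ERROR level, then propagated.
                logger.error("%s. We tried %d times but it kept failing.", msg, max_tries)
                raise
```

A caller would wrap the connection attempt, for example `call_with_retries(open_connection, "get_connection")`, where `open_connection` is whatever function opens the Mongo connection.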
The connection could not be established
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention could not be loaded from mongodb: Mongo raised ERROR_MESSAGE on the operation get_connection. Operation failed : 5/5. We tried 5 times but it kept failing. |
Module configuration error
If several Mongo URLs are specified
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: Multiples urls were found in the module's configuration file. I disable it and set it to restart it later |
For saving the retention, three SUB-SECTIONS exist:
| Section | Description |
|---|
| SAVE GLOBAL | Corresponds to the overall save process |
| SAVE WORKERS | Corresponds to a sub-process of SAVE GLOBAL that manages the queue of the save workers |
| SAVE WORKER X | A sub-process of SAVE WORKERS: the worker numbered X, which saves part of the scheduler's information to the database. The number of workers can be set in the module's parameters (see Module MongodbRetention ( Rétention en base de données centralisée par royaume )) |
The SAVE GLOBAL logs give information about the overall operation of the module or its configuration.
Before saving the retention, the module reports the URI used and the total number of hosts and checks to save.
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE GLOBAL ] Starting to save retention with VV worker(s). [ XX:hosts/clusters ] [ YY:checks ] ( Database used = mongodb://127.0.0.1/?safe=false, use ssh = 0 ), max time allowed for the save ZZ seconds
|
In the example:
- VV: The number of workers started in parallel to perform the save.
- XX: The number of hosts and clusters that will be saved.
- YY: The number of checks that will be saved.
- ZZ: The time allowed for the retention save to complete.
| Code Block |
|---|
| language | text |
|---|
| theme | Emacs |
|---|
| title | Example |
|---|
|
[2025-02-11 09:53:59] INFO : [ scheduler-master ] [ MongodbRetention ] [ SAVE GLOBAL ] Starting to save retention with 4 worker(s). [ 10:hosts/clusters ] [ 100:checks ] ( Database used = mongodb://192.168.1.56/?w=1&fsync=false, use ssh = 1 ), max time allowed for the save 120 seconds |
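For monitoring purposes, the fields of the SAVE GLOBAL start line can be extracted with a regular expression. The sketch below is written against the example line only; `SAVE_START` and `parse_save_start` are hypothetical names, and the pattern assumes the message layout stays exactly as shown in this page.

```python
import re

# Pattern written from the documented example line (assumption: the layout is stable).
SAVE_START = re.compile(
    r"Starting to save retention with (?P<workers>\d+) worker\(s\)\. "
    r"\[ (?P<hosts>\d+):hosts/clusters \] \[ (?P<checks>\d+):checks \] "
    r"\( Database used = (?P<uri>\S+), use ssh = (?P<ssh>[01]) \), "
    r"max time allowed for the save (?P<max_time>\d+) seconds"
)


def parse_save_start(line):
    """Return the fields of a SAVE GLOBAL start line as a dict, or None if it does not match."""
    m = SAVE_START.search(line)
    if m is None:
        return None
    return {
        "workers": int(m.group("workers")),
        "hosts": int(m.group("hosts")),
        "checks": int(m.group("checks")),
        "uri": m.group("uri"),
        "use_ssh": m.group("ssh") == "1",
        "max_time": int(m.group("max_time")),
    }
```

Lines that do not match, such as worker or error lines, simply return `None`, so the function can be applied to a whole log file.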
Errors during the retention save are also recorded in the logs in this form:
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: ERROR MESSAGE. Total time XX.XXs. I disable it and set it to restart it later |
| Code Block |
|---|
|
[2025-02-11 09:56:50] ERROR : [ scheduler-master ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention could not be saved in mongodb. Total time 194.80s. I disable it and set it to restart it later |
| Code Block |
|---|
|
[2025-02-11 09:56:50] ERROR : [ scheduler-master ] [ MODULES-MANAGER ] The instance MongodbRetention raised an error: [ SAVE GLOBAL ] FAILED Retention could not be saved in mongodb because mongo is unreachable. Total time 194.80s. I disable it and set it to restart it later |
The SAVE WORKERS logs give the state of each worker, from its creation to its success or failure.
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Starting worker X with pid XXXX. Try: [ Y ], max time allowed [ ZZs ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] The worker X successfully ended ( after Y tries ) |
Preparing the data to save took a long time:
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] [ PERF ] [ X.XXXs ] atomization duration |
Errors prevent the save from completing properly:
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] some workers did fail to exit or encountered an error. The retention save can be incomplete.
|
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Too many tries failed |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the XXXX worker process as there is not enough memory |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] Cannot start the worker XXXX process: XX. Exiting the retention save, killing all currently launched workers |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] ERROR MESSAGE
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKERS ] "EXCEPTION PYTHON"
|
The SAVE WORKER X logs give, for the worker with identifier X, statistics about the saves it performed.
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
| title | Example |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Preparing elements to save
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Took X.XXms to prepare XXX hosts/clusters and XXXX checks
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Took X.XXms to connect to Mongo
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] hosts/clusters will be saved in groups of maximum 1000
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Saved XXX/XXX hosts/clusters ( took X.XXms )
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Took X.XXms to save XXX hosts/clusters
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] checks will be saved in groups of maximum 1000
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Saved XXXX/XXXX checks ( took X.XXms )
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Saved XXXX/XXXX checks ( took X.XXms )
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Took X.XXms to save XXXX checks
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Worker ended in X.XXms |
These logs report:
- The start of the worker
- The time the worker takes to prepare the elements (selection, serialization)
- The time taken to connect to the Mongo database
- The size of the groups of saved elements
- The progress of each group and the time taken
- The total time taken by the worker
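To compare workers, the "Took X.XXms to ..." lines can be collected per worker. This is a minimal sketch assuming the durations are always printed in milliseconds, as in the template above; `TOOK` and `worker_timings` are hypothetical names, not part of the module.

```python
import re

# Matches lines such as "[ SAVE WORKER 0 ] Took 1.50ms to connect to Mongo".
TOOK = re.compile(
    r"\[ SAVE WORKER (?P<worker>\d+) \] Took (?P<ms>\d+(?:\.\d+)?)ms to (?P<step>.+)$"
)


def worker_timings(lines):
    """Group (step, duration in ms) pairs by worker number; other lines are ignored."""
    timings = {}
    for line in lines:
        m = TOOK.search(line.rstrip())
        if m:
            timings.setdefault(int(m.group("worker")), []).append(
                (m.group("step"), float(m.group("ms")))
            )
    return timings
```

The result maps each worker number to the ordered list of its timed steps, which makes a slow preparation or connection step easy to spot.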
Loss of connection to the database
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ MONGO ] Mongo raised ( Mongo connection failure to xxxxxxx ) on the operation get_connection. Operation failed : 1/5
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ MONGO ] Mongo raised ( Mongo connection failure to xxxxxxx ) on the operation get_connection. Operation failed : 2/5
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ MONGO ] Mongo raised ( Mongo connection failure to xxxxxxx ) on the operation get_connection. Operation failed : 3/5
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ MONGO ] Mongo raised ( Mongo connection failure to xxxxxxx ) on the operation get_connection. Operation failed : 4/5
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ MONGO ] Mongo raised ( Mongo connection failure to xxxxxxx ) on the operation get_connection. Operation failed : 5/5. We tried 5 times but it kept failing.
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] After 5 tries, worker could not connect to mongo :[Mongo raised ( Mongo connection failure to xxxxxxx ) on the operation get_connection. Operation failed : 5/5. We tried 5 times but it kept failing.]
|
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] Worker has an error:[ ERROR MESSAGE ]
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] (pid=XXXX) "EXCEPTION PYTHON" |
OVERSIZED DATA - Detection of abnormally large elements
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] LOG_LEVEL: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ OVERSIZED DATA ] [ DETAILS ] oversized data of XXXXB for ELEMENT_TYPE ELEMENT_UUID may cause database query to fail. Detail of potential expensive content: ELEMENT_DETAILS
[YYYY-MM-DD HH:MM:SS] LOG_LEVEL: [ SCHEDULERNAME ] [ MongodbRetention ] [ SAVE WORKER X ] [ OVERSIZED DATA ] [ SIZE ] oversized data of XXXXB for ELEMENT_TYPE ELEMENT_UUID may cause database query to fail. Size of potential expensive content: ELEMENT_SIZE_DETAILS |
Saving the retention can fail if at least one element exceeds the maximum size the database can handle. The module lists the elements that may cause this error, according to thresholds defined in its configuration.
| Module parameter | Log level |
|---|
| scheduler__retention_mongo__oversized_element_warning_threshold__size | WARNING |
| scheduler__retention_mongo__oversized_element_error_threshold__size | ERROR |
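The mapping between the two thresholds and the log level can be pictured with the sketch below. It assumes that both thresholds are expressed in bytes and that the comparison is inclusive; neither point is stated in this page, `oversized_log_level` is a hypothetical helper, and the threshold values in the usage note are made up for the illustration.

```python
def oversized_log_level(size_bytes, warning_threshold, error_threshold):
    """Return 'ERROR', 'WARNING' or None for an element of size_bytes.

    Assumption: thresholds are in bytes and compared inclusively.
    """
    if size_bytes >= error_threshold:
        return "ERROR"
    if size_bytes >= warning_threshold:
        return "WARNING"
    return None
```

For instance, with a hypothetical warning threshold of 10000 and error threshold of 100000, an element of 12845 bytes would be reported at WARNING level.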
| Code Block |
|---|
| language | text |
|---|
| theme | Emacs |
|---|
| title | Example |
|---|
|
[2025-07-23 10:29:45] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] [ OVERSIZED DATA ] [ DETAILS ] oversized data of 12845B for service 80e69ea445e111f0abb10800270aacd1-97373c2245e111f080950800270aacd1 may cause database query to fail. Detail of potential expensive content: total notifications nb:141384, notified contacts uuid list nb:1, incident nb:1, notifications in progress nb:0, downtimes nb:0, checks in progress nb:0
[2025-07-23 10:29:45] WARNING: [ scheduler-master ] [ MongodbRetention ] [ SAVE WORKER 0 ] [ OVERSIZED DATA ] [ SIZE ] oversized data of 12845B for service 80e69ea445e111f0abb10800270aacd1-97373c2245e111f080950800270aacd1 may cause database query to fail. Size of potential expensive content: outputs size:167B, current and last perf data size:98B, downtimes user content size:0B, acknowledgement user content size:0B |
Loading the retention
The logs provide information about the loading of the retention, making it possible to follow its progress and the state of the connection to Mongo.
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] Try to open a Mongodb connection to [ mongodb://127.0.0.1/?safe=false ] database [ shinken ]
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] Mongo connection established in 4.94ms
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS/CLUSTERS ] Scheduler has XXX/XXX hosts/clusters in its cache and need load retention for XXX/XXX
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ HOSTS/CLUSTERS ] Took 3.52ms to load XX/XX hosts/clusters
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] Scheduler has YYY/YYY checks in its cache and need load retention for YYY/YYY
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] [ CHECKS ] Took 28.00ms to load YYY/YYY checks
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] Took 32.07ms to load ZZZ/ZZZ elements
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] Took 5.99ms to restore data to Scheduler |
Errors during the retention load are also recorded in the logs in this form:
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] FAILED Retention could not be loaded from mongodb: ERROR MESSAGE DETAILS
|
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying hosts/clusters entries: ERROR MESSAGE. Module exiting. |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ LOAD RETENTION ] error querying checks entries: ERROR MESSAGE. Module exiting. |
Deleting old retention data
The deletion logs show the number of objects deleted (broken down by hosts and checks) and the date from which the retention is kept.
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
| title | Example with objects to delete |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Checking old elements ( hosts/clusters/checks ) not updated since 7 days -> YYYY-MM-DD HH:MM UTC
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - XXX hosts/clusters deleted in 377.65ms
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - YYY checks deleted in 184.476ms
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting X old elements = 562.126ms |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
| title | Example without objects to delete |
|---|
|
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Checking old elements ( hosts/clusters/checks ) not updated since 7 days -> YYYY-MM-DD HH:MM UTC
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] - There is no data to delete
[YYYY-MM-DD HH:MM:SS] INFO : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] Total time for deleting X old elements = 1.17ms |
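The date printed after "not updated since 7 days ->" is a cutoff: elements last updated before it are candidates for deletion. A minimal sketch of how such a cutoff can be computed and formatted is given below; `retention_cutoff` and `format_cutoff` are hypothetical helpers, and the 7-day default is only taken from the example logs.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now, max_age_days=7):
    """Return the datetime before which elements are considered old."""
    return now - timedelta(days=max_age_days)


def format_cutoff(cutoff):
    """Format the cutoff in the 'YYYY-MM-DD HH:MM UTC' layout used by the log line."""
    return cutoff.strftime("%Y-%m-%d %H:%M UTC")
```

Called with a timezone-aware `now` of 2025-02-11 09:53 UTC, these helpers yield a cutoff seven days earlier, on 2025-02-04 at the same time.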
Error: loss of connection to the database
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [1/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [2/3]
[YYYY-MM-DD HH:MM:SS] WARNING: [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have been disconnected of mongo. Will retry [3/3]
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] After 3 tries, we couldn't connect to mongo |
| Code Block |
|---|
| language | js |
|---|
| theme | Confluence |
|---|
|
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] We have an error:[ERROR MESSAGE]
[YYYY-MM-DD HH:MM:SS] ERROR : [ SCHEDULERNAME ] [ MongodbRetention ] [ DELETE OLD RETENTION ] "EXCEPTION PYTHON" |