Abstract
This document describes how you can monitor an Oracle database server such as:
- Connection time
- A recent restart
- The number of connections
- Cache hit
- Dead lock
- etc
There are some steps you’ll need to follow in order to monitor a new database machine:
- Understand what is already available (in the Shinken Entreprise installation)
- Setup the oracle user account
- Test the connection to the database
- Setup your server host definition
| Panel | ||||
|---|---|---|---|---|
| ||||
|
What is already available in the Shinken Installation
To make your life a bit easier, a few configuration tasks have already been done for you:
- Installation of Oracle instant Client
- Installation of check_oracle_health plugin : /var/lib/shinken/libexec/check_oracle_health
- Several host templates are ready to be used
| Tip | ||
|---|---|---|
| ||
| We suppose here that the Oracle machine you want to monitor is named srv-lin-1 and is a linux. Please change the configuration and commands according with the real name of your server. |
Setup the oracle user account
Create the database user
| Tip | ||
|---|---|---|
| ||
You will need to configure the user for all your oracle databases. |
Connect to your database as sysadmin on the oracle server :
| Code Block |
|---|
srv-lin-1:oracle# sqlplus "/ as sysdba" |
And then create your shinken account on the database:
| Code Block | ||
|---|---|---|
| ||
CREATE USER shinken IDENTIFIED BY shinkenpassword; GRANT CREATE SESSION TO shinken; GRANT SELECT any dictionary TO shinken; GRANT SELECT ON V_$SYSSTAT TO shinken; GRANT SELECT ON V_$INSTANCE TO shinken; GRANT SELECT ON V_$LOG TO shinken; GRANT SELECT ON SYS.DBA_DATA_FILES TO shinken; GRANT SELECT ON SYS.DBA_FREE_SPACE TO shinken; |
And for old 8.1.7 database only:
| Code Block | ||
|---|---|---|
| ||
---- if somebody still uses Oracle 8.1.7... GRANT SELECT ON sys.dba_tablespaces TO shinken; GRANT SELECT ON dba_temp_files TO shinken; GRANT SELECT ON sys.v_$Temp_extent_pool TO shinken; GRANT SELECT ON sys.v_$TEMP_SPACE_HEADER TO shinken; GRANT SELECT ON sys.v_$session TO shinken; |
What is already done in the Shinken Installation
To make your life a bit easier, a few configuration tasks have already been done for you:
- Installation of Oracle instant Client
- Installation of check_oracle_health plugin : /var/lib/shinken/libexec/check_oracle_health
- Creation of several host templates
| Tip | ||
|---|---|---|
| ||
| We suppose here that the Oracle machine you want to monitor is named srv-lin-1 and is a linux. Please change the configuration and commands according with the real name of your server. |
Test the connection
To see if the connection to the database named PROD is ok, just launch :
| Code Block | ||
|---|---|---|
| ||
/var/lib/shinken/libexec/check_oracle_health --connect "(DESCRIPTION =(ADDRESS_LIST =(ADDRESS = (PROTOCOL = TCP)(HOST = srv-lin-1)(PORT = 1521)))(CONNECT_DATA =(SID = PROD)))" --user "shinken" --password "shinkenpassword" --mode connection-time |
What is checked with the templates
Public templates
Oracle
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| tnsping | listener | |||
| process-usage | Percentage of max possible processes | 0% to 100% | 80 | 90 |
| sga-shared-pool-free | Free Memory in the Shared Pool | 0% to 100% | 10: | 5: |
| tablespace-usage | Used diskspace in the tablespace | 0% to 100% | 90 | 98 |
| corrupted-blocks | Number of corrupted blocks in database | 0 to n | 1 | 10 |
| invalid-objects | Sum of faulty Objects, Indices, Partitions | 0 to n | 0.1 | 0.1 |
Oracle-full
Contains all Oracle host templates listed in the Shinken Adminstrators templates.
| Warning | ||
|---|---|---|
| ||
Using the Oracle-full template could lead to slow issues. You should use the really needed templates instead. |
Shinken Administrator templates
Oracle-datafile
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| datafile-io-traffic | Sum of IO-Operationes from Datafiles per second | n/sec | 1000 | 5000 |
| datafiles-existing | Percentage of max possible datafiles | 0% to 100% | 80 | 90 |
Oracle-enqueue
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| enqueue-contention | Enqueue wait/request-Ratio | 0% to 100% | 1 | 10 |
| enqueue-waiting | How many percent of the elapsed time since the last run has an Enqueue spend with waiting | 0% to 100% | 0.00033 | 0.00033 |
Oracle-events
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| event-waits | Waits/sec from system events | n/sec | 10 | 100 |
| event-waiting | How many percent of the elapsed time has an event spend with waiting | 0% to 100% | 0.1 | 0.5 |
Oracle-flash-recovery-area
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| flash-recovery-area-usage | Used diskspace in the flash recovery area | 0% to 100% | 90 | 98 |
Oracle-latch
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| latch-contention | Latch misses/gets-ratio. With –name a Latchname or Latchnumber can be passed over. (See list-latches) | 0% to 100% | 1 | 2 |
| latch-waiting | How many percent of the elapsed time since the last run has a Latch spend with waiting | 0% to 100% | 0.1 | 1 |
Oracle-pga
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| pga-in-memory-sort-ratio | Percentage of sorts in the memory | 0% to 100% | 99: | 90: |
Oracle-redo
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| switch-interval | Interval between RedoLog File Switches | 0 to n | 600: | 60: |
| retry-ratio | Retry-Rate in the RedoLog Buffer | 0% to 100% | 1 | 10 |
| redo-io-traffic | Redolog IO in MB/sec | n/sec | 199 | 200 |
Oracle-rman
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| rman-backup-problems | Number of RMAN-errors during the last three days | 0 to n | 1 | 2 |
Oracle-rollback-segment
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| roll-header-contention | Rollback Segment Header Contention | 0% to 100% | 1 | 2 |
| roll-block-contention | Rollback Segment Block Contention | 0% to 100% | 1 | 2 |
| roll-hit-ratio | Rollback Segment gets/waits Ratio | 0% to 100% | 99: | 98: |
| roll-extends | Rollback Segment Extends n | n/sec | 1 | 100 |
| roll-wraps | Rollback Segment Wraps n | n/sec | 1 | 100 |
Oracle-sga
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| sga-data-buffer-hit-ratio | Hitrate in the Data Buffer Cache | 0% to 100% | 98: | 95: |
| sga-library-cache-gethit-ratio | Hitrate in the Library Cache (Gets) | 0% to 100% | 98: | 95: |
| sga-library-cache-pinhit-ratio | Hitrate in the Library Cache (Pins) | 0% to 100% | 98: | 95: |
| sga-library-cache-reloads | Reload-Rate in the Library Cache | n/sec | 10 | 10 |
| sga-dictionary-cache-hit-ratio | Hitrate in the Dictionary Cache | 0% to 100% | 95: | 90: |
| sga-latches-hit-ratio | Hitrate of the Latches | 0% to 100% | 98: | 95: |
| sga-shared-pool-reloads | Reload-Rate in the Shared Pool | 0% to 100% | 1 | 10 |
Oracle-soft-parse
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| soft-parse-ratio | Percentage of soft-parse-ratio | 0% to 100% | 90: | 98: |
Oracle-stale-statistics
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| stale-statistics | Sum of objects with obsolete optimizer statistics | n | 10 | 100 |
Oracle-sysstat
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| sysstat | Changes/sec for any value from v$sysstat | n/sec | 10 | 100 |
Oracle-tablespace
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| tablespace-fragmentation | Free Space Fragmentation Index | 100 to 1 | 30: | 20: |
| tablespace-can-allocate-next | Checks if there is enough free tablespace for the next Extent | |||
| tablespace-remaining-time | Sum of remaining days until a tablespace is used by 100%. The rate of increase will be calculated with the values from the last 30 days. (With the parameter –lookback different periods can be specified) | Days | 90: | 30: |
Oracle-top10-stats
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| seg-top10-logical-reads | Sum of the userprocesses under the top 10 logical reads | n | 1 | 9 |
| seg-top10-physical-reads | Sum of the userprocesses under the top 10 physical reads | n | 1 | 9 |
| seg-top10-buffer-busy-waits | Sum of the userprocesses under the top 10 buffer busy waits | n | 1 | 9 |
| seg-top10-row-lock-waits | Sum of the userprocesses under the top 10 row lock waits | n | 1 | 9 |
Oracle-usage
| Check | Detail | Check range | Default Warning | Default Critical |
|---|---|---|---|---|
| session-usage | Percentage of max possible sessions | 0% to 100% | 80 | 90 |
| Connected users | Number of currently connected users | 0 to n | 50 | 100 |
How to
Configuring host template and host for monitoring
The parameters common to every Oracle server monitored shoud be setted in the host template oracle-connection-method.
The parameters specifics to each server, such as database name for example, have to be done in the host's data itself.
| Tip | ||
|---|---|---|
| ||
It's a best practice to use the same database monitoring user on every Oracle server monitored. Doing so, you can configure the database user/password only once in the host template oracle-connection-method. |
Configuration of Oracle-connection-method host template
- select the Oracle-connection-method host template
- Clic on Data tab
- Type the oracle user in the ORACLE_USER field (shinken in the previous example)
- Type the oracle password in the ORACLE_USER field (shinkenpassword in the previous example)
Configuration of a host
- Add the Oracle choosen host template to the Host Templates to inherit field
- Clic on Data tab
- Look for the field DATABASES in the From templates datas
- Type the name of the database to be monitoring
- If not configured in the Oracle-connection-method host template
- Type the oracle user in the ORACLE_USER field (shinken in the previous example)
- Type the oracle password in the ORACLE_USER field (shinkenpassword in the previous example)
| Tip | ||
|---|---|---|
| ||
You can monitor multiple database on the same host in typing the name of the database separated by a comma. Exemple : DB1,DB2,DB3 |