Oracle® TimesTen In-Memory Database Error Messages and SNMP Traps Release 11.2.1 Part Number E13071-02
Simple Network Management Protocol (SNMP) is a protocol for network management services. Network management software typically uses SNMP to query or control the state of network devices such as routers and switches. These devices sometimes also generate asynchronous alerts, called traps, to inform the management systems of problems.
TimesTen can be neither queried nor controlled through SNMP. TimesTen only sends SNMP traps for certain critical events, which can facilitate user recovery mechanisms. TimesTen can send traps for the following events:
Assertion failure
Death of daemons
Data store invalid
Replicated transaction failure
Data store out of space
Autorefresh transaction failure
Replication conflict resolution
File write errors
These events also cause the TimesTen daemon to write log entries, but exposing them through SNMP traps lets network management software take immediate action.
Each SNMP variable is either an integer (ASN_INTEGER) or text (ASN_OCTET_STRING).
The ASN_INTEGER variables are:
ttPid
ttDSNConn
ttDSCurSize
ttDaeInst
ttRepReceiverPort
ttDSReqSize
ttDaePid
ttDSMaxSize
ttCacheAgentPid
The rest of the variables are ASN_OCTET_STRING type.
By default, TimesTen reports that data store space is low based on the partition space thresholds set by the PermWarnThreshold and TempWarnThreshold attributes. For example, if PermWarnThreshold, which defines the threshold for the permanent data store memory partition, is set to 90, TimesTen records a message when the permanent partition becomes 90% full. Once usage of the permanent partition falls 10 percentage points below the threshold, which in this case is 80% full, TimesTen records a second message indicating that the data store is no longer low on space.
When connecting to a data store, you can change the out of space threshold by setting the PermWarnThreshold and TempWarnThreshold attributes. See "PermWarnThreshold" and "TempWarnThreshold" in Oracle TimesTen In-Memory Database Reference.
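The two-message behavior described above is a form of hysteresis, and it can be sketched in a few lines of Python. This is only an illustration, not TimesTen code: the class name SpaceMonitor, its method names, and the exact boundary comparisons (at or above the threshold, at or below the clear level) are assumptions.

```python
from typing import Optional


class SpaceMonitor:
    """Hypothetical model of the low-space state of one partition."""

    def __init__(self, warn_threshold: int = 90, hysteresis: int = 10) -> None:
        # Percent full at which the partition is reported as low on space.
        self.warn_threshold = warn_threshold
        # Percent full at which the low-space state is cleared again
        # (10 points below the threshold, as in the example above).
        self.clear_level = warn_threshold - hysteresis
        self.low = False

    def update(self, percent_full: float) -> Optional[str]:
        """Return a message on a state transition, otherwise None."""
        if not self.low and percent_full >= self.warn_threshold:
            self.low = True
            return "low on space"
        if self.low and percent_full <= self.clear_level:
            self.low = False
            return "no longer low on space"
        return None
```

With a threshold of 90, usage that rises to 90% produces the first message, usage that then falls to 85% produces nothing, and only a drop to 80% produces the second message. The gap between the two levels keeps a partition that hovers near the threshold from generating a stream of messages.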
SNMP traps are UDP/IP packets. Therefore, there is no guarantee of delivery, and it is not an error if there are no subscribers for the trap. TimesTen sends only SNMPv1 traps, which all network management systems should understand.
To enable SNMP trap generation, remove the line -enabled 0 in the snmp.ini file, or change 0 to 1 on that line. TimesTen does not generate SNMP traps by default because, in the case of repeated failures, such as an application that continues to attempt to insert new rows into a full data store, the application may experience a performance slowdown due to generation of SNMP traps.
For root installations, the configuration file /var/TimesTen/snmp.ini on UNIX systems and install_dir\srv\info\snmp.ini on Windows systems enables or disables trap generation and sets the community string for SNMP traps, the target host, and the target port to which traps are sent.
Note:
For non-root installations, the file is install_dir/snmp.ini, where install_dir represents the path of the TimesTen installation.
The file contents are:
Component | Description |
---|---|
-enabled {0\|1} | Disable or enable SNMP trap generation. |
-community {string} | The SNMP community string. Default is "public". |
-trapdest {host:portnumber} | The SNMP agent hostname and port number where SNMP trap messages are received. The default host is "localhost". The default port number where the SNMP agent listens is 162. Up to 8 destinations may be specified. |
-trapport {portnumber} | To receive SNMP traps on the local machine when you do not want to use the default port, specify the portnumber with the -trapport option. The default port number is 162. If neither -trapdest nor -trapport is specified, traps are sent to the default, which is localhost on the IPv4 loopback address and port 162. You must be root to access the default SNMP port number. If you are not root, modify the port number to one that you can access. |
An optional environment variable, TT_SNMP_INI, can override the location of the snmp.ini file. If this variable is set, it should contain the full path of the SNMP sender configuration file, which can have a name other than snmp.ini.
Example 3-1
To send messages and set one target destination, your snmp.ini file looks like this:

    #Enable SNMP trap generation
    -enabled 1
    #Default community is "public"
    -community "public"
    #Default trap destination is "localhost" and default destination SNMP trap port is 162
    -trapdest "localhost:162"

Example 3-2
To send messages and set multiple target destinations, your snmp.ini file looks like this:

    #Enable SNMP trap generation
    -enabled 1
    #Default community is "public"
    -community "public"
    #Default trap destination is "localhost" and default destination SNMP trap port is 162
    -trapdest "localhost:162"
    -trapdest "pluto:10999"
    -trapdest "mymachine:189"

Example 3-3
To disable trap generation, your snmp.ini file looks like this:

    #Disable SNMP trap generation
    -enabled 0
    #Default community is "public"
    -community "public"
    #Default trap destination is "localhost" and default destination SNMP trap port is 162
    -trapdest "localhost:162"

If one or more of the options is not specified, or if the snmp.ini file is missing, then the default value for each option is used.
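The way the options and their defaults combine can be illustrated with a small Python reader for a file of this shape. This is a sketch under assumptions, not the parser TimesTen uses: the function names parse_snmp_ini and config_path are hypothetical, the sketch assumes one option per line with "#" comment lines, and the default of 1 for -enabled is inferred from the statement that removing the -enabled 0 line enables traps.

```python
import os
import shlex

# Location for root installations on UNIX systems, as stated in the text.
DEFAULT_PATH = "/var/TimesTen/snmp.ini"


def config_path() -> str:
    """Return the TT_SNMP_INI override if set, otherwise the default path."""
    return os.environ.get("TT_SNMP_INI", DEFAULT_PATH)


def parse_snmp_ini(text: str) -> dict:
    """Parse snmp.ini-style text, filling in defaults for missing options."""
    config = {"enabled": 1, "community": "public", "trapdest": [], "trapport": 162}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # also strips the surrounding quotes
        if len(parts) != 2:
            continue  # ignore lines this sketch does not understand
        option, value = parts
        if option == "-enabled":
            config["enabled"] = int(value)
        elif option == "-community":
            config["community"] = value
        elif option == "-trapport":
            config["trapport"] = int(value)
        elif option == "-trapdest":
            host, _, port = value.partition(":")
            config["trapdest"].append((host, int(port) if port else 162))
    if not config["trapdest"]:
        # No destination given: fall back to localhost and the trap port.
        config["trapdest"].append(("localhost", config["trapport"]))
    return config
```

Calling parse_snmp_ini on an empty string returns only defaults, which mirrors the case of a missing file. The sketch does not enforce the limit of 8 destinations.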
You must have network management software to receive SNMP traps.
For demonstration purposes, TimesTen ships the program install_dir/demo/snmp/snmptrapd, which listens for SNMP traps and prints them out or logs them to a file. Read the install_dir/demo/snmp/README file for details on how to run this program.
The maximum packet size of a single trap is 1024 bytes. If the data does not fit within the 1024-byte limit, the trap is truncated to fit. In this case, the trap contains a ttTrapTruncated OID set to 1.
Note:
snmptrapd is a part of the NET-SNMP project. See http://net-snmp.sourceforge.net/. It is NOT supported by TimesTen in any way. You can also use the UCD-SNMP Perl module from the CPAN directory at http://www.cpan.org/ to receive and act upon SNMP traps.
A Management Information Base (MIB) is like a database schema. It describes the structure of the SNMP data. For more information about MIBs in general, please refer to the previously mentioned SNMP overview documents.
The MIB extension file, install_dir/mibs/TimesTen-MIB.txt, describes the structure of the TimesTen SNMP information.
The TimesTen OID is rooted at Private Enterprise 5549. The complete path to root is iso.org.dod.internet.private.enterprise.TimesTen.* or, numerically, 1.3.6.1.4.1.5549.*.
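A receiver that handles traps from several vendors can use this root to recognize TimesTen OIDs. The following Python sketch shows a prefix test on the numeric form. The function names are hypothetical, and the sub-identifiers below the root used in any call are placeholders, not values taken from TimesTen-MIB.txt.

```python
# Numeric path iso.org.dod.internet.private.enterprise.TimesTen
TIMESTEN_ROOT = (1, 3, 6, 1, 4, 1, 5549)


def parse_oid(oid: str) -> tuple:
    """Convert a dotted OID string, with or without a leading dot, to a tuple."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def is_timesten_oid(oid: str) -> bool:
    """Return True if the OID lies at or under the TimesTen enterprise root."""
    arcs = parse_oid(oid)
    return arcs[:len(TIMESTEN_ROOT)] == TIMESTEN_ROOT
```

Comparing tuples of integers rather than string prefixes avoids a false match on an enterprise number such as 55490, which merely begins with the same digits.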
Every trap has a GMT timestamp of when the trap occurred, as well as the process ID, the user name (or user ID on UNIX systems) of the process, the TimesTen instance name, the TimesTen release number, and a trap-specific message. In addition, most traps provide additional information specific to that message. For example, the ttRepAgentDiedTrap also provides the Replication Store ID. For a list of the variables for each trap, see the TimesTen-MIB.txt file.
TimesTen SNMP traps can be categorized by severity level. Each trap has one of the following severity levels:
Informational
Warning
Error
The following table describes each trap and its severity level.
Trap name | Severity level | Description |
---|---|---|
ttAssertFailTrap | Error | TimesTen Assertion Failure |
ttCacheAgentDiedTrap | Error | TimesTen IMDB Cache daemon died. |
ttCacheAgentFailoverTrap | Warning | The Cache Agent detected that a connection to Oracle had been lost and has begun to recover the connection. |
ttCacheIncAutoRefFailedTrap | Error | TimesTen IMDB Cache incremental autorefresh failed. |
ttCacheAwtRtReadFailedTrap | Error | For Asynchronous Writethrough cache groups, runtime information is stored on the Oracle instance. While reading this information from Oracle, replication either could not find the runtime data table (tt_version_reppeers) or could not find the information within the table. |
ttCacheAwtRtUpdateFailedTrap | Error | For Asynchronous Writethrough cache groups, runtime information is stored on the Oracle instance. While updating this information, replication either could not find the runtime data table (tt_version_reppeers) or could not find the information within the table. |
ttCacheRecoveryAutorefreshTrap | Warning | The Cache Agent is performing a full autorefresh. This may be needed when a change log table on Oracle was truncated because of lack of tablespace for the cache administration user. |
ttCacheValidationErrorTrap | Error | The Cache Agent has detected fatal anomalies with cache group cache-group-name that will prevent it from properly refreshing the cache group, or it has detected fatal anomalies within the refresh interval time-in-ms. Please refer to the user error log for details. |
ttCacheValidationWarnTrap | Warning | The Cache Agent has detected anomalies with cache group cache-group-name that may prevent it from properly refreshing the cache group. Please refer to the user error log for details. |
ttCacheValidationAbortedTrap | Error | The Cache Agent aborted cache group validation because of a fatal error. Please refer to the user error log for details. |
ttDSCkptFailedTrap | Error | A checkpoint has failed. Check the user error log and view the checkpoint history using the built-in procedure ttCkptHistory. |
ttDaemonOutOfMemoryTrap | Error | Call to malloc failed in TimesTen daemon. |
ttDSDataCorruptionTrap | Error | Data store corruption error has occurred. |
ttDSGoingInvalidTrap | Error | Setting data store to invalid state. Data store invalidation usually happens when an application that is connected to the data store is killed or exits abruptly without first disconnecting from the data store. If TimesTen encounters an unrecoverable internal error during a database operation, it may also invalidate the data store. You must commit or rollback and recover the data store. |
ttDSThreadCreateFailedTrap | Error | A process (typically multi-threaded) having multiple connections to a data store exits abnormally. The subdaemon assigned to clean up the connections creates a separate thread for each connection. If creation of one of these threads fails, this trap is thrown. Thread creation may fail due to memory limitations or having too many threads in the system. After the trap is thrown, the thread creation is attempted four more times, with an increasingly longer pause between each attempt. The total time between the first and last attempt is approximately 30 seconds. If the fifth attempt fails, the data store is invalidated. |
ttFileWriteErrorTrap | Error | Error encountered during file I/O write. |
ttMainDaemonExitingTrap | Informational | Main or sub daemons exiting normally. |
ttMainDaemonDiedTrap | Error | Main or sub daemons died abnormally. This message is sent by a subdaemon when it notices that the main daemon has died. It suggests that the main daemon has been killed or has crashed. You must restart the main daemon. |
ttMainDaemonReadyTrap | Informational | Main daemon has started. |
ttMsgLogOpenFailedTrap | Error | The message log could not be opened, possibly because of a lack of privileges on the file. Check the file location and privileges. |
ttPartitionSpaceExhaustedTrap | Error | Data store partition (permanent or temporary) space is exhausted. This message is sent when either the permanent or temporary free space in the data store is exhausted. Generally this message is preceded by the ttPartitionSpaceStateTrap warning message. See "PermWarnThreshold" and "TempWarnThreshold" in Oracle TimesTen In-Memory Database Reference for information on how to set the threshold. |
ttPartitionSpaceStateTrap | Warning | Data store partition (permanent or temporary) space is transitioning from OK to low or vice versa. This message is sent when either the permanent partition or the temporary partition free space in the data store reaches a threshold or transitions back below the threshold. This message is sent only when the free space has reached the threshold specified by the PermWarnThreshold or TempWarnThreshold attribute at the time of the first connection to the data store. See "PermWarnThreshold" and "TempWarnThreshold" in Oracle TimesTen In-Memory Database Reference for information on how to set the threshold. |
ttQueryThresholdWarnTrap | Warning | A SQL query exceeded the user-specified threshold. The text of the query can be found in the user log message. The Transaction ID and the Statement ID of the query can be found both in the trap and the user log message. After issuing the trap, the query continues executing. |
ttRepAgentClockSkewTrap | Error | Replication with a peer failed due to excessive clock skew. The skew between nodes in an active standby scheme has exceeded the allowed limit of 250ms. |
ttRepAgentDiedTrap | Error | A replication agent has died abnormally. This message is sent when the main TimesTen daemon notices that a replication agent has died abnormally. This generally means that the replication agent has been killed or has crashed. |
ttRepAgentExitingTrap | Informational | Replication agent exiting normally. |
ttRepAgentStartingTrap | Informational | Replication agent starting. |
ttRepCatchupStartTrap | Warning | Indicates that TimesTen has begun to restore a master from a subscriber where bi-directional replication has been configured, after a failure. |
ttRepCatchupStopTrap | Warning | Indicates that TimesTen has restored a master data store from a subscriber, where bi-directional replication was configured. |
ttRepConflictReportStartingTrap | Informational | Indicates that conflict reporting has been restarted because the rate of conflicts has fallen below the low water mark set in the replication scheme. This trap also indicates how many conflicts went unreported during the period in which reporting was suspended. |
ttRepConflictReportStoppingTrap | Informational | Indicates that suspension of conflict reporting has occurred because the rate of conflicts has exceeded the high water mark set in the replication scheme. |
ttRepReturnTransitionTrap | Warning | Replication return receipt has been enabled or disabled on the subscriber. |
ttRepSubscriberFailedTrap | Error | Subscriber marked as failed because too much log accumulated on its behalf by the master. |
ttRepTCPFailedTrap | Error | A replication TCP connection failed. |
ttRepUpdateFailedTrap | Warning | A replication insert, update or delete operation failed. |
ttSnmpTrap_AsyncMVFailed | Warning | A refresh of an asynchronous materialized view failed. The SNMP trap includes dsname, daemon PID and viewid. If the failure is due to a transient error, such as locking, the next refresh may succeed. |
ttUnexpectedEndOfLogTrap | Error/Warning | Premature end of log file reached during a data store recovery. If your application connected with LogAutoTruncate=1 (the default), this trap represents a warning and recovery continues with error messages. If your application connected with LogAutoTruncate=0, recovery fails with error messages. |
A typical TimesTen trap, printed with snmptrapd, looks like this:

    Enterprise Specific Trap (ttDSGoingInvalidTrap) Uptime: 4:34:16
    enterprises.timesten.ttSystem.ttTimeStamp = "2002-07-20 22:24:49 (GMT)"
    enterprises.timesten.ttSystem.ttPid = 127
    enterprises.timesten.ttSystem.ttUid = "SYSTEM"
    enterprises.timesten.ttSystem.ttVersion = "@(#) TimesTen Revision: 11.2.1.0.0 Date: 2008/07/07 18:24:10, instance giraffe"
    enterprises.timesten.ttMsg, ttMesg "Data store going Invalid (from master daemon)"
    enterprises.timesten.ttDataStore.ttDSName = "tptbmdata1121"
    enterprises.timesten.ttDataStore.ttDSShmKey = "DBI39775920.0.SHM.12"
    enterprises.timesten.ttDataStore.ttDSNConn = 2

This trap was generated from a TimesTen daemon running on a Windows system. The hostname of the system sending the trap is printed by snmptrapd. The Uptime field, which is required by SNMP, lists the elapsed time since the start of the process which generated this trap. In this case, the process ttsrv1121.exe has been running for 4 hours, 34 minutes, and 16 seconds.
This specific trap reports that the data store is going invalid. In addition to the common fields, it reports the data store name, the shared memory key of the data store, and the number of current connections to the data store.
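A script that acts on logged traps usually needs these fields as data rather than as text. The following Python sketch extracts the "name = value" pairs from output of the kind shown above. It is an illustration only: the function name parse_varbinds is hypothetical, and the sketch assumes that each variable is printed on its own line, that text values are enclosed in double quotes, and that unquoted values are integers.

```python
def parse_varbinds(text: str) -> dict:
    """Map the last component of each variable name to its printed value."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(" = ")
        if not sep:
            continue  # not a "name = value" line, for example the header
        short = name.rsplit(".", 1)[-1]  # "...ttSystem.ttPid" -> "ttPid"
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            fields[short] = value[1:-1]  # text value: drop the quotes
        else:
            try:
                fields[short] = int(value)  # integer value such as ttPid
            except ValueError:
                fields[short] = value  # keep anything else unchanged
    return fields
```

For the trap above, the result would include the data store name under "ttDSName" and the connection count under "ttDSNConn" as an integer, so a recovery script could, for instance, select its action based on which data store went invalid.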