Oracle® TimesTen In-Memory Database Troubleshooting Procedures Guide Release 11.2.1 Part Number E13075-02 |
|
|
View PDF |
This chapter describes how to troubleshoot some of the problems you may encounter when replicating data stores.
This chapter includes the following topics:
This section describes what to check if you are unable to use CREATE REPLICATION to create a replication scheme.
Possible cause | What to do |
---|---|
You do not have ADMIN privilege | You must have ADMIN privilege to use the CREATE REPLICATION or DROP REPLICATION statements. |
Incorrect data store name, host name, or element name. |
|
The local host is not part of the replication scheme. | Create the replication scheme on a host that will be part of the replication scheme. |
Replication tables defined in the CREATE REPLICATION statement do not exist. | The name, owner, and column definitions of the tables participating in the replication scheme must be identical on both the master and subscriber data stores. Use CREATE TABLE to create tables on the data store, or use the ttRepAdmin -duplicate utility or the ttRepDuplicateEx C function to duplicate the entire data store to be replicated. |
Other problems | Review the procedures and requirements described in "Defining Replication Schemes" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide. |
This section describes what to check if you are unable to use ALTER REPLICATION to alter a replication scheme.
Possible cause | What to do |
---|---|
You do not have ADMIN privilege | You must have ADMIN privilege to use the ALTER REPLICATION statement. |
Replication agent in Start state | Most ALTER REPLICATION operations are supported only when the replication agent is stopped (ttAdmin -repStop ). Stop the replication agents on both master and subscriber data stores, alter the replication scheme on both master and subscriber data stores, then restart both replication agents. |
Incorrect data store name, host name, or element name |
|
Replication table defined in the ALTER REPLICATION statement does not exist | Use CREATE TABLE to create a table on the data store. |
Other problems | Review the procedures and requirements described in "Altering Replication" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide. |
This section describes what to check if you are unable to start or stop a replication agent.
Possible cause | What to do |
---|---|
You do not have ADMIN privileges | You must have ADMIN privileges to use the ttAdmin utility or the ttRepStart or ttRepStop procedures to start or stop a replication agent. |
TimesTen daemon not started | Check the state of the TimesTen daemon, as described in "Check the TimesTen user error log". If necessary, start the TimesTen daemon as described in "Working with the Oracle TimesTen Data Manager Daemon" in the Oracle TimesTen In-Memory Database Operations Guide. |
Data store does not participate in a replication scheme. | If a data store does not participate in a replication scheme, attempts to start a replication agent for that data store will fail. Use CREATE REPLICATION to create a replication scheme for the data store. |
TimesTen can send SNMP traps for certain replication events to enable network management software to take immediate action. TimesTen can send the following traps for replication events:
ttRepAgentExitingTrap
ttRepAgentDiedTrap
ttRepAgentStartingTrap
ttRepCatchupStartTrap
ttRepCatchupStopTrap
ttRepReturnTransitionTrap
ttRepSubscriberFailedTrap
ttRepUpdateFailedTrap
These traps are described in "Diagnostics through SNMP Traps" in the Oracle TimesTen In-Memory Database Error Messages and SNMP Traps.
If you are unable to get replication working between a master and subscriber data store, the problem may be one or more of the following:
Possible cause | See... |
---|---|
TimesTen daemon and/or replication agents not running | "Check status of TimesTen daemon and replication agents" |
Master and subscriber agents not communicating | "Check that replication agents are communicating" |
Replication not in Start state | "Check replication state" |
Error in replication scheme | "Check replication scheme configuration" |
Inconsistent owner names for replication scheme and tables | "Check owner names" |
Inconsistent replication tables | "Check consistency between replicated tables" |
Use the ttStatus
utility to confirm the main TimesTen daemon is running and the replication agents are started for all of your master and subscriber data stores. The output from a simple replication scheme using a single master and subscriber data store should look like that shown in Example 6-1.
If the TimesTen daemon is running, but the replication agents are not, the output looks like that shown in Example 6-2. In this case, start the replication agents as described in "Starting and stopping the replication agents" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide.
If neither the TimesTen daemon or replication agents are running, the output looks like that shown in Example 6-3. In this case, confirm you have correctly installed TimesTen and the Data Manager service is started, as described in "TimesTen Installation" in the Oracle TimesTen In-Memory Database Installation Guide.
Example 6-1 ttStatus output for one master and one subscriber
C:\>ttStatus TimesTen status report as of Thu Jan 25 16:23:33 2007 Daemon pid 5088 port 17000 instance MYINSTANCE TimesTen server pid 4344 started on port 17002 TimesTen webserver pid 4216 started on port 17004 ------------------------------------------------------------------------ Data store c:\temp\subscriber1ds There are 12 connections to the data store Data store is in shared mode Shared Memory KEY Global\DBI45b9471c.2.SHM.2 HANDLE 0x280 Type PID Context Connection Name ConnID Process 1244 0x00d08fb0 subscriber1ds 1 Replication 4548 0x00aed2f8 LOGFORCE 4 Replication 4548 0x00b03470 TRANSMITTER 5 Replication 4548 0x00b725a8 RECEIVER 6 Replication 4548 0x00b82808 REPHOLD 2 Replication 4548 0x00b98980 REPLISTENER 3 Subdaemon 2752 0x00526768 Worker 2042 Subdaemon 2752 0x0072a758 Flusher 2043 Subdaemon 2752 0x007308c0 Checkpoint 2044 Subdaemon 2752 0x00736a28 HistGC 2046 Subdaemon 2752 0x067f02f8 Aging 2045 Subdaemon 2752 0x068364a0 Monitor 2047 Replication policy : Manual Replication agent is running. Cache agent policy : Manual ------------------------------------------------------------------------ Data store c:\temp\masterds There are 12 connections to the data store Data store is in shared mode Shared Memory KEY Global\DBI45b945d0.0.SHM.6 HANDLE 0x2bc Type PID Context Connection Name ConnID Process 5880 0x00d09008 masterds 1 Replication 3728 0x00aed570 LOGFORCE 4 Replication 3728 0x00b036e8 TRANSMITTER 5 Replication 3728 0x00b168b8 REPHOLD 3 Replication 3728 0x00b1ca20 REPLISTENER 2 Replication 3728 0x00b22b88 RECEIVER 6 Subdaemon 3220 0x00526768 Worker 2042 Subdaemon 3220 0x0072e768 Flusher 2043 Subdaemon 3220 0x007348d0 Checkpoint 2044 Subdaemon 3220 0x067b0068 Aging 2045 Subdaemon 3220 0x067c0040 Monitor 2047 Subdaemon 3220 0x068404c8 HistGC 2046 Replication policy : Manual Replication agent is running. Cache agent policy : Manual ------------------------------------------------------------------------ Data store c:\temp\demo There are no connections to the data store Replication policy : Manual Cache agent policy : Manual ------------------------------------------------------------------------ End of report
Example 6-2 Replication agent is not running
> ttStatus TimesTen status report as of Tue Oct 28 10:31:30 2006 Daemon pid 3396 port 15000 instance MYINSTANCE TimesTen server pid 3436 started on port 15002 ----------------------------------------------------------------- Data store c:\temp\subscriberds There are no connections to the data store cache agent restart policy: manual ----------------------------------------------------------------- Data store c:\temp\masterds There are no connections to the data store cache agent restart policy: manual ----------------------------------------------------------------- End of report
Use ttRepAdmin
-receiver -list
to see that the replication agents are communicating with each other. If the masterds
data store is replicating to subscriberds
, the output should look similar to the following:
Example 6-4 Check that the replication agents are communicating
> ttRepAdmin -receiver -list masterDSN Peer name Host name Port State Proto ---------------- ------------------------ ------ ------- ----- SUBSCRIBERDS MYHOST Auto Start 10 Last Msg Sent Last Msg Recv Latency TPS RecordsPS Logs ------------- ------------- ------- ------- --------- ---- 0:01:12 - 19.41 5 52 2
Use the ttReplicationStatus
procedure to check state of the subscriber data store with respect to its master. If the subscriber is in the Stop
, Pause
, or Failed
state, use the ttReplicationStatus
procedure to reset the subscriber state to Start
, as described in "Setting the replication state of subscribers" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide.
Example 6-5 Obtain status of the subscriber data store from the master data store
Use ttReplicationStatus
to obtain the status of the subscriberds
data store from its master data store, masterDSN
, enter:
> ttIsql masterDSN Command> CALL ttReplicationStatus ('subscriberds'); < SUBSCRIBERDS, MYHOST, 0, pause, 1, 10, REPSCHEME, REPL > 1 row found.
To reset state to Start
call the ttRepSubscriberStateSet
procedure:
Command> CALL ttRepSubscriberStateSet('REPSCHEME', 'REPL', 'SUBSCRIBERDS', 'MYHOST', 0) Command> CALL ttReplicationStatus ('subscriberds'); < SUBSCRIBERDS, MYHOST, 0, start, 1, 152959, REPSCHEME, REPL > 1 row found.
This section describes some procedures you can use to confirm the correct configuration of the various components in your replicated system. The basic procedure categories are:
Use ttRepAdmin
-showconfig
to confirm the configuration of your replication scheme.
What to look for:
Are all of the subscriber agents started and reported to be in the Start
state? If not, reset the agents to the Start
state. See "Setting the replication state of subscribers" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide.
Do the reported Peer names match the names given in the DataStore attributes in the DSN definitions for the replicated data stores? Replication does not work if you specified the names given for the Data Source Name attributes.
Is there anything under List of subscribers? If not, confirm the data store names you specified in the DSN definition are consistent with those you specified in your replication scheme configuration file.
Are the Host names correct? If in doubt, see "Check host names".
Are the correct table names displayed under Table details? If not, correct the table names in your replication scheme configuration file.
Example 6-6 Confirm the configuration of the replication scheme
> ttRepAdmin -showconfig masterDSN Self host "MYHOST", port auto, name "MASTERDS", LSN 4/2970276, timeout 120, threshold 0 List of subscribers ----------------- Peer name Host name Port State Proto ---------------- ------------------------ ------ ------- ----- SUBSCRIBERDS MYHOST Auto Start 10 Last Msg Sent Last Msg Recv Latency TPS RecordsPS Logs ------------- ------------- ------- ------- --------- ---- 0:01:12 - 19.41 5 52 2 List of tables and subscriptions -------------------------------- Table details ------------- Table : REPL.TAB Master Name Subscriber Name ----------- ------------- MASTERDS SUBSCRIBERDS
Check the TTREP.TTSTORES
table to confirm that replication associates the replication scheme with the local data store.
Example 6-7 Confirm that the replication scheme is associated with the local data store
Connect to the data store and enter:
SELECT * FROM ttrep.ttstores WHERE is_local_store <> 0x0;
Command> select * from ttrep.ttstores where is_local_store <> 0x0; < -5193371075573733683, MYHOST, MASTERDS, 01, 0, 0, 4, 0 > 1 row found.
There should be exactly one row returned. If more than one row is returned, contact Technical support. If no rows are returned, then none of the hosts returned by the following statement is perceived to be a local system by TimesTen replication:
SELECT DISTINCT host_name FROM ttrep.ttstores;
It may also be that none of the data store names specified in your replication scheme match those specified in your DSN descriptions.
Some hosts or IP addresses specified in a replication scheme cannot be resolved by the replication agent because:
Host names or IP addresses specified in the replication scheme are wrong or misspelled.
Host names or IP addresses cannot be resolved or found by DNS or in the /etc/hosts
file
Entries in the /etc/hosts
file are incorrectly ordered in appearance. This error is most common when multiple NICs are used. You must have root privilege to make changes to the /etc/hosts
files.
See "Configuring host IP addresses" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for details on how to configure DNS and /etc/hosts
files for host machines used for replication.
To check if a host name in the replication scheme matches the host name of the local machine, write an application to perform these tasks:
Use a gethostname
OS function call to determine the host name of the running host.
Call gethostbyname
with the output from Step 1.
Call gethostbyname
with the host name specified in the replication scheme.
Compare output of Step 2 and Step 3. If there is a match, then the running host is involved in replication. Otherwise, it is not involved in replication.
As described in "Table requirements and restrictions" and "Owner of the replication scheme and tables" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide, the owner names of your replication scheme and your replicated tables must be consistent across all participating data stores.
Check the owner name assigned to your replication scheme by calling the ttIsql
repschemes
command or by listing the contents of the TTREP.REPLICATIONS
table.
Example 6-8 shows that the replication scheme name, REPSCHEME
, has a consistent owner name (REPL
) in the data stores on both SYSTEM1
and SYSTEM2
. Example 6-9 shows the scheme name with inconsistent owner names. This can occur if you omit the owner name from the replication scheme definition and the system uses the Id of the replication scheme creator.
Example 6-8 Consistent owner names for replication scheme
On SYSTEM1
:
> ttIsql masterDSN Command> select * from ttrep.replications; < REPSCHEME , REPL , C, 0, 0, -1 > 1 row found.
On SYSTEM2
:
> ttIsql -connStr "dsn=subscriberDSN" Command> select * from ttrep.replications; < REPSCHEME , REPL , C, 0, 0, -1 > 1 row found.
Example 6-9 Inconsistent owner names for replication scheme
On SYSTEM1
:
> ttIsql masterDSN Command> select * from ttrep.replications; < REPSCHEME , SYSTEM1 , C, 0, 0, -1 > 1 row found.
On SYSTEM2
:
> ttIsql -connStr "dsn=subscriberDSN" Command> select * from ttrep.replications; < REPSCHEME , SYSTEM2 , C, 0, 0, -1 > 1 row found.
Check the owner names assigned to the tables in each data store by using the ttIsql
tables
command.
Example 6-10 Consistent table owner names
This example shows that the TAB
table has a consistent owner name (REPL
) in the data stores on both SYSTEM1
and SYSTEM2
.
Output for SYSTEM1 | Output for SYSTEM2 |
---|---|
SYS.CACHE_GROUP |
SYS.CACHE_GROUP |
SYS.COLUMNS |
SYS.COLUMNS |
SYS.COL_STATS |
SYS.COL_STATS |
SYS.INDEXES |
SYS.INDEXES |
SYS.MONITOR |
SYS.MONITOR |
SYS.PLAN |
SYS.PLAN |
SYS.TABLES |
SYS.TABLES |
SYS.TBL_STATS |
SYS.TBL_STATS |
SYS.TRANSACTION_LOG_API |
SYS.TRANSACTION_LOG_API |
REPL.TAB |
REPL.TAB |
TTREP.REPELEMENTS |
TTREP.REPELEMENTS |
TTREP.REPLICATIONS |
TTREP.REPLICATIONS |
TTREP.REPPEERS |
TTREP.REPPEERS |
TTREP.REPSTORES |
TTREP.REPSTORES |
TTREP.REPSUBSCRIPTIONS |
TTREP.REPSUBSCRIPTIONS |
TTREP.REPTABLES |
TTREP.REPTABLES |
TTREP.TTSTORES |
TTREP.TTSTORES |
Example 6-11 Inconsistent table owner names
This example shows the TAB
table with inconsistent owner names, which were automatically assigned for each host.
Output for SYSTEM1 | Output for SYSTEM2 |
---|---|
SYS.CACHE_GROUP |
SYS.CACHE_GROUP |
SYS.COLUMNS |
SYS.COLUMNS |
SYS.COL_STATS |
SYS.COL_STATS |
SYS.INDEXES |
SYS.INDEXES |
SYS.MONITOR |
SYS.MONITOR |
SYS.PLAN |
SYS.PLAN |
SYS.TABLES |
SYS.TABLES |
SYS.TBL_STATS |
SYS.TBL_STATS |
SYS.TRANSACTION_LOG_API |
SYS.TRANSACTION_LOG_API |
SYSTEM1.TAB |
SYSTEM2.TAB |
TTREP.REPELEMENTS |
TTREP.REPELEMENTS |
TTREP.REPLICATIONS |
TTREP.REPLICATIONS |
TTREP.REPPEERS |
TTREP.REPPEERS |
TTREP.REPSTORES |
TTREP.REPSTORES |
TTREP.REPSUBSCRIPTIONS |
TTREP.REPSUBSCRIPTIONS |
TTREP.REPTABLES |
TTREP.REPTABLES |
TTREP.TTSTORES |
TTREP.TTSTORES |
Replicated tables on both master and subscriber data stores must be exactly the same.
Example 6-12 Check consistency between replicated tables
This output from the user error log shows a mismatch on the number of columns for the subscriber table TTUSER.MYDSN
.
11:37:58.00 Info: REP: 9430: REP1:transmitter.c(4936): TT16136: Sending definition for table TTUSER.MYDSN (1 columns) 11:37:58.00 Info: REP: 9412: REP2:receiver.c(5928): TT16193: Adding definition for table: TTUSER.MYDSN 11:37:58.00 Info: REP: 9412: REP2:meta.c(5580):TTUSER.MYDSN ptn 0: srcoff 0, destoff 0, length 8 11:37:58.00 Info: REP: 9412: REP2:meta.c(5580):TTUSER.MYDSN ptn 1: srcoff 8, destoff 12, length 12 11:37:58.00 Err : REP: 9412: REP2:receiver.c(6203): TT16198: Table definition mismatch on number of columns for table TTUSER.MYDSN. Local definition: 2; transmitting peer: 1 11:37:58.00 Err : REP: 9412: REP2:receiver.c(6380): TT16204: Table TTUSER.MYDSN marked invalid. Will not apply transactions received for it until a valid definition is received 11:37:58.00 Err : REP: 9412: REP2:receiver.c(7200): TT16078: Table definition for ID 637068 is invalid (Original failure 11:37:58 REP2:receiver.c(6203): TT16198: Table definition mismatch on number of columns for table TTUSER.MYDSN. Local definition: 2; transmitting peer: 1) 11:37:58.00 Err : REP: 9412: REP2:receiver.c(5002): TT16187: Transaction 1173958671/2; Error: transient 0, permanent 1
Table summary is in the first heading cell.
Possible cause | See... |
---|---|
Failed subscriber | "Check replication state" |
Return-receipt timeout period too long | "Check return receipt timeout setting" |
Use the ttReplicationStatus
procedure to check state of the subscriber data store with respect to its master. If the subscriber is in the Failed
state, see "Managing data store failover and recovery" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for information on how to deal with failed data stores.
Example 6-13 Check replication state
Use ttReplicationStatus
to obtain the status of the subscriberds
data store from its master data store, masterDSN
, enter:
> ttIsql masterDSN Command> CALL ttReplicationStatus ('subscriberds'); < SUBSCRIBERDS, MYHOST, 0, failed, 1, 10, REPSCHEME, REPL > 1 row found.
Use the ttRepSyncGet
procedure to check the return receipt timeout setting. A value of -1 indicates the application is to wait until it receives an acknowledgement from the subscriber. Network latency or other issues might delay receipt of the subscriber acknowledgment. You either address these issues or use the ttRepSyncGet
procedure to reset the return receipt timeout period.
See "Checking the status of return service transactions" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for more information.
Most of this section addresses issues that may impact replication performance. Some issues, such as log buffer too small and reading from the transaction log files on disk, can impact the performance of both replication and XLA applications.
Possible cause | See... |
---|---|
Slow network | "Check network bandwidth" |
Using RETURN RECEIPT | "Check use of return receipt blocking" |
Inefficient replication scheme | "Check replication configuration" |
Log buffer too small | "Check size of log buffer" |
Frequent or inefficient disk writes | "Check durability settings" |
Reading from transaction log files on disk rather than the log buffer | "Check for reads from transaction log files" |
High rate of conflicts | "Conflict reporting slows down replication" |
Replication agents typically communicate over some type of network connection. If replicating over a network slower than 10 MB per second (such as common with a 100 Base-T Ethernet typical in a LAN), you must be careful to match the transaction load to the available bandwidth of the network. see "Network bandwidth requirements" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for details.
Unless you need receipt confirmation for all your transactions, disable RETURN RECEIPT BLOCKING. If you require receipt confirmation for some transactions, then set RETURN RECEIPT BY REQUEST and call the ttRepSyncSet
procedure to enable the return receipt service for specific transactions. See "RETURN RECEIPT BY REQUEST" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for details.
Note:
The performance degradation caused by return-receipt becomes less of an issue when multiple applications (or threads) are updating the data store. If you must use return-receipt in a transaction, you can improve the performance of your application by using multiple threads to update the data store. Though each thread must block for receipt confirmation, the other threads are free to make updates.In addition to return-receipt setting described above, other factors related to the configuration of your replication scheme could impact replication performance. As described in "Performance and recovery trade-offs" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide, you often have to weigh the ability to efficiently failover and recover a data store against replication performance.
Topics that might be of interest include:
Setting your log buffer too small may impact replication performance. Instead, Set the LogBufMB
DSN attribute to a larger size. For more information on this DSN attribute, see "Setting attributes for logging" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide
You can improve replication performance by setting TRANSMIT NONDURABLE on the replication ELEMENT to eliminate the flush-log-to-disk operation from the replication cycle. See "Setting transmit durability on data store elements" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for details.
Enabling the DURABLE COMMIT option in your replication scheme also impacts performance. See "DURABLE COMMIT" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide for more information.
In some situations a "log reader," such as a master replication agent 'transmitter' thread or a ttXlaNextUpdate
call in an XLA application, may not be able to keep up with the update rate of the applications writing to the data store. Normally, replication and XLA readers get update records from the log buffer in memory. When the readers fall behind the application update rate, transaction log files can accumulate on the disk until the backlog can be cleared. This forces the readers to read transactions from the transaction log files on disk, which is much slower. Should you detect reads from the transaction log files, you may want to respond by decreasing the rate of application updates to that sustainable by the log readers.
Applications can monitor whether log readers are obtaining update records from transaction log files on disk rather than from the log buffer in memory by tracking the SYS.MONITOR table entry LOG_FS_READS. For example, you can check the value of LOG_FS_READS for the data store, MASTERDSN
, with the following ttIsql
command:
% ttIsql -v1 -e "select log_fs_reads from monitor; quit;" -connStr dsn=MASTERDSN
If the LOG_FS_READS counter is increasing, the log readers are falling behind or clearing out a backlog in the transaction log files.
For more complete monitoring of replication progress, create a simple shell script like the following:
!/bin/sh trap exit 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 DSN=$1 while [ 1 ] ; do date ttRepAdmin -receiver -list -connStr dsn=$DSN echo -n "Log reads from disk: " ttIsql -v1 -e "select log_fs_reads from monitor; quit;" -connStr dsn=$DSN echo ttRepAdmin -bookmark -connStr dsn=$DSN sleep 15 done
Example 6-14 Check the status of the transaction log
For example, you name the above script monitorLog
and your replication scheme replicates from the MASTERDSN
data store to the SUBSCRIBER1DSN
data store. You can then check the status of the transaction log by entering:
$ monitorLog masterdsn
This generates output similar to the following:
Mon Aug 2 10:44:40 2004 Peer name Host name Port State Proto ---------------- ------------------------ ------ ------- ----- SUBSCRIBER1DSN MYHOST Auto Start 12 Last Msg Sent Last Msg Recv Latency TPS RecordsPS Logs ------------- ------------- ------- ------- --------- ---- 00:00:05 - -1.00 -1 -1 1 Log reads from disk: < 0 > Replication hold LSN ...... 10/2656136 Last written LSN .......... 10/4015824 Last LSN forced to disk ... 10/3970152
The output from the script displays an updated status every 15 seconds until you enter Ctrl-C to exit.
Following the date in the output in Example 6-14 is the name of the subscriber, its host, and so on. Next is latency and rate information, as well as the number of transaction log files being retained on behalf of this subscriber. The specific meaning of each value is described in "From the command line: ttRepAdmin -receiver -list" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide. The main interest here is the 'Last Msg Sent' and 'Logs' values. The 'Last Msg Sent' value indicates the elapsed time since the last message was sent by the master to the subscriber and 'Logs' indicates how many transaction log files behind the replication log reader is from the current log insertion point used by the writers (Last written LSN).
Normally the 'Logs' value should be '1', as shown in Example 6-14. A steadily increasing 'Logs' value indicates latency is increasing and eventually log reads are satisfied from disk.
Note:
If the LogBufMB is larger than the LogFileSize, an increase in the 'Logs' value does not necessarily mean the log readers are reading from the transaction log files. This is because the log manager does not allow more than one log file's worth of data to be outstanding before writing it to the file system. After the log manager writes the data, the data remains in the log buffer to be read directly by the log readers. So, when the LogBufMB is larger than the LogFileSize, the 'Logs" value alone may not be the best measure of whether log readers are reading from memory or from disk.The output from:
ttRepAdmin -bookmark -connStr dsn=$DSN
displays the number of the transaction log files and the location of the bookmarks set by the log manager, as described in "From the command line: ttRepAdmin -bookmark" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide. The difference between the Replication hold LSN and the last written LSN indicates the number of records in the transaction log that have not yet been transmitted to the subscribers. A steady increase in the difference between these values is another indication that replication latency is increasing and log file reads are likely to occur.
Example 6-15 Log reader must read from transaction log files
In this example, assume the LogBufMB
is 16MB and the LogFileSize
is 8MB. The following output indicates the log reader is approximately 1.8 MB behind the capacity of the log buffer and must read from the transaction log files, 14 and 15.
Peer name Host name Port State Proto ---------------- ------------------------ ------ ------- ----- SUBSCRIBER1DSN MYHOST Auto Start 12 Last Msg Sent Last Msg Recv Latency TPS RecordsPS Logs ------------- ------------- ------- ------- --------- ---- 00:00:03 - -1.00 -1 -1 4 Log reads from disk: <20> Replication hold LSN ...... 14/7007464 Last written LSN .......... 17/465336 Last LSN forced to disk ... 17/456152
This section includes the following topics:
If you connected to your new subscriber DSN before running ttRepAdmin
-duplicate
, the data store has already been created. In this situation, -duplicate
returns:
Error : Restore not done : The datastore already exists. Unable to restore datastore locally
Confirm the existence of the data store by running ttStatus
and checking to see if the data store is in the returned list. If the new subscriber data store exists, destroy it and try ttRepAdmin
-duplicate
again:
> ttDestroy /tmp/newstore > ttRepAdmin -dsn newstoreDSN -duplicate -name newstore -from masterds -host "server1"
If you have made an error entering the subscriber data store name or host name in the replication scheme, you may see something like the following:
Unable to swap datastore locally No receiver NEWSTORE on SERVER2 found to swap with
If you have more than one scheme specified in your TTREP.REPLICATIONS
table, some ttRepAdmin
commands may return the error:
Must specify -scheme to identify which replication scheme to use
To check the names of the replication schemes used by your data store, use the ttIsql
utility to connect, and enter:
Command> SELECT * from TTREP.REPLICATIONS;
Example 6-16 Two replication schemes assigned to the data store
This example shows that two replication schemes, REPSCHEME1
and REPSCHEME2
, are assigned to the data store associated with subDSN
. In this case, it is necessary to use the ttRepAdmin
-scheme
option.
> ttIsql -connStr "dsn=subDSN" Command> SELECT * from TTREP.REPLICATIONS; < REPSCHEME1 , REPL , C, 0, 0, -1 > < REPSCHEME2 , REPL , C, 0, 0, -1 > 2 rows found. Command> exit > ttRepAdmin -dsn subDSN -receiver -list -scheme REPSCHEME1 Peer name Host name Port State Proto ---------------- ------------------------ ------ ------- ----- SUBSCRIBER1 MYHOST Auto Start 10 Last Msg Sent Last Msg Recv Latency TPS RecordsPS Logs ------------- ------------- ------- ------- --------- ---- 0:01:12 - 19.41 5 52 2
This section includes the following topics:
When attempting to set CHECK CONFLICTS for an element in a CREATE REPLICATION statement, you may encounter an error similar to the following:
8004: Column REPL.TABS.TS cannot be used for replication timestamp checking if in an index or added by ALTER TABLE; and must be binary(8) with NULL values allowed.
In this situation, check:
That the timestamp column in the specified table is a nullable column of type BINARY(8). In the above example, the TS column in the REPL.TAB
table should have a type of BINARY(8).
The timestamp column is defined in the original CREATE TABLE statement, rather than added later using ALTER TABLE.
You may receive an error similar to the following:
2208: Column TS does not exist in table.
In this situation, confirm that you have specified the correct name for the timestamp COLUMN in the CHECK CONFLICTS clause and that it exists in the specified table.
Also, make sure the timestamp column is not part of a primary key or index.
If you have configured replication to check conflicts, TimesTen sends reports to the local host. You can also configure a report file. See "Diagnostics through SNMP Traps" in the Oracle TimesTen In-Memory Database Error Messages and SNMP Traps.
If there is a large number of conflicts in a short period of time, subscriber performance can slow down because of the reporting requirements. You can use store attributes in the CREATE REPLICATION or ALTER REPLICATION statements to suspend and resume conflict reporting at specified rates of conflict:
CONFLICT REPORTING SUSPEND AT rate
CONFLICT REPORTING RESUME AT rate
Information about conflicts that occur while reporting is suspended cannot be retrieved.
See "Conflict reporting" in the Oracle TimesTen In-Memory Database TimesTen to TimesTen Replication Guide.