Oracle® TimesTen In-Memory Database Operations Guide, Release 11.2.1
An application using the TimesTen Data Manager should obtain an order of magnitude performance improvement in its data access over an application using a traditional DBMS. However, poor application design and tuning can erode the TimesTen advantage. This chapter discusses factors that can affect the performance of a TimesTen application. These factors range from subtle, such as data conversions, to more overt, such as preparing a command at each execution.
This chapter explains the full range of these factors, with a section on each factor indicating:
How to detect problems.
How large the potential performance impact is.
Where the performance gains are.
What the tradeoffs are.
As discussed throughout this chapter, many performance problems can be identified by examining the SYS.MONITOR table.
For information on tuning TimesTen Java applications, see "Application Tuning" in the Oracle TimesTen In-Memory Database Java Developer's Guide.
For information on tuning TimesTen C applications, see "Application Tuning" in the Oracle TimesTen In-Memory Database C Developer's Guide.
The following sections include tips for tuning your system and data stores:
Performance impact: Large
Configure your system so that the entire data store fits in main memory. The use of virtual memory substantially decreases performance. You will know that the data store or working set does not fit if a performance monitoring tool shows excessive paging or virtual memory activity.
You may have to add physical memory or configure the system software to allow a large amount of shared memory to be allocated to your process(es). TimesTen includes the ttSize utility to help you estimate the size of your data store.
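For example, the following command line (the table name, row count, and DSN are hypothetical) is a sketch of how ttSize is typically invoked to estimate the space a table will need:
ttSize -tbl sales.orders -rows 1000000 my_dsn
Run it for each large table and allow additional room for indexes and temporary working data when choosing the permanent and temporary partition sizes.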
Performance impact: Variable
When you create a data store, you are required to specify a data store size. Specifically, you specify sizes for the permanent and temporary partitions of the data store. For details on how to size the data store and shared memory, see "Specifying the size of a data store".
Performance impact: Variable
The PL/SQL runtime system uses an area of shared memory to hold metadata about PL/SQL objects in TimesTen and the executable code for PL/SQL program units that are currently being executed or that have recently been executed. The size of this shared memory area is controlled by the PLSQL_MEMORY_SIZE first connect attribute.
When a new PL/SQL program unit is prepared for execution, it is loaded into shared memory. If shared memory space is not available, the cached recently-executed program units are discarded from memory until sufficient shared memory space is available. If all of the PL/SQL shared memory is being used by currently executing program units, then attempts by a new connection to execute PL/SQL may result in out of space errors, such as ORA-04031. If this happens, increase the PLSQL_MEMORY_SIZE.
Even if such out of space errors do not occur, PLSQL_MEMORY_SIZE may be too small. It is less expensive in CPU time to execute a PL/SQL procedure that is cached in shared memory than one that is not cached. In a production application, the goal should be for PLSQL_MEMORY_SIZE to be large enough so that frequently-executed PL/SQL units are always cached. The TimesTen built-in procedure ttPLSQLMemoryStats can be used to determine how often this occurs. The PinHitRatio value returned is a real number between 0 and 1.
1.0: A value of 1.0 means that every PL/SQL execution occurred from the cache.
0.0: A value of 0.0 means that every execution required that the program unit be loaded into shared memory.
The proper value of PLSQL_MEMORY_SIZE for a given application depends on the application. If only a small number of PL/SQL program units are repeatedly executed, then the size requirements can be small. If the application uses hundreds of PL/SQL program units, memory requirements increase.
Performance increases dramatically as the PinHitRatio goes up. In one set of experiments, an application program repeatedly executed a large number of PL/SQL stored procedures. With a larger value for PLSQL_MEMORY_SIZE, the application achieved a PinHitRatio of around 0.90, and the average execution time for a PL/SQL procedure was 0.02 seconds. With a smaller value for PLSQL_MEMORY_SIZE, there was more contention for the cache, resulting in a PinHitRatio of 0.66; in that experiment the average execution time was 0.26 seconds.
The default value for PLSQL_MEMORY_SIZE is 32 MB. This should be sufficient for several hundred PL/SQL program units of reasonable complexity to execute. After running a production workload for some time, check the value of PinHitRatio. If it is less than 0.90, consider increasing PLSQL_MEMORY_SIZE.
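For example, after running a representative workload, you might check the hit ratio from a SQL session; this is a minimal sketch, and the exact output columns may vary by release:
CALL ttPLSQLMemoryStats;
Examine the PinHitRatio value in the result. If it stays below 0.90, raise PLSQL_MEMORY_SIZE (a first connect attribute) and reload the data store.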
Performance impact: Large
Increasing the value of LogBufMB can have a substantial positive performance impact. If LOG_BUFFER_WAITS is increasing, then increase the value of LogBufMB.
The trade-off is that more transactions are buffered in memory and may be lost if the process crashes. If DurableCommits is enabled, increasing the default LogBufMB value does not improve performance.
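For example, you can sample the counter directly with SQL; a minimal sketch:
SELECT log_buffer_waits FROM sys.monitor;
If the value grows between samples taken under load, increase LogBufMB (a first connection attribute) the next time the data store is loaded into memory.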
Performance impact: Variable
A TimesTen data store may be permanent or temporary. A temporary data store disappears when the last connection goes away or when there is a system or application failure.
If you do not need to save the data store to disk, you can save checkpoint overhead by creating a temporary data store.
Temporary data stores are never fully checkpointed to disk, although the log is periodically written to disk. The amount of data written to the log for temporary data stores is less than that written for permanent data stores, allowing better performance for temporary data stores. Checkpoint operations can have significant overhead for permanent data stores, depending on data store size and activity, but have very little impact for temporary data stores. Checkpoints are still necessary to remove log files.
Performance impact: Large
By default, TimesTen loads an idle data store, which is a data store with no connections, into memory when a first connection is made to it. When the final application disconnects from a data store, a delay occurs when the data store is written to disk. If applications are continually connecting and disconnecting from a data store, the data store may be loaded to and unloaded from memory continuously, resulting in excessive disk I/O and poor performance. Similarly, if you are using a very large data store you may want to pre-load the data store into memory before any applications attempt to use it.
To avoid the latency of loading a data store into memory, you can change the RAM policy of the data store to allow data stores to always remain in memory. The trade-off is that since the data store is never unloaded from memory, a final disconnect checkpoint never occurs. So, applications should checkpoint the data store explicitly in order to reduce the disk space taken up by log files.
Alternatively, you can specify that the data store remain in memory for a specified interval of time and accept new connections. If no new connections occur in this interval, TimesTen unloads the data store from memory and checkpoints it. You can also specify a setting to allow a system administrator to load and unload the data store from memory manually.
To change the RAM policy of a data store, use the ttAdmin utility.
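For example, assuming a DSN named my_dsn, the following command lines are a sketch of the available policies:
ttAdmin -ramPolicy always my_dsn
ttAdmin -ramPolicy inUse -ramGrace 60 my_dsn
ttAdmin -ramPolicy manual my_dsn
ttAdmin -ramLoad my_dsn
The first keeps the data store in memory permanently, the second keeps it loaded for 60 seconds after the last disconnect, and the last two give a system administrator manual control over loading and unloading.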
Performance impact: Large
When you duplicate a data store, use the -ramLoad option of the ttAdmin utility. This places the data store in memory, available for connections, instead of unloading it with a blocking checkpoint. See "Avoid connection overhead".
Data store contention can substantially impede application performance.
To reduce contention in your application:
Choose the appropriate locking method. See "Choose the best method of locking".
Distribute data strategically in multiple tables or data stores.
If your application suffers a decrease in performance because of lock contention and a lack of concurrency, reducing contention is an important first step in improving performance.
The LOCK_GRANTS_IMMED, LOCK_GRANTS_WAIT and LOCK_DENIALS_COND columns in the SYS.MONITOR table provide some information on lock contention:
LOCK_GRANTS_IMMED counts how often a lock was available and was immediately granted at lock request time.
LOCK_GRANTS_WAIT counts how often a lock request was granted after the requestor had to wait for the lock to become available.
LOCK_DENIALS_COND counts how often a lock request was not granted because the requestor did not want to wait for the lock to become available.
If limited concurrency results in a lack of throughput, or if response time is an issue, an application can serialize JDBC calls to avoid contention. This can be achieved by having a single thread issue all those calls. Using a single thread requires some queuing and scheduling on the part of the application, which has to trade off some CPU time for a decrease in contention and wait time. The result is higher performance for low-concurrency applications that spend the bulk of their time in the data store.
Performance impact: Medium
All of the TimesTen platform operating systems implement a dynamic file system buffer pool in main memory. If this buffer pool is allowed to be large, TimesTen and the operating system both retain a copy of the file in memory, causing some of the TimesTen shared segment to be paged out.
This behavior may not occur for data stores that are less than half of the installed memory size. On some systems, it is possible to limit the amount of main memory used by the file system. On other systems, this effect is less pronounced. On HP-UX, Solaris and Linux systems, consider using the MemoryLock attribute to specify whether the data store should be locked in memory. If used, the data store cannot be paged out.
On HP-UX, consider the settings for the kernel parameters dbc_min_pct and dbc_max_pct. These parameters control the minimum and maximum percentage of real memory devoted to the file system. The default maximum is 50 percent. TimesTen recommends reducing the maximum to 10 percent.
Performance impact: Medium
During special operations such as initial loading, you can choose different options than you would use during normal operations. In particular, consider using data store-level locking for bulk loading; examples include using ttBulkCp or ttMigrate. These choices can improve loading performance by a factor of two.
An alternative to data store-level locking is to exploit concurrency. Multiple copies of ttBulkCp -i can be run using the -notblLock option. Optimal batching for ttBulkCp occurs by adding the -xp 256 option. ttMigrate can be run with the -numThreads option to load individual or multiple tables concurrently.
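As a sketch (table names, data files, and the DSN are hypothetical), several loads might run in parallel along these lines:
ttBulkCp -i -notblLock -xp 256 my_dsn sales.orders orders.dump
ttBulkCp -i -notblLock -xp 256 my_dsn sales.lineitem lineitem.dump
Each copy loads a different data file, so the work proceeds concurrently without table locks.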
Performance impact: Large
There are two versions of the TimesTen Data Manager driver for each platform, a debugging version and a production version. Unless you are actually debugging, use the production version. The debugging library can be significantly slower. See "Specify the DSN" and "Specify the ODBC driver" for a description of the TimesTen Data Manager drivers for the different platforms.
On Windows, make sure that applications that use the ODBC driver manager use a DSN that accesses the correct TimesTen driver. Make sure that direct-linked applications are linked with the correct TimesTen driver. An application can call the ODBC SQLGetInfo function with the SQL_DRIVER_NAME argument to determine which driver it is using.
Performance impact: Large
Both ODBC and JDBC provide a trace facility to help debug applications. For performance runs, make sure that tracing is disabled except when debugging applications.
To turn JDBC tracing on, use the DriverManager.setLogStream method:
DriverManager.setLogStream(new PrintStream(System.out, true));
By default, tracing is off. You must call this method before you load a JDBC driver. Once you turn tracing on, you cannot turn it off for the entire execution of the application.
Performance impact: Variable
JRockit, IBM and HP provide JVMs that may perform better than the Sun JVM.
Performance impact: Large
If you are planning a replication scheme, then ensure the following:
The setting for LogBufMB should result in the value of LOG_FS_READS in the SYS.MONITOR table being 0 or close to 0. This ensures that the replication agent does not have to read any log records from disk. If the value of LOG_FS_READS is increasing, then increase the log buffer size.
CPU resources are adequate. The replication agent on the master data store will spawn a thread for every subscriber data store. Each thread reads and processes the log independently and needs adequate CPU resources to make progress.
If the sending side and receiving side of the replication scheme are mismatched in CPU power, place the replication receiver on the faster system.
Performance impact: Medium
Use the RecoveryThreads first connection attribute to increase the number of threads that apply changes from the active master data store to the standby master data store from 1 to 2. If you set RecoveryThreads to 2 on the standby, you should also set it to 2 on the active to maintain increased throughput if there is a failover.
You can also set RecoveryThreads to 2 on one or more read-only subscribers in an active standby pair to increase replication throughput from the standby master data store.
Data stores must be hosted on systems that are 2-way or larger to take advantage of setting this attribute to 2.
The following sections include tips for Client/Server tuning:
Performance impact: Large
Using TimesTen Client to access data stores on a remote server machine adds network overhead to your connections. Whenever possible, write your applications to access the TimesTen Data Manager locally, and link the application directly with the TimesTen Data Manager.
By default, connections wait 10 seconds to acquire a lock. To change the timeout interval for locks, use the ttLockWait built-in procedure.
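For example, to shorten the lock timeout to two seconds for the current connection (a minimal sketch):
CALL ttLockWait(2);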
When multiple connections access a data store simultaneously, TimesTen uses locks to ensure that the various transactions operate in apparent isolation. TimesTen supports the isolation levels described in Chapter 8, "Transaction Management and Recovery". It also supports three locking levels: data store-level locking, table-level locking and row-level locking. You can use the LockLevel connection attribute to indicate whether data store-level locking or row-level locking should be used. Use the ttOptSetFlag procedure to set optimizer hints that indicate whether table locks should be used. The default lock granularity is row locks.
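For example, a maintenance session that will touch most of a table might hint the optimizer to use table locks and then restore row-level locking afterwards; this is a sketch, assuming the optimizer flag names 'TblLock' and 'RowLock':
CALL ttOptSetFlag('TblLock', 1);
CALL ttOptSetFlag('RowLock', 0);
-- run the bulk statements here, then restore the defaults:
CALL ttOptSetFlag('TblLock', 0);
CALL ttOptSetFlag('RowLock', 1);
The flags affect statements prepared after the calls are made.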
If there is very little contention on the data store, use table-level locking. It provides better performance and deadlocks are less likely. There is generally little contention on the data store when transactions are short or there are few connections. In that case, transactions are not likely to overlap.
Table-level locking is also useful when a statement accesses nearly all the rows on a table. Such statements can be queries, updates, deletes or multiple inserts done in a single transaction.
TimesTen uses table locks only with serializable isolation. If your application specifies table locks with any other isolation levels, TimesTen overrides table-level locking and uses row locks. However, the optimizer plan may still display table-level locking hints.
Data store-level locking restricts concurrency more than table-level locking, and is generally useful only for initialization operations, such as bulk loading, when no concurrency is necessary. It has better response time than row-level or table-level locking, at the cost of diminished throughput.
Row-level locking is generally preferable when there are many concurrent transactions that are not likely to need access to the same row.
When using row-level locking, applications can run transactions at the SERIALIZABLE or READ_COMMITTED isolation level. The default isolation level is READ_COMMITTED. You can use the Isolation connection attribute to specify one of these isolation levels for new connections.
When running at SERIALIZABLE transaction isolation level, TimesTen holds all locks for the duration of the transaction, so:
Any transaction updating a row blocks writers until the transaction commits.
Any transaction reading a row blocks out writers until the transaction commits.
When running at READ_COMMITTED transaction isolation level, TimesTen only holds update locks for the duration of the transaction, so:
Any transaction updating a row blocks out readers and writers of that row until the transaction commits.
Phantoms are possible. A phantom is a row that appears in one read but not in another read within the same transaction, or that appears in modified form in two different reads within the same transaction, because read locks are released early during the transaction.
You can determine if there is an undue amount of contention on your system by checking for time-out and deadlock errors (errors # 6001, 6002, and 6003). Information is also available in the LOCK_TIMEOUTS and DEADLOCKS columns of the SYS.MONITOR table.
Performance impact: Variable
The TimesTen Client normally communicates with TimesTen Server using TCP/IP sockets. If both the TimesTen Client and TimesTen Server are on the same machine, client applications show improved performance by using a shared memory segment or a UNIX domain socket for inter-process communication (IPC).
To use a shared memory segment as IPC, you must set the server options in the ttendaemon.options file. For a description of the server options, see "Modifying the TimesTen Server daemon options".
In addition, applications that use shared memory for IPC must use a logical server name for the Client DSN with ttShmHost as the Network Address. For more information, see "Creating and configuring Client DSNs on UNIX".
This feature may require a significant amount of shared memory. The TimesTen Server pre-allocates the shared memory segment irrespective of the number of existing connections or the number of statements within all connections.
If your application is running on a UNIX machine and you are concerned about memory usage, applications that use the TimesTen Client ODBC driver may improve performance by using UNIX domain sockets for communication. The performance improvement when using UNIX domain sockets is not as large as when using ShmIPC.
Applications that take advantage of UNIX domain sockets for local connections must use a logical server name for the Client DSN with ttLocalHost as the Network Address. For more information, see "Creating and configuring Client DSNs on UNIX". In addition, make sure that your system kernel parameters are configured to allow the number of connections you require. See "Installation prerequisites" in the Oracle TimesTen In-Memory Database Installation Guide.
Performance impact: Variable
Enable TT_PREFETCH_CLOSE for serializable transactions in client/server applications. In serializable isolation mode, all transactions should be committed when executed, including read-only transactions. When TT_PREFETCH_CLOSE is enabled, the server closes the cursor and commits the transaction after the server has fetched the entire result set for a read-only query. The client should still call SQLFreeStmt(SQL_CLOSE) and SQLTransact(SQL_COMMIT), but these calls are executed in the client and do not require a network round trip between the client and server. TT_PREFETCH_CLOSE enhances performance by decreasing the network traffic between client and server.
Do not use multiple statement handles when TT_PREFETCH_CLOSE is enabled. The server may fetch all of the result set, commit the transaction, and close the statement handle before the client is finished, resulting in the closing of all statement handles.
The following examples show how to use the TT_PREFETCH_CLOSE option with ODBC and JDBC.
This example sets TT_PREFETCH_CLOSE with the SQLSetConnectOption ODBC function. You can also set it with the SQLSetStmtOption ODBC function.
SQLSetConnectOption (hdbc, TT_PREFETCH_CLOSE, TT_PREFETCH_CLOSE_ON);
SQLExecDirect (hstmt, "SELECT * FROM T", SQL_NTS);
while (SQLFetch (hstmt) != SQL_NO_DATA_FOUND) {
  // do the processing
}
SQLFreeStmt (hstmt, SQL_CLOSE);
This example shows how to enable the TT_PREFETCH_CLOSE option with JDBC:
import com.timesten.sql ...
con = DriverManager.getConnection("jdbc:timesten:client:" + DSN);
stmt = con.createStatement();
try {
  ((TimesTenConnection)con).setTtPrefetchClose(true);
} catch (SQLException e) {
  ...
}
rs = stmt.executeQuery("select * from t");
while (rs.next()) {
  // do the processing
}
rs.close();
con.commit();
Performance impact: Large
An application can make a call to SQLTransact with either SQL_NULL_HDBC and a valid environment handle:
SQLTransact (ValidHENV, SQL_NULL_HDBC, fType)
or a valid connection handle:
SQLTransact (SQL_NULL_HENV, ValidHDBC, fType).
If the intention of the application is simply to commit or roll back on a single connection, it should use a valid connection handle when calling SQLTransact.
After you have determined the overall locking and I/O strategies, make sure that the individual SQL statements are executed as efficiently as possible. The following sections describe how to streamline your SQL statements:
Performance impact: Large
Verify that all statements are executed efficiently. For example, use queries that reference only the rows necessary to produce the required result set. If only col1 from table t1 is needed, use the statement:
SELECT col1 FROM t1...
instead of using:
SELECT * FROM t1...
Chapter 10, "The TimesTen Query Optimizer" describes how to view the plan that TimesTen uses to execute a statement. Alternatively, you can use the ttIsql showplan command to view the plan. View the plan for each frequently executed statement in the application. If indexes are not used to evaluate predicates, consider creating new indexes or rewriting the statement or query so that indexes can be used. For example, indexes can only be used to evaluate WHERE clauses when single columns appear on one side of a comparison predicate (equalities and inequalities), or in a BETWEEN predicate.
If this comparison predicate is evaluated often, it would therefore make sense to rewrite
WHERE c1+10 < c2+20
to
WHERE c1 < c2+10
and create an index on c1.
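A sketch of the accompanying index (the index name is illustrative, and the table is assumed to be t1):
CREATE INDEX ix_c1 ON t1(c1);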
The presence of indexes does slow down write operations such as UPDATE, INSERT, DELETE and CREATE VIEW. If an application does few reads but many writes to a table, an index on that table may hurt overall performance rather than help it.
The FIRST keyword can be used to operate on a specific number of rows in the SQL statements, SELECT, UPDATE and DELETE. This attribute can improve throughput and response time. Alternatively, if an application plans to fetch at most one row for a query, and a unique index is not being used to fetch the row, the application should set SQL_MAX_ROW_COUNT to 1. See the Oracle TimesTen In-Memory Database Reference.
Occasionally, the system may create a temporary index to speed up query evaluation. If this happens frequently, it is better for the application itself to create the index. The CMD_TEMP_INDEXES column in the MONITOR table indicates how often a temporary index was created during query evaluation.
If you have implemented time-based aging for a table or cache group, create an index on the timestamp column for better performance of aging. See "Time-based aging".
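For example, if a hypothetical table orders ages out rows based on a timestamp column named ts, an index such as the following helps the aging process locate qualifying rows:
CREATE INDEX orders_ts_ix ON orders(ts);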
Performance impact: Variable
The TimesTen Data Manager supports hash, range, and bitmap indexes. Each index structure has a different strength.
Hash indexes are created when you supply the UNIQUE HASH clause for the CREATE TABLE or ALTER TABLE statements. Hash indexes require that the table have a primary key.
Range indexes are created by default with the CREATE TABLE statement or created with the CREATE INDEX statement. Range indexes can speed up exact key lookups but are more flexible and can speed up other queries as well. Select a range index if your queries include LESS THAN or GREATER THAN comparisons. Range indexes are effective for high-cardinality data: that is, data with many possible values, such as CUSTOMER_NAME or PHONE_NUMBER.
Range indexes can also be used to speed up "prefix" queries. A prefix query has equality conditions on all but the last key column that is specified. The last column of a prefix query can have either an equality condition or an inequality condition.
Consider the following table and index definitions:
CREATE TABLE T(i1 integer, i2 integer, i3 integer, ...);
CREATE INDEX IXT on T(i1, i2, i3);
The index IXT can be used to speed up the following queries:
SELECT * FROM T WHERE i1>12;
SELECT * FROM T WHERE i1=12 and i2=75;
SELECT * FROM T WHERE i1=12 and i2 BETWEEN 10 and 20;
SELECT * FROM T WHERE i1=12 and i2=75 and i3>30;
The index IXT will not be used for queries like:
SELECT * FROM T WHERE i2=12;
because the prefix property is not satisfied: there is no equality condition for i1.
The index IXT will be used, but matching occurs only on the first two columns, for queries like:
SELECT * FROM T WHERE i1=12 and i2<50 and i3=630;
Range indexes have a dynamic structure that adjusts itself automatically to accommodate changes in table size. A range index can be either unique or non-unique and can be declared over nullable columns. It also allows the indexed column values to be changed once a record is inserted. A range index is likely to be more compact than an equivalent hash index.
Bitmap indexes are created with the CREATE INDEX statement. Bitmap indexes perform best when you search and retrieve data from columns with low cardinality. Bitmap indexes are useful with equality queries, especially when using the AND and OR operators. These indexes increase the performance of complex queries that specify multiple predicates on multiple columns connected by AND and OR operators. Bitmap indexes are widely used in data warehousing environments, which typically have large amounts of data and ad hoc queries but a low level of concurrent DML transactions. Bitmap indexes are compressed and have smaller storage requirements than other indexing techniques. For more details on when to use bitmap indexes, see "CREATE INDEX" in the Oracle TimesTen In-Memory Database SQL Reference.
Performance impact: Variable
TimesTen uses hash indexes to enforce primary key constraints. The number of buckets used for the hash index is determined by the PAGES parameter specified in the UNIQUE HASH ON clause of the CREATE TABLE statement. The value for PAGES should be the expected number of rows in the table divided by 256. A smaller value may result in a greater number of collisions, decreasing performance, while a larger value may provide somewhat increased performance at the cost of extra space used by the index.
If the number of values to be indexed varies dramatically, it is best to err on the side of a large index. If the size of a table cannot be accurately predicted, consider using a range index with CREATE INDEX. Also, consider the use of unique indexes when the indexed columns are large CHAR or binary values or when many columns are indexed. Unique indexes may be faster than hash indexes in these cases.
If the performance of record inserts degrades as the size of the table gets larger, it is very likely that you have underestimated the expected size of the table. You can resize the hash index by using the ALTER TABLE statement to reset the PAGES value in the UNIQUE HASH ON clause.
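For example, for a table expected to hold roughly one million rows (all names are hypothetical, and the resize syntax is a sketch), PAGES would be set to about 1,000,000 / 256, or roughly 3,900:
CREATE TABLE orders (
  ord_id  NUMBER NOT NULL PRIMARY KEY,
  cust_id NUMBER,
  total   NUMBER)
  UNIQUE HASH ON (ord_id) PAGES = 3900;
ALTER TABLE orders SET PAGES = 8000;
The ALTER TABLE statement resizes the hash index later if the table grows well beyond the original estimate.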
Performance impact: Variable
The declaration of a foreign key has no performance impact on SELECT queries, but it slows down INSERT and UPDATE operations on the table on which the foreign key is defined, and UPDATE and DELETE operations on the table referenced by the foreign key. The slowdown is proportional to the number of foreign keys that either reference or are defined on the table.
Performance impact: Large
If statistics are available on the data in the data store, the TimesTen optimizer uses them when preparing a command to determine the optimal path to the data. If there are no statistics, the optimizer uses generic guesses about the data distribution. For performance reasons, TimesTen does not hold a lock on tables or rows when computing statistics.
If you have examined the plans generated for your statements and you think they may not be optimal, consider computing statistics before preparing your statements and re-examining the plans. See Chapter 10, "The TimesTen Query Optimizer" for more information.
If you have not examined the plans, we generally recommend computing statistics since the information is likely to result in more efficient plans.
There are two built-in procedures for computing statistics: ttOptUpdateStats and ttOptEstimateStats.
The ttOptUpdateStats built-in procedure evaluates every row of the table(s) in question and computes exact statistics.
The ttOptEstimateStats procedure evaluates only a sampling of the rows of the table(s) in question and produces estimated statistics.
Estimating statistics can be faster, although it may result in less accurate statistics. In general, if time is not an issue, it is best to call ttOptUpdateStats. Estimation is preferable if overall application performance may be affected. Computing statistics with a sample of 10 percent is about ten times faster than computing exact statistics and generally results in the same execution plans. Since computing statistics is a time-consuming operation, you should compute statistics once after loading your data store but before preparing commands, and then periodically only if the composition of your data changes substantially. It is recommended to always update statistics after loading the data store and after a large number of inserts or deletes have occurred.
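For example (the table name is hypothetical, and the argument order is a sketch; the 1 argument requests invalidation of previously prepared commands so they pick up the new statistics when re-prepared):
CALL ttOptUpdateStats('sales.orders', 1);
CALL ttOptEstimateStats('sales.orders', 1, '10 PERCENT');
The first call computes exact statistics; the second samples 10 percent of the rows.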
Performance impact: Variable
The ALTER TABLE statement allows applications to add columns to a table and to drop columns from a table. Although the ALTER TABLE statement itself runs very quickly in most cases, the modifications it makes to the table can cause subsequent operations on the table to run more slowly. The actual performance degradation the application experiences varies with the number of times the table has been altered and with the particular operation being performed on the table.
Dropping VARCHAR and VARBINARY columns is slower than dropping columns of other data types since a table scan is required to free the space allocated to the existing VARCHAR and VARBINARY values in the column to be dropped.
Performance impact: Variable
TimesTen supports nested queries with some limitations. However, performance varies and is generally not optimal. See the Oracle TimesTen In-Memory Database SQL Reference for details on subqueries.
If you have applications that generate a statement multiple times searching for different values each time, prepare a parameterized statement to reduce compile time. For example, if your application generates statements like:
SELECT A FROM B WHERE C = 10
SELECT A FROM B WHERE C = 15
You can replace these statements with the single statement:
SELECT A FROM B WHERE C = ?
TimesTen shares prepared statements automatically after they have been committed. As a result, an application request to prepare a statement for execution may be completed very quickly if a prepared version of the statement already exists in the system. Also, repeated requests to execute the same statement can avoid the prepare overhead by sharing a previously prepared version of the statement.
Even though TimesTen allows prepared statements to be shared, it is still a good practice for performance reasons to use parameterized statements. Using parameterized statements can further reduce prepare overhead, in addition to any savings from sharing statements.
Because preparing SQL statements is an expensive operation, your application should minimize the number of calls to the prepare API. Most applications prepare a set of statements at the beginning of a connection and use that set for the duration of the connection. This is a good strategy when connections are long, consisting of hundreds or thousands of transactions. But if connections are relatively short, a better strategy is to establish a long-duration connection that prepares the statements and executes them on behalf of all threads or processes. The trade-off here is between communication overhead and prepare overhead, and can be examined for each application. Prepared statements are invalidated when a connection is closed.
See "ttSQLCmdCacheInfoGet
" in the Oracle TimesTen In-Memory Database Reference for related information.
The following sections include tips for improving performance of materialized views:
Performance impact: Variable
Larger numbers of join rows decrease performance. You can limit the number of join rows and the number of tables joined by controlling the join condition. For example, use only equality conditions that map one row from one table to one or at most a few rows from the other table.
Performance impact: Variable
Create indexes on the columns of the detail table that are specified in the SELECT statement that creates the join. Also consider creating an index on the materialized view itself. This can improve the performance of keeping the materialized view updated.
If an UPDATE or DELETE operation on a detail table is often based on a condition on a column, try to create an index on the materialized view on this column if possible.
For example, CustOrder is a materialized view of customer orders, based on two tables: Customer, which has two columns (custNo and custName), and bookOrder, which has three columns (ordNo, book, and custNo). If you often update the bookOrder table to change a particular order by using the condition bookOrder.ordNo=const, then create an index on CustOrder.ordNo. On the other hand, if you often update based on the condition bookOrder.custNo=const, then create an index on CustOrder.custNo.
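Using the names from this example, those two cases correspond to indexes such as:
CREATE INDEX custOrder_ordNo_ix ON CustOrder(ordNo);
CREATE INDEX custOrder_custNo_ix ON CustOrder(custNo);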
If you often UPDATE using both conditions and cannot afford to create both indexes, you may want to add bookOrder.rowId to the view and create an index on it instead. In this case, TimesTen updates the view for each detail row update instead of updating all of the rows in the view directly and at the same time. The scan to find the row to be updated is an index scan instead of a row scan, and no join rows need to be generated.
If ViewUniqueMatchScan is used in the execution plan, it is a sign that the execution may be slower or require more space than necessary. A ViewUniqueMatchScan is used to handle an update or delete that cannot be translated to a direct update or delete of a materialized view, and there is no unique mapping between a join row and the associated row in the materialized view. This can be fixed by selecting a unique key for each detail table that is updated or deleted.
Performance impact: Variable
Try not to update a join column or a "group by" column because this involves deleting the old value and inserting the new value.
Try not to update an expression that references more than one table. This may disallow direct update of the view because TimesTen may perform another join operation to get the new value when one value in this expression is updated.
View maintenance based on an update or delete is more expensive when:
The view cannot be updated directly. For example, not all columns specified in the detail table UPDATE or DELETE statement are selected in the view, or
There is not an indication of a one-to-one mapping from the view rows to the join rows.
For example:
CREATE MATERIALIZED VIEW v1 AS SELECT x1 FROM t1, t2 WHERE x1=x2;
DELETE FROM t1 WHERE y1=1;
The extra cost comes from the fact that extra processing is needed to ensure that one and only one view row is affected due to a join row.
The problem is resolved if either x1 is UNIQUE or a unique key from t1 is included in the select list of the view. ROWID can always be used as the unique key.
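A sketch of the second approach, carrying the t1 rowid into the view so that each join row maps to exactly one view row (the column alias is illustrative):
CREATE MATERIALIZED VIEW v1 AS
  SELECT t1.rowid rid, x1 FROM t1, t2 WHERE x1=x2;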
Performance impact: Variable
Since outer join maintenance is more expensive when changes happen to an inner table, try to avoid changes to the inner table of an outer join. When possible, perform INSERT operations on the inner table before inserting the associated join rows into the outer table. Likewise, when possible, perform DELETE operations on the outer table before deleting from the inner table. This avoids having to convert non-matching rows into matching rows or vice versa.
Performance impact: Variable
The number of columns projected in the view select list can impact performance. As the number of columns in the select list grows, the time to prepare operations on detail tables increases. In addition, the time to execute operations on the view detail tables also increases. Do not select values or expressions that are not needed.
The optimizer considers the use of temporary indexes when preparing operations on detail tables of views. This can significantly slow down prepare time, depending upon the operation and the view. If prepare time seems slow, consider using ttOptSetFlag to turn off temporary range indexes and temporary hash scans.
The following sections describe how to increase performance when using transactions:
Each transaction, when it generates transaction log records (for example, a transaction that does an INSERT, DELETE or UPDATE), incurs a disk write when the transaction commits. Disk I/O affects response time and may affect throughput, depending on how effective group commit is.
Performance-sensitive applications should avoid unnecessary disk writes at commit. Use a performance analysis tool to measure the amount of time your application spends in disk writes (versus CPU time). If there seems to be an excessive amount of I/O, there are two steps you can take to avoid writes at commit:
Adjust the transaction size.
Adjust whether disk writes are performed at transaction commit. See "Use durable commits appropriately".
Long transactions perform fewer disk writes per unit of time than short transactions. However, long transactions also can reduce concurrency, as discussed in Chapter 8, "Transaction Management and Recovery".
If only one connection is active on a data store (for example, if it is an exclusive connection), longer transactions could improve performance. However, long transactions may have some disadvantages, such as longer rollbacks.
If there are multiple connections, there is a trade-off between transaction log I/O delays and locking delays. In this case transactions are best kept to their natural length, as determined by their requirements for atomicity and durability.
By default, each TimesTen transaction results in a disk write at commit time. This practice ensures that no committed transactions are lost because of system or application failures. Applications can avoid some or all of these disk writes by performing nondurable commits. Nondurable commits do everything that a durable commit does except write the transaction log to disk. Locks are released and cursors are closed, but no disk write is performed.
Note:
Some controllers or drivers write data only into cache memory in the controller, or write to disk some time after the operating system is told that the write is complete. In these cases, a power failure may cause some information that you thought was durably committed to be lost. To avoid this loss of data, configure your disk to write to the recording media before reporting completion, or use an uninterruptible power supply.
The advantage of nondurable commits is a potential reduction in response time and increase in throughput. The disadvantage is that some transactions may be lost in the event of system failure. An application can force the transaction log to disk by performing an occasional durable commit or checkpoint, thereby decreasing the amount of potentially lost data. In addition, TimesTen itself periodically flushes the transaction log to disk when internal buffers fill up, limiting the amount of data that will be lost.
Transactions can be made durable or can be made to have delayed durability on a connection-by-connection basis. Applications can force a durable commit of a specific transaction by calling the ttDurableCommit built-in procedure.
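For example, with autocommit disabled, a connection that normally commits nondurably can force the log to disk for a particularly important transaction; a minimal sketch:
CALL ttDurableCommit;
COMMIT;
The call affects only the transaction in which it is issued; later transactions on the connection commit according to the connection's DurableCommits setting.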
Applications that do not use nondurable commits can benefit from using synchronous writes in place of write and flush. To turn on synchronous writes, set the first connection attribute LogFlushMethod=2.
The XACT_D_COMMITS column of the SYS.MONITOR table indicates the number of transactions that were durably committed.
Applications that are connected to a data store for a long period of time occasionally need to call the ttCkpt built-in procedure to checkpoint the data store so that transaction log files do not fill up the disk. Transaction-consistent checkpoints can have a significant performance impact because they require exclusive access to the data store.
It is generally better to call ttCkpt to perform a non-blocking (or "fuzzy") checkpoint than to call ttCkptBlocking to perform a blocking checkpoint. Non-blocking checkpoints may take longer, but they permit other transactions to operate against the data store at the same time and thus impose less overall overhead. You can increase the interval between successive checkpoints by increasing the amount of disk space available for accumulating transaction log files.
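For example, a long-lived connection might issue a fuzzy checkpoint from time to time; a minimal sketch:
CALL ttCkpt;
Use ttCkptBlocking instead only when exclusive access to the data store is acceptable.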
As the transaction log increases in size (if the interval between checkpoints is large), recovery time increases accordingly. If reducing recovery time after a system crash or application failure is important, frequent checkpoints may be preferable. The DS_CHECKPOINTS column of the SYS.MONITOR table indicates how often checkpoints have successfully completed.
AUTOCOMMIT mode forces a commit after each statement, and is enabled by default. Committing each statement after execution, however, can significantly degrade performance. For this reason, it is generally advisable to disable AUTOCOMMIT, using the appropriate API for your programming environment.
The XACT_COMMITS column of the SYS.MONITOR table indicates the number of transaction commits.
Note:
If you do not include any explicit commits in your application, the application can use up important resources unnecessarily, including memory and locks. All applications should do periodic commits.
When transactions fail due to erroneous data or application failure, they are rolled back by TimesTen automatically. In addition, applications often explicitly roll back transactions to recover from deadlock or timeout conditions. This is not desirable from a performance point of view, because a rollback consumes resources and the entire transaction is wasted.
Applications should avoid unnecessary rollbacks. This may mean designing the application to avoid contention and checking application or input data for potential errors in advance, if possible. The XACT_ROLLBACKS column of the SYS.MONITOR table indicates the number of transactions that were rolled back.
The following sections include tips for improving performance of data store recovery after data store shutdown or system failure:
Performance impact: Large
Set the RecoveryThreads attribute to the number of indexes or CPUs, whichever is lower, to improve recovery performance.
Performance impact: Medium
Setting discovered_direct_iosz for the Veritas file system on HP-UX improves recovery performance. Writes that are larger than discovered_direct_iosz bypass the file system buffer cache and go directly to disk. Set discovered_direct_iosz to at least one megabyte. See the Veritas documentation for more information.
The following sections include tips for improving performance for multiple CPUs:
Performance impact: Variable
One way to determine the approximate scaling you can expect from TimesTen is to run one of the scalable demo applications, such as tptbm, on your system.
The tptbm application implements a multi-user throughput benchmark. It allows you to control how it executes, including options to vary the number of processes that execute TimesTen operations and the transaction mix of SELECTs, UPDATEs, and INSERTs, for example. Run tptbm -help to see the full list of options.
By default the demo executes one operation per transaction. You can specify more operations per transaction to better model your application. Larger transactions may scale better or worse, depending on the application profile.
Run multi-processor versions of the demo to evaluate how your application can be expected to perform on systems that have multiple CPUs. If the demo scales well but your application scales poorly, you might try simplifying your application to see where the issue is. Some users comment out the TimesTen calls and find they still have bad scaling due to issues in their application.
You may also find, for example, that some simulated application data is not being generated properly, so that all the operations are accessing the same few rows. That type of localized access will greatly inhibit scalability if the accesses involve changes to the data.
See the Quick Start home page at install_dir/quickstart.html for additional information about tptbm and other demo applications. Go to the ODBC link under "Sample Programs".
Performance impact: Variable
Check the LOCK_TIMEOUTS or LOCK_GRANTS_WAIT fields in the SYS.MONITOR table. If they have high values, this may indicate undue contention, which can lead to poor scaling.
Because TimesTen is quite CPU-intensive, optimal scaling is achieved by having at most one database-intensive connection per CPU. If you have a 4-CPU system or a 2-CPU system with hyperthreading, then a 4-processor application will run well, but an 8-processor application will not perform well. The contention between the active threads will be too high. The only exception to this rule is when many transactions are committed durably. In this case, the connections are not very CPU-intensive because of the increase in I/O operations to disk, and so the machine can support many more concurrent connections.
Performance impact: Variable
Read operations scale better than write operations. Make sure that the read and write balance reflects the real-life workload of your application.
Performance impact: Variable
Prepares do not scale. Make sure that you pre-prepare commands that are executed more than once. The CMD_PREPARES and CMD_REPREPARES columns of the SYS.MONITOR table indicate how often commands were prepared or automatically re-prepared due to creation or deletion of indexes. If either has a high value, modify your application to prepare statements once and reuse them rather than preparing them at each execution.
Connects do not scale. Avoid frequent connects and disconnects. Look at the DS_CONNECTS field in the SYS.MONITOR table. If the field has a high value, modify your application to do connection pooling, so that connects and disconnects are rare events.
Performance impact: Variable
Replication and XLA operations have significant logging overhead. Replication scales best when there are a limited number of transmitters or receivers. Check your replication topology and see if you can simplify it. Generally, XLA scales best when there are a limited number of readers. If your application has numerous readers, see if you can reduce the number.
Monitor XLA and replication to ensure they are reading from the log buffer rather than from the disk. With a lot of concurrent updates, replication may not keep up. Updates are single-threaded at the subscriber. You can achieve better XLA throughput if the frequency of acknowledgements is reduced.
Estimate the number of readers and transmitters required by checking the values in the LOG_FS_READS and LOG_BUFFER_WAITS columns in the SYS.MONITOR table. The system updates this information each time a connection is made or released and each time a transaction is committed or rolled back.
Setting LogFlushMethod=2 can improve performance of RETURN TWOSAFE replication operations and RETURN RECEIPT with DURABLE TRANSMIT operations.
Performance impact: Variable
On multi-processor systems, set RecoveryThreads to minimum(number of CPUs available, number of indexes) to allow indexes to be rebuilt in parallel if recovery is necessary. If a rebuild is necessary, progress can be viewed in the user log. Setting RecoveryThreads to a number larger than the number of CPUs available can cause recovery to take longer than if it were single-threaded.
The following sections include tips for improving XLA performance:
A larger transaction log buffer size is appropriate when using XLA. When XLA is enabled, additional transaction log records are generated to store additional information for XLA. To ensure that the transaction log buffer is properly sized, watch for changes in the SYS.MONITOR table entries LOG_FS_READS and LOG_BUFFER_WAITS. For optimal performance, both of these values should remain 0. Increasing the transaction log buffer size may be necessary to keep them at 0.
Performance impact: Medium
Prefetching multiple update records at a time is more efficient than obtaining each update record from XLA individually. Because updates are not prefetched when you use AUTO_ACKNOWLEDGE mode, it can be slower than the other modes. If possible, you should design your application to tolerate duplicate updates so you can use DUPS_OK_ACKNOWLEDGE, or explicitly acknowledge updates. Explicitly acknowledging updates usually yields the best performance if the application can tolerate not acknowledging each message individually.
Performance impact: Medium
To explicitly acknowledge an XLA update, call acknowledge on the update message. Acknowledging a message implicitly acknowledges all previous messages. Typically, you receive and process multiple update messages between acknowledgements. If you are using the CLIENT_ACKNOWLEDGE mode and intend to reuse a durable subscription in the future, call acknowledge to reset the bookmark to the last-read position before exiting.