Oracle® Real Application Clusters Administration and Deployment Guide 11g Release 2 (11.2) Part Number E10718-03 |
|
|
View PDF |
This chapter briefly describes database design and deployment techniques for Oracle Real Application Clusters (Oracle RAC) environments. It also describes considerations for high availability and provides general guidelines for various Oracle RAC deployments.
This chapter includes the following topics:
Many customers implement Oracle RAC to provide high availability for their Oracle Database applications. For true high availability, you must make the entire infrastructure of the application highly available. This requires detailed planning to ensure there are no single points of failure throughout the infrastructure. For example, even though Oracle RAC makes your database highly available, if a critical application becomes unavailable, then your business can be negatively affected. For example, if you choose to use the Lightweight Directory Access Protocol (LDAP) for authentication, then you must make the LDAP server highly available. If the database is up but the users cannot connect to the database because the LDAP server is not accessible, then the entire system is down in the eyes of your users.
For mission critical systems, you must be able to perform failover and recovery, and your environment must be resilient to all types of failures. To reach these goals, start by defining service level requirements for your business. The requirements should include definitions of maximum transaction response time and recovery expectations for failures within the datacenter (such as for node failure) or for disaster recovery (if the entire data center fails). Typically, the service level objective is a target response time for work, regardless of failures. Determine the recovery time for each redundant component. Even though you may have hardware components that are running in an active/active mode, do not assume that losing one component cannot result in downtime for other hardware components while the faulty components are being repaired. Also, when components are running in active/passive mode, perform regular tests to validate the failover time. For example, recovery times for storage channels can take minutes. Ensure that the outage times are within your business' service level agreements, and where they are not, work with the hardware vendor to tune the configuration and settings.
When deploying mission critical systems, the testing should include functional testing, destructive testing, and performance testing. Destructive testing includes the injection of various faults in the system to test the recovery and to make sure if fits in the service level requirements. It also allows the creation of operational procedures for the production system.
To help you design and implement a mission critical or highly available system, Oracle provides a range of solutions that fit every organization regardless of size. Small workgroups and global enterprises alike are able to extend the reach of their critical business applications. With Oracle and the Internet, applications and their data are now reliably accessible everywhere, at any time. The Oracle Maximum Availability Architecture (MAA) is the Oracle best practices blueprint that is based on proven Oracle high availability technologies and recommendations. The goal of the MAA is to remove the complexity in designing an optimal high availability architecture.
See Also:
Oracle Maximum Availability Architecture (MAA) Web site at
http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
This section includes the following topics:
Best Practices for Deploying Oracle RAC in a High Availability Environment
Consolidating Multiple Applications in a Database or Multiple Databases in a Cluster
Applications can take advantage of many Oracle Database, Oracle Clusterware, and Oracle RAC features and capabilities to minimize or mask any failure in the Oracle RAC environment. For example:
Remove TCP/IP timeouts by using the VIP address to connect to the database.
Create detailed operational procedures and ensure you have the appropriate support contracts in place to match defined service levels for all components in the infrastructure.
Take advantage of the Oracle RAC Automatic Workload Management features such as connect time failover, Fast Connection Failover, Fast Application Notification, and the Load Balancing Advisory. See Chapter 4, "Introduction to Automatic Workload Management" for more details.
Place voting disks on separate volume groups to mitigate outages due to slow I/O throughput. To survive the failure of x voting devices, configure 2x + 1 mirrors.
Place the OCR with I/O service times in the order of 2 ms or less.
Tune database recovery using the FAST_START_MTTR_TARGET
initialization parameter.
Use Oracle Automatic Storage Management (Oracle ASM) to manage database storage.
Ensure that strong change control procedures are in place.
Check the surrounding infrastructure for high availability and resiliency, such as LDAP, NIS, and DNS. These entities affect the availability of your Oracle RAC database. If possible, perform a local backup procedure routinely.
Use Oracle Enterprise Manager to administer your entire Oracle RAC environment, not just the Oracle RAC database. Use Oracle Enterprise Manager to create and modify services, and to start and stop the cluster database instances and the cluster database. See the Oracle Database 2 Day + Real Application Clusters Guide for more information using Oracle Enterprise Manager in an Oracle RAC environment.
Use Recovery Manager (RMAN) to back up, restore, and recover data files, control files, server parameter files (SPFILEs) and archived redo log files. You can use RMAN with a media manager to back up files to external storage. You can also configure parallelism when backing up or recovering Oracle RAC databases. In Oracle RAC, RMAN channels can be dynamically allocated across all of the Oracle RAC instances. Channel failover enables failed operations on one node to continue on another node. You can use RMAN in Oracle RAC from Oracle Enterprise Manager Backup Manager or from a command line. See Chapter 5, "Configuring Recovery Manager and Archiving" for more information about using RMAN.
If you use sequence numbers, then always use CACHE
with the NOORDER
option for optimal sequence number generation performance. With the CACHE
option, however, you may have gaps in the sequence numbers. If your environment cannot tolerate sequence number gaps, then use the NOCACHE
option or consider pre-generating the sequence numbers. If your application requires sequence number ordering but can tolerate gaps, then use CACHE
and ORDER
to cache and order sequence numbers in Oracle RAC. If your application requires ordered sequence numbers without gaps, then use NOCACHE
and ORDER
. This combination has the most negative effect on performance compared to other caching and ordering combinations.
If you use indexes, then consider alternatives, such as reverse key indexes to optimize index performance. Reverse key indexes are especially helpful if you have frequent inserts to one side of an index, such as indexes that are based on insert date.
Many people want to consolidate multiple applications in a single database or consolidate multiple databases in a single cluster. Oracle Clusterware and Oracle RAC support both types of consolidation.
Creating a cluster with a single pool of storage managed by Oracle ASM provides the infrastructure to manage multiple databases whether they are single instance database or Oracle RAC databases.
With Oracle RAC databases, you can adjust the number of instances and which nodes run instances for a given database, based on workload requirements. Features such as cluster-managed services allow you to manage multiple workloads on a single database or across multiple databases. It is important to properly manage the capacity in the cluster when adding work. The processes that manage the cluster—including processes both from Oracle Clusterware and database—must be able to obtain CPU resources in a timely fashion and must be given higher priority in the system.
Oracle recommends that the number of real time LMSn processes on a server is less than or equal to the number of processors. (Note that this is the number of recognized CPUs that includes cores. For example, a dual core CPU is considered to be two CPUs). It is important that you load test your system when adding instances on a node to ensure you have enough capacity to support the workload.
If you are consolidating many small databases into a cluster, you may want to reduce the number of Global Cache Service Processes (LMSn) created by the Oracle RAC instance. By default, Oracle Database calculates the number of processes based on the number of CPUs it finds on the server. This calculation may result in more LMSn processes than is needed for the Oracle RAC instance. One LMS process may be sufficient for up to 4 CPUs.To reduce the number of LMSn processes, set the GC_SERVER_PROCESSES
initialization parameter minimally to a value of 1. Add a process for every four CPUs needed by the application. In general, it is better to have fewer busy LMSn processes. Oracle Database calculates the number of processes when the instance is started, and you must restart the instance if you want to change the value.
Oracle RAC provides concurrent, transactionally consistent access to a single copy of the data from multiple systems. It provides scalability beyond the capacity of a single server. If your application scales transparently on symmetric multiprocessing (SMP) servers, then it is realistic to expect the application to scale well on Oracle RAC, without the need to make changes to the application code.
Traditionally, when a database server runs out of capacity, it is replaced with a new larger server. As servers grow in capacity, they become more expensive. However, for Oracle RAC databases, you have alternatives for increasing the capacity:
You can migrate applications that traditionally run on large SMP servers to run on clusters of small servers.
You can maintain the investment in the current hardware and add a new server to the cluster (or create or add a new cluster) to increase the capacity.
Adding servers to a cluster with Oracle Clusterware and Oracle RAC does not require an outage. As soon as the new instance is started, the application can take advantage of the extra capacity.
All servers in the cluster must run the same operating system and same version of Oracle Database but the servers do not have to be exactly the same capacity. With Oracle RAC, you can build a cluster that fits your needs, whether the cluster is made up of servers where each server is a two CPU commodity server, to clusters where the servers have 32 or 64 CPUs in each server. The Oracle parallel execution feature allows a single SQL statement to be divided up into multiple processes, where each process completes a subset of work. In an Oracle RAC environment, you can define the parallel processes to run only on the instance where the user is connected or to run across multiple instances in the cluster.
This section briefly describes database design and deployment techniques for Oracle RAC environments. It also describes considerations for high availability and provides general guidelines for various Oracle RAC deployments.
Consider performing the following steps during the design and development of applications that you are deploying on an Oracle RAC database:
Tune the design and the application
Tune the memory and I/O
Tune contention
Tune the operating system
Note:
If an application does not scale on an SMP system, then moving the application to an Oracle RAC database cannot improve performance.Consider using hash partitioning for insert-intensive online transaction processing (OLTP) applications. Hash partitioning:
Reduces contention on concurrent inserts into a single database structure
Affects sequence-based indexes when indexes are locally partitioned with a table and tables are partitioned on sequence-based keys
Is transparent to the application
If you hash partitioned tables and indexes for OLTP environments, then you can greatly improve performance in your Oracle RAC database. Note that you cannot use index range scans on an index with hash partitioning.
If you are using sequence numbers, then always use the CACHE
option. If you use sequence numbers with the CACHE
option, then:
Your system may lose sequence numbers
There is no guarantee of the ordering of the sequence numbers
Note:
If your environment cannot tolerate sequence number gaps, then consider pre-generating the sequence numbers or use theORDER
and CACHE
options.This section describes considerations when deploying Oracle RAC databases. Oracle RAC database performance is not compromised if you do not employ these techniques. If you have an effective single-instance design, then your application will run well on an Oracle RAC database.
This section includes the following topics:
In addition to using locally managed tablespaces, you can further simplify space administration by using automatic segment space management (ASSM) and automatic undo management.
ASSM distributes instance workloads among each instance's subset of blocks for inserts. This improves Oracle RAC performance because it minimizes block transfers. To deploy automatic undo management in an Oracle RAC environment, each instance must have its own undo tablespace.
As a general rule, only use DDL statements for maintenance tasks and avoid executing DDL statements during peak system operation periods. In most systems, the amount of new object creation and other DDL statements should be limited. Just as in single-instance Oracle databases, excessive object creation and deletion can increase performance overhead.
If you add nodes to your Oracle RAC database environment, then you may need to increase the size of the SYSAUX
tablespace. Conversely, if you remove nodes from your cluster database, then you may be able to reduce the size of your SYSAUX
tablespace.
See Also:
Your platform-specific Oracle RAC installation guide for guidelines about sizing theSYSAUX
tablespace for multiple instancesIf you are running XA Transactions in an Oracle RAC environment and the performance is poor, direct all branches of a tightly coupled distributed transaction to the same instance.
To ensure this, create multiple Oracle Distributed Transaction Processing (DTP) services, with one or more on each Oracle RAC instance. Each DTP service is a singleton service that is available on one and only one Oracle RAC instance. All access to the database server for distributed transaction processing must be done by way of the DTP services. Ensure that all of the branches of a single global distributed transaction use the same DTP service. In other words, a network connection descriptor, such as a TNS name, a JDBC URL, and so on, must use a DTP service to support distributed transaction processing.
See Also:
"Services and Distributed Transaction Processing in Oracle RAC" for more details about enabling services and distributed transactions
Oracle Database Advanced Application Developer's Guide for more information about distributed transactions in Oracle RAC
Cache Fusion makes Oracle RAC databases the optimal deployment servers for online transaction processing (OLTP) applications. This is because these types of applications require:
High availability in the event of failures
Scalability to accommodate increased system demands
Load balancing according to demand fluctuations
The high availability features of Oracle Database and Oracle RAC can re-distribute and load balance workloads to surviving instances without interrupting processing. Oracle RAC also provides excellent scalability so that if you add or replace a node, then Oracle Database re-masters resources and re-distributes processing loads.
To accommodate the frequently changing workloads of online transaction processing systems, Oracle RAC remains flexible and dynamic despite changes in system load and system availability. Oracle RAC addresses a wide range of service levels that, for example, fluctuate due to:
Varying user demands
Peak scalability issues like trading storms (bursts of high volumes of transactions)
Varying availability of system resources
This section discusses how to deploy data warehouse systems in Oracle RAC environments by briefly describing the data warehouse features available in shared disk architectures.
This section includes the following topics:
Oracle RAC is ideal for data warehouse applications because it augments the single instance benefits of Oracle Database. Oracle RAC does this by maximizing the processing available on all of the nodes that belong to an Oracle RAC database to provide speed-up for data warehouse systems.
The query optimizer considers parallel execution when determining the optimal execution plans. The default cost model for the query optimizer is CPU+I/O and the cost unit is time. In Oracle RAC, the query optimizer dynamically computes intelligent defaults for parallelism based on the number of processors in the nodes of the cluster. An evaluation of the costs of alternative access paths, table scans versus indexed access, for example, takes into account the degree of parallelism (DOP) available for the operation. This results in Oracle Database selecting the execution plans that are optimized for your Oracle RAC configuration.
Oracle Database's parallel execution feature uses multiple processes to run SQL statements on one or more CPUs. Parallel execution is available on both single-instance Oracle databases and Oracle RAC databases.
Oracle RAC takes full advantage of parallel execution by distributing parallel processing across all available instances. The number of processes that can participate in parallel operations depends on the DOP assigned to each table or index.
See Also:
Oracle Database Performance Tuning Guide for more information about the query optimizerThis section describes the following two Oracle RAC security considerations:
Wallets used by Oracle RAC instances for Transparent Data Encryption may be one of: a local copy of a common wallet, a common wallet stored on shared media that is directly accessed, or a hardware-security module (HSM) based wallet. In every case, you must open the wallet on each Oracle RAC instance before transparent data encryption can be used.
Consider the following when setting up the wallet on Oracle RAC:
Ensure that only one Oracle RAC instance sets the Master Key. Once the Master Key is set, manually copy the wallet to all instances of the Oracle RAC database to the locations indicated by the sqlnet.ora
file of each instance. Reopen the wallet on all of the instances after completing the copying of the wallet. Only after all of the Oracle RAC instances have reopened the wallet can you run a Master Key Rekey
command.
Ensure that only one Oracle RAC instance issues the Master Key Rekey
command at any given time. After the instance has issued the Master Key Rekey
command, copy the wallet to all instances of the Oracle RAC database. Again, reopen the wallet after it has been copied. Only after all the instances have reopened the wallet can you run another Master Key Rekey
command.
Do not issue any wallet open
or close
commands while setting up or changing the Master Key.
A wallet must be opened on all instances of the Oracle RAC database in order to have the wallet open in Oracle RAC.
A wallet must be closed on all instances of the Oracle RAC database in order to have the wallet closed in Oracle RAC.
Deployments where no shared storage exists require that each Oracle RAC node maintain a local wallet.
As an alternative to copying the wallet to each node, you can use a hardware-security module (HSM) device to store the master encryption key. HSM devices are recommended over shared storage on a network drive, because an HSM device automatically clones the master key.
Note:
For shared access to a key, Oracle recommends using HSM devices. Oracle does not recommend using a shared copy residing on a network file system because when one instance re-keys the master key, other instances are not notified about the change and use the old master key, which can no longer decrypt the column keys. If you need to re-key an Oracle RAC environment, you must copy the wallet to all of the remaining instances and then close and reopen the wallet on each instance.After you create and provision a wallet on a single node, you must copy the wallet and make it available to all of the other nodes, as follows:
For systems using Transparent Data Encryption with encrypted wallets, you can use any standard file transport protocol, though Oracle recommends using a secured file transport.
For systems using Transparent Data Encryption with obfuscated wallets, file transport through a secured channel is recommended.
To specify the directory in which the wallet must reside, set the WALLET_LOCATION
or ENCRYPTION_WALLET_LOCATION
parameter in the sqlnet.ora
file. The local copies of the wallet need not be synchronized for the duration of Transparent Data Encryption usage until the server key is re-keyed though the ALTER SYSTEM SET KEY
SQL statement. Each time you issue the ALTER SYSTEM SET KEY
statement on a database instance, you must again copy the wallet residing on that node and make it available to all of the other nodes. Then, you must close and reopen the wallet on each of the nodes. To avoid unnecessary administrative overhead, reserve re-keying for exceptional cases where you believe that the server master key may have been compromised and that not re-keying it could cause a serious security problem.
See Also:
Oracle Database Advanced Security Administrator's Guide for more information about creating and provisioning a walletBy default, all installations of Windows Server 2003 Service Pack 1 and higher enable the Windows Firewall to block virtually all TCP network ports to incoming connections. As a result, any Oracle products that listen for incoming connections on a TCP port will not receive any of those connection requests, and the clients making those connections will report errors.
Depending upon which Oracle products you install and how they are used, you may need to perform additional Windows post-installation configuration tasks so that the Firewall products are functional on Windows Server 2003.
This section includes these topics:
Table 10-1 lists the Oracle Database 10g release 1 (10.1) or later executables that listen on TCP ports on Windows. If they are in use and accepting connections from a remote client computer, then Oracle recommends that you add them to the Windows Firewall exceptions list to ensure correct operation. Except as noted, they can be found in ORACLE_HOME
\bin
.
The RMI registry application and daemon executable listed in Table 10-1 are used by Oracle Ultra Search to launch a remote crawler. They must be added to the Windows Firewall exception list if you are using the Ultra Search remote crawler feature, and if the remote crawler is running on a computer with the Windows Firewall enabled.
Note:
If multiple Oracle homes are in use, then several firewall exceptions may be needed for the same executable: one for each home from which that executable loads.Table 10-1 Oracle Executables Requiring Windows Firewall Exceptions
File Name | Executable Name |
---|---|
|
OracleCRService |
|
OracleEVMService |
|
Event manager logging daemon |
|
Oracle Object Link Manager (GUI version) |
|
OracleCSService |
|
Oracle Object Link Manager (CLI version) |
|
Virtual Internet Protocol Configuration Assistant |
|
Oracle Cluster Volume Service |
|
Data Guard Manager |
|
Oracle Database Control |
|
External Procedures |
|
Generic Connectivity |
|
Oracle Internet Directory LDAP Server |
|
Oracle Services for Microsoft Transaction Server |
|
Oracle Database |
|
Apache Web Server |
|
Oracle Enterprise Manager Agent |
|
Oracle Services for Microsoft Transaction Server |
|
Oracle |
|
RACG |
|
Oracle listener |
|
Java Virtual Machine |
|
RMI daemon executable |
|
RMI registry application |
|
RMI registry application |
|
Oracle Notification Service |
|
Oracle Process Manager |
|
Oracle Procedural Gateway for APPC |
|
Oracle Procedural Gateway for Websphere MQ |
|
Oracle Procedural Gateway for Websphere MQ |
|
Oracle Procedural Gateway for APPC |
|
Oracle Transparent Gateway for DRDA |
|
Oracle Transparent Gateway for MS-SQL Server |
|
Oracle Transparent Gateway for SYBASE |
|
Oracle Transparent Gateway for Teradata |
|
Oracle TNS listener |
|
System file for Oracle Cluster File System |
Post-installation configuration for the Windows Firewall must be undertaken if all of the following conditions are met:
Oracle server-side components are installed.
These components include Oracle Database, network listeners, and any Web servers or services.
The computer services connections from other computers over a network.
If no other computers connect to the computer with the Oracle software, then no post-installation configuration steps are required and the Oracle software will function as expected.
The Windows Firewall is enabled.
If the Windows Firewall is not enabled, then no post-installation configuration steps are required.
You can configure Windows Firewall by opening specific static TCP ports in the firewall or by creating exceptions for specific executables so that they are able to receive connection requests on any ports they choose. To configure the firewall, choose Control Panel > Windows Firewall > Exceptions or enter netsh firewall add...
at the command line.
Alternatively, Windows will inform you if a foreground application is attempting to listen on a port, and it will ask you if you wish to create an exception for that executable. If you choose to do so, then the effect is the same as creating an exception for the executable either in the Control Panel or from the command line.
If you cannot establish certain connections even after granting exceptions to the executables listed in Table 10-1, then follow these steps to troubleshoot the installation:
Examine Oracle configuration files (such as *.conf
files), the Oracle key in the Windows registry, and network configuration files in ORACLE_HOME
\network\admin
.
Pay particular attention to any executable listed in ORACLE_HOME
\network\admin\listener.ora
in a PROGRAM=
clause. Each of these must be granted an exception in the Windows Firewall, because a connection can be made through the TNS listener to that executable.
Examine Oracle trace files, log files, and other sources of diagnostic information for details on failed connection attempts. Log and trace files on the database client computer may contain useful error codes or troubleshooting information for failed connection attempts. The Windows Firewall log file on the server may contain useful information as well.
If the preceding troubleshooting steps do not resolve a specific configuration issue on Windows XP Service Pack 2, then provide the output from command netsh firewall show state verbose=enable
to Oracle Support for diagnosis and problem resolution.
See Also:
http://www.microsoft.com/downloads/details.aspx?FamilyID=a7628646-131d-4617-bf68-f0532d8db131&displaylang=en
for information on Windows Firewall troubleshooting
http://support.microsoft.com/default.aspx?scid=kb;en-us;875357
for more information on Windows Firewall configuration