Oracle® Real Application Clusters Administration and Deployment Guide 11g Release 2 (11.2) Part Number E10718-03
This chapter explains instance recovery and how to use Recovery Manager (RMAN) to back up and restore Oracle Real Application Clusters (Oracle RAC) databases. It also covers parallel backup, recovery with SQL*Plus, and using the Flash Recovery Area in Oracle RAC.
Note:
For restore and recovery in Oracle RAC environments, you do not have to configure the instance that performs the recovery to also be the sole instance that restores all of the data files. In Oracle RAC, data files are accessible from every node in the cluster, so any node can restore archived redo log files.
See Also:
Oracle Clusterware Administration and Deployment Guide for information about backing up and restoring Oracle Clusterware components such as the Oracle Cluster Registry (OCR) and the voting disk
In a noncluster file system environment, each node can back up only its own local archived redo log files. For example, node 1 cannot access the archived redo log files on node 2 or node 3 unless you configure the network file system for remote access. If you configure a network file system for backups, then each node backs up its archived redo logs to a local directory.
This section describes the following common RMAN restore scenarios:
Using RMAN or Oracle Enterprise Manager to Restore the Server Parameter File (SPFILE)
Note:
The restore and recovery procedures in a cluster file system scheme do not differ substantially from Oracle single-instance scenarios.
The scheme that this section describes assumes that you are using the "Oracle Automatic Storage Management and Cluster File System Archiving Scheme". In this scheme, assume that node 3 performed the backups to a CFS. If node 3 is available for the restore and recovery operation, and if all of the archived logs have been backed up or are on disk, then run the following commands to perform complete recovery:
RESTORE DATABASE;
RECOVER DATABASE;
If node 3 performed the backups but is unavailable, then configure a media management device for one of the remaining nodes and make the backup media from node 3 available to this node.
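A minimal sketch of that fallback, assuming node 1 takes over and its media manager can already read the backup media written by node 3. The channel name and the connect string user1/pwd1@node1 are placeholders, and any vendor-specific media manager settings are omitted:

```rman
# Run from an RMAN session connected to the target, with the database mounted.
RUN
{
  # Hypothetical channel on a surviving node; media manager settings not shown.
  ALLOCATE CHANNEL ch1 DEVICE TYPE sbt CONNECT 'user1/pwd1@node1';
  RESTORE DATABASE;
  RECOVER DATABASE;
}
```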
The scheme that this section describes assumes that you are using the "Noncluster File System Local Archiving Scheme". In this scheme, each node archives locally to a different directory. For example, node 1 archives to /arc_dest_1, node 2 archives to /arc_dest_2, and node 3 archives to /arc_dest_3. You must configure a network file system so that the recovery node can read the archiving directories on the remaining nodes.
If all nodes are available and if all archived redo logs have been backed up, then you can perform a complete restore and recovery by mounting the database and running the following commands from any node:
RESTORE DATABASE;
RECOVER DATABASE;
Because the network file system configuration gives each node read access to the archiving directories on the other nodes, the recovery node can read and apply the archived redo logs located on both local and remote disks. No manual transfer of archived redo logs is required.
RMAN can restore the server parameter file either to the default location or to a location that you specify.
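As a rough sketch, the two cases might look like the following in RMAN. It assumes that control file autobackups are enabled and stored in the default autobackup location, and that no recovery catalog is in use. The DBID and the target path are placeholders, not values from this guide:

```rman
# Start the instance without mounting; RMAN uses a dummy parameter file if none exists.
STARTUP FORCE NOMOUNT;
# Placeholder DBID; needed to locate the autobackup when no catalog is used.
SET DBID 1234567890;

# Case 1: restore the SPFILE to its default location.
RESTORE SPFILE FROM AUTOBACKUP;

# Case 2: restore the SPFILE to a location that you specify (hypothetical path).
RESTORE SPFILE TO '/tmp/spfile_restored.ora' FROM AUTOBACKUP;
```

Run only one of the two RESTORE commands, then restart the instance so that it reads the restored file.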
You can also use Oracle Enterprise Manager to restore the SPFILE. From the Backup/Recovery section of the Maintenance tab, click Perform Recovery. The Perform Recovery link is context-sensitive and takes you to the SPFILE restore only when the database is closed.
Instance failure occurs when software or hardware problems disable an instance. After instance failure, Oracle Database automatically uses the online redo logs to perform recovery as described in this section.
Instance recovery in Oracle RAC does not include the recovery of applications that were running on the failed instance. Oracle Clusterware restarts the instance automatically. You can also use callout programs as described in the example on Oracle Technology Network (OTN) to trigger application recovery.
Applications that were running continue by using failure recognition and recovery. This provides consistent and uninterrupted service in the event of hardware or software failures. When one instance performs recovery for another instance, the surviving instance reads online redo logs generated by the failed instance and uses that information to ensure that committed transactions are recorded in the database. Thus, data from committed transactions is not lost. The instance performing recovery rolls back transactions that were active at the time of the failure and releases resources used by those transactions.
Note:
All online redo logs must be accessible for instance recovery. Therefore, Oracle recommends that you mirror your online redo logs.
When multiple node failures occur, as long as one instance survives, Oracle RAC performs instance recovery for any other instances that fail. If all instances of an Oracle RAC database fail, then Oracle Database automatically recovers the instances the next time one instance opens the database. The instance performing recovery can mount the database in either cluster database or exclusive mode from any node of an Oracle RAC database. This recovery procedure is the same for Oracle Database running in shared mode as it is for Oracle Database running in exclusive mode, except that one instance performs instance recovery for all of the failed instances.
Oracle Database provides RMAN for backing up and restoring the database. RMAN enables you to back up, restore, and recover data files, control files, SPFILEs, and archived redo logs. RMAN is included with the Oracle Database server and it is installed by default. You can run RMAN from the command line or you can use it from the Backup Manager in Oracle Enterprise Manager. In addition, RMAN is the recommended backup and recovery tool if you are using Oracle Automatic Storage Management (Oracle ASM).
The procedures for using RMAN in Oracle RAC environments do not differ substantially from those for Oracle single-instance environments. See the Oracle Backup and Recovery documentation set for more information about single-instance RMAN backup procedures.
Channel connections to the instances are determined using the connect string defined by channel configurations. For example, in the following configuration, three channels are allocated using user1/pwd1@service_name. If you configure the SQL Net service name with load balancing turned on, then the channels are allocated at a node as decided by the load balancing algorithm.
CONFIGURE DEVICE TYPE sbt PARALLELISM 3;
CONFIGURE DEFAULT DEVICE TYPE TO sbt;
CONFIGURE CHANNEL DEVICE TYPE sbt CONNECT 'user1/pwd1@service_name';
However, if the service name used in the connect string is not for load balancing, then you can control at which instance the channels are allocated using separate connect strings for each channel configuration as follows:
CONFIGURE DEVICE TYPE sbt PARALLELISM 3;
CONFIGURE CHANNEL 1.. CONNECT 'user1/pwd1@node1';
CONFIGURE CHANNEL 2.. CONNECT 'user2/pwd2@node2';
CONFIGURE CHANNEL 3.. CONNECT 'user3/pwd3@node3';
The previous example assumes that node1, node2, and node3 are SQL Net service names that connect to predefined nodes in your Oracle RAC environment. Alternatively, you can use manually allocated channels to back up your database files. For example, the following command backs up the SPFILE, control file, data files, and archived redo logs:
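RUN
{
  # DEVICE TYPE sbt matches the configuration shown earlier; use DISK for disk backups.
  ALLOCATE CHANNEL CH1 DEVICE TYPE sbt CONNECT 'user1/pwd1@node1';
  ALLOCATE CHANNEL CH2 DEVICE TYPE sbt CONNECT 'user2/pwd2@node2';
  ALLOCATE CHANNEL CH3 DEVICE TYPE sbt CONNECT 'user3/pwd3@node3';
  BACKUP DATABASE PLUS ARCHIVELOG;
}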
During a backup operation, as long as at least one of the channels allocated has access to the archived log, RMAN automatically schedules the backup of the specific log on that channel. Because the control file, SPFILE, and data files are accessible by any channel, the backup operation of these files is distributed across the allocated channels.
For a local archiving scheme, at least one channel must be allocated to each node that writes to its own local archived logs. For a CFS archiving scheme, assuming that every node writes to the archived logs in the same CFS, the backup operation of the archived logs is distributed across the allocated channels.
During a backup, the instances to which the channels connect must be either all mounted or all open. For example, if the node1 instance has the database mounted while the node2 and node3 instances have the database open, then the backup fails.
In some cluster database configurations, some nodes of the cluster have faster access to certain data files than to other data files. RMAN automatically detects this, which is known as node affinity awareness. When deciding which channel to use to back up a particular data file, RMAN gives preference to the nodes with faster access to the data files that you want to back up. For example, if you have a three-node cluster, and if node 1 has faster read/write access to data files 7, 8, and 9 than the other nodes, then node 1 has greater node affinity to those files than nodes 2 and 3.
See Also:
Oracle Database Backup and Recovery Reference for more information about the CONNECT clause of the CONFIGURE CHANNEL statement
Assuming that you have configured the automatic channels as defined in section "Channel Connections to Cluster Instances", you can use the following example to delete the archived logs that you backed up n times. The device type can be DISK or SBT:
DELETE ARCHIVELOG ALL BACKED UP n TIMES TO DEVICE TYPE device_type;
During a delete operation, as long as at least one of the channels allocated has access to the archived log, RMAN will automatically schedule the deletion of the specific log on that channel. For a local archiving scheme, there must be at least one channel allocated that can delete an archived log. For a CFS archiving scheme, assuming that every node writes to the archived logs on the same CFS, the archived log can be deleted by any allocated channel.
If you have not configured automatic channels, then you can manually allocate the maintenance channels as follows and delete the archived logs.
ALLOCATE CHANNEL FOR MAINTENANCE DEVICE TYPE DISK CONNECT 'SYS/oracle@node1';
ALLOCATE CHANNEL FOR MAINTENANCE DEVICE TYPE DISK CONNECT 'SYS/oracle@node2';
ALLOCATE CHANNEL FOR MAINTENANCE DEVICE TYPE DISK CONNECT 'SYS/oracle@node3';
DELETE ARCHIVELOG ALL BACKED UP n TIMES TO DEVICE TYPE device_type;
RMAN automatically performs autolocation of all files that it needs to back up or restore. If you use the noncluster file system local archiving scheme, then a node can only read the archived redo logs that were generated by an instance on that node. RMAN never attempts to back up archived redo logs on a channel it cannot read.
During a restore operation, RMAN automatically performs the autolocation of backups. A channel connected to a specific node only attempts to restore files that were backed up to the node. For example, assume that log sequence 1001 is backed up to the drive attached to node 1, while log 1002 is backed up to the drive attached to node 2. If you then allocate channels that connect to each node, then the channel connected to node 1 can restore log 1001 (but not 1002), and the channel connected to node 2 can restore log 1002 (but not 1001).
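The log 1001 and 1002 example could be scripted roughly as follows. This is a sketch only: it assumes tape (sbt) backups, that both logs belong to redo thread 1, and that the connect strings are placeholder service names for node 1 and node 2:

```rman
RUN
{
  # One channel per node, so that each backup is reachable from some channel.
  ALLOCATE CHANNEL ch1 DEVICE TYPE sbt CONNECT 'user1/pwd1@node1';
  ALLOCATE CHANNEL ch2 DEVICE TYPE sbt CONNECT 'user2/pwd2@node2';
  # Autolocation assigns each log to the channel that can read its backup.
  RESTORE ARCHIVELOG FROM SEQUENCE 1001 UNTIL SEQUENCE 1002 THREAD 1;
}
```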
Media recovery must be user-initiated through a client application, whereas instance recovery is automatically performed by the database. In these situations, use RMAN to restore backups of the data files and then recover the database. The procedures for RMAN media recovery in Oracle RAC environments do not differ substantially from the media recovery procedures for single-instance environments.
The node that performs the recovery must be able to restore all of the required data files. That node must also be able to either read all of the required archived redo logs on disk or be able to restore them from backups.
When recovering a database with encrypted tablespaces (for example, after a SHUTDOWN ABORT or a catastrophic error that brings down the database instance), you must open the Oracle Wallet after database mount and before you open the database, so the recovery process can decrypt data blocks and redo.
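One possible ordering of these steps from an RMAN session is sketched below. The wallet password is a placeholder, and the sketch assumes that the wallet is reachable from the instance that performs the recovery:

```rman
# Mount the database without opening it.
STARTUP MOUNT;
# Open the Oracle Wallet so that encrypted data blocks and redo can be decrypted.
SQL 'ALTER SYSTEM SET ENCRYPTION WALLET OPEN IDENTIFIED BY "wallet_password"';
# Apply redo as needed, then open the database.
RECOVER DATABASE;
ALTER DATABASE OPEN;
```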
Oracle Database automatically selects the optimum degree of parallelism for instance, crash, and media recovery. Oracle Database applies archived redo logs using an optimal number of parallel processes based on the availability of CPUs. You can use parallel instance recovery and parallel media recovery in Oracle RAC databases as described under the following topics:
See Also:
Oracle Database Backup and Recovery User's Guide for more information on these topics
With RMAN's RESTORE and RECOVER commands, Oracle Database automatically parallelizes the following three stages of recovery:
Restoring Data Files: When restoring data files, the number of channels you allocate in the RMAN recover script effectively sets the parallelism that RMAN uses. For example, if you allocate five channels, you can have up to five parallel streams restoring data files.
Applying Incremental Backups: Similarly, when you are applying incremental backups, the number of channels you allocate determines the potential parallelism.
Applying Archived Redo Logs: With RMAN, the application of archived redo logs is performed in parallel. Oracle Database automatically selects the optimum degree of parallelism based on available CPU resources.
You can override this using the procedures under the following topics:
To disable parallel instance and crash recovery on a system with multiple CPUs, set the RECOVERY_PARALLELISM parameter to 0.
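For example, the parameter could be changed for every instance through the RMAN SQL command, as sketched below. The sketch assumes a shared SPFILE. Because RECOVERY_PARALLELISM is a static parameter, the change takes effect only after the instances restart:

```rman
# SID='*' applies the setting to all instances that share the SPFILE.
SQL "ALTER SYSTEM SET RECOVERY_PARALLELISM=0 SCOPE=SPFILE SID=''*''";
```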
Use the NOPARALLEL clause of the RMAN RECOVER command or the ALTER DATABASE RECOVER statement to force Oracle Database to use non-parallel media recovery.
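A sketch of the RMAN form, assuming that the database is mounted and that the required data files have already been restored:

```rman
# Perform media recovery serially instead of in parallel.
RECOVER DATABASE NOPARALLEL;
```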
To use a flash recovery area in Oracle RAC, you must place it on an Oracle ASM disk group, a Cluster File System, or on a shared directory that is configured through a network file system for each Oracle RAC instance. In other words, the flash recovery area must be shared among all of the instances of an Oracle RAC database. In addition, set the parameter DB_RECOVERY_FILE_DEST to the same value on all instances.
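A sketch of setting the flash recovery area identically on all instances through the RMAN SQL command. The Oracle ASM disk group name +FRA and the 100 GB size limit are placeholders:

```rman
# Set the size limit first; DB_RECOVERY_FILE_DEST cannot be set without it.
SQL "ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE=100G SCOPE=BOTH SID=''*''";
# Point every instance at the same shared location.
SQL "ALTER SYSTEM SET DB_RECOVERY_FILE_DEST=''+FRA'' SCOPE=BOTH SID=''*''";
```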
Oracle Enterprise Manager enables you to set up a flash recovery area. To use this feature:
From the Cluster Database home page, click the Maintenance tab.
Under the Backup/Recovery options list, click Configure Recovery Settings.
Specify your requirements in the Flash Recovery Area section of the page.
Click Help on this page for more information.