Oracle® Database Administrator's Guide 11g Release 2 (11.2) Part Number E10595-04
This section contains the following troubleshooting topics: a job that does not run, a program that becomes disabled, and a window that fails to take effect.
A job may fail to run for several reasons. To begin troubleshooting a job that you suspect did not run, check the job state by issuing the following statement:
SELECT JOB_NAME, STATE FROM DBA_SCHEDULER_JOBS;
Typical output will resemble the following:
JOB_NAME                       STATE
------------------------------ ---------
MY_EMP_JOB                     DISABLED
MY_EMP_JOB1                    FAILED
MY_NEW_JOB1                    DISABLED
MY_NEW_JOB2                    BROKEN
MY_NEW_JOB3                    COMPLETED
There are four states that a job can be in if it does not run: FAILED, BROKEN, DISABLED, and COMPLETED.
If a job has the status of FAILED in the job table, it was scheduled to run once but the execution failed. If the job was specified as restartable, all retries have failed.
If a job fails in the middle of execution, only the last transaction of that job is rolled back. If your job executes multiple transactions, be careful about setting restartable to TRUE, because transactions that were already committed are not rolled back and would be executed again on restart. You can find failed jobs by querying the *_SCHEDULER_JOB_RUN_DETAILS views.
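As a minimal sketch, the following query lists failed runs from the DBA_ variant of the view. It assumes you have the privileges to read DBA_SCHEDULER_JOB_RUN_DETAILS; substitute the ALL_ or USER_ variant otherwise.

```sql
-- List failed job runs, most recent first.
SELECT job_name,
       status,
       error#,
       actual_start_date,
       run_duration,
       additional_info
  FROM dba_scheduler_job_run_details
 WHERE status = 'FAILED'
 ORDER BY log_date DESC;
```

The ERROR# and ADDITIONAL_INFO columns are usually the quickest way to see why a particular run failed.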
A broken job is one that has exceeded the number of failures allowed by its max_failures attribute, which you can change. When a job is broken, the entire job is broken, and it does not run again until it has been fixed. For debugging and testing, you can use the RUN_JOB procedure.
You can find broken jobs by querying the *_SCHEDULER_JOBS and *_SCHEDULER_JOB_LOG views.
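The following is a sketch of how you might inspect and debug a broken job. The job name MY_NEW_JOB2 is taken from the sample output earlier in this section, and the new max_failures value of 10 is an arbitrary illustration, not a recommendation.

```sql
-- Jobs currently marked BROKEN, with their failure counters.
SELECT job_name, state, failure_count, max_failures
  FROM dba_scheduler_jobs
 WHERE state = 'BROKEN';

-- Raise the failure threshold, then run the job once in the
-- current session so that errors are reported directly.
BEGIN
  DBMS_SCHEDULER.SET_ATTRIBUTE(
    name      => 'MY_NEW_JOB2',
    attribute => 'max_failures',
    value     => 10);

  DBMS_SCHEDULER.RUN_JOB(
    job_name            => 'MY_NEW_JOB2',
    use_current_session => TRUE);
END;
/
```

Running with use_current_session set to TRUE is a choice made here for debugging, because errors are raised to the calling session instead of only being written to the job log.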
A job can become disabled for the following reasons:
The job was manually disabled
The job class it belongs to was dropped
The program, chain, or schedule that it points to was dropped
A window or window group is its schedule and the window or window group is dropped
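To see which of the reasons above might apply, you could list disabled jobs together with the objects they reference, as in this sketch. The job name passed to ENABLE is taken from the sample output earlier in this section and stands in for your own job.

```sql
-- Disabled jobs and the class, program, and schedule they point to.
SELECT job_name,
       state,
       job_class,
       program_name,
       schedule_name,
       schedule_type
  FROM dba_scheduler_jobs
 WHERE state = 'DISABLED';

-- After re-creating the dropped object or repointing the job,
-- enable the job again.
BEGIN
  DBMS_SCHEDULER.ENABLE('MY_EMP_JOB');
END;
/
```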
A job is in the COMPLETED state if its end_date or max_runs has been reached. (If a job recently completed successfully but is scheduled to run again, the job state is SCHEDULED.)
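As an illustration, a query along these lines shows which limit a completed job reached. It assumes access to the DBA_SCHEDULER_JOBS view.

```sql
-- Compare run_count with max_runs, and end_date with the current
-- time, to see why a job reached the COMPLETED state.
SELECT job_name,
       state,
       end_date,
       max_runs,
       run_count
  FROM dba_scheduler_jobs
 WHERE state = 'COMPLETED';
```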
An important troubleshooting tool is the job log. For details and instructions, see "Viewing the Job Log".
Remote jobs must successfully communicate with a Scheduler agent on the remote host. If a remote job does not run, check the DBA_SCHEDULER_JOBS view and the job log first. Then perform the following tasks:
Check that the remote system is reachable over the network with tools such as nslookup and ping.
Check the status of the Scheduler agent on the remote host by calling the DBMS_SCHEDULER.GET_AGENT_VERSION function.
DECLARE
  versionnum VARCHAR2(30);
BEGIN
  versionnum := DBMS_SCHEDULER.GET_AGENT_VERSION('remote_host.example.com');
  DBMS_OUTPUT.PUT_LINE(versionnum);
END;
/
If an error is generated, the agent may not be installed or may not be registered with your local database. See "Enabling and Disabling Remote Jobs" for instructions for installing, registering, and starting the Scheduler agent.
The Scheduler attempts to recover jobs that are interrupted when:
The database abnormally shuts down
A job slave process is killed or otherwise fails
For an external job, the external job process that starts the executable or script is killed or otherwise fails. (The external job process is extjob on Unix. On Windows, it is the external job service.)
For an external job, the process that runs the end-user executable or script is killed or otherwise fails.
Job recovery proceeds as follows:
The Scheduler adds an entry to the job log for the instance of the job that was running when the failure occurred. In the log entry, the OPERATION is 'RUN', the STATUS is 'STOPPED', and ADDITIONAL_INFO contains one of the following:
REASON="Job slave process was terminated"
REASON="ORA-01014: ORACLE shutdown in progress"
If restartable is set to TRUE for the job, the job is restarted.
If restartable is set to FALSE for the job:
If the job is a run-once job and auto_drop is set to TRUE, the job run is done and the job is dropped.
If the job is a run-once job and auto_drop is set to FALSE, the job is disabled and the job state is set to 'STOPPED'.
If the job is a repeating job, the Scheduler schedules the next job run and the job state is set to 'SCHEDULED'.
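Because the outcome of recovery depends on the restartable and auto_drop attributes, it can help to check them in advance. The following sketch reads them from DBA_SCHEDULER_JOBS; the job name is taken from the sample output earlier in this section and stands in for your own job.

```sql
-- Attributes that determine what happens to an interrupted job.
-- A job with no repeat interval and no named schedule is typically
-- a run-once job.
SELECT job_name,
       restartable,
       auto_drop,
       repeat_interval,
       schedule_name,
       state
  FROM dba_scheduler_jobs
 WHERE job_name = 'MY_EMP_JOB';
```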
When a job is restarted as a result of this recovery process, the new run is entered into the job log with the operation 'RECOVERY_RUN'.
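The log entries described above can be located with a query such as the following sketch, which assumes access to the DBA_SCHEDULER_JOB_LOG view.

```sql
-- Interrupted runs (RUN / STOPPED) and the runs that the Scheduler
-- started to recover them (RECOVERY_RUN), in chronological order.
SELECT log_date,
       job_name,
       operation,
       status,
       additional_info
  FROM dba_scheduler_job_log
 WHERE (operation = 'RUN' AND status = 'STOPPED')
    OR operation = 'RECOVERY_RUN'
 ORDER BY log_date;
```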
A program can become disabled if a program argument is dropped or number_of_arguments is changed so that all arguments are no longer defined.
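One way to spot such a mismatch is to compare the declared argument count with the number of arguments actually defined. This is a sketch that assumes access to the DBA_SCHEDULER_PROGRAMS and DBA_SCHEDULER_PROGRAM_ARGS views.

```sql
-- Disabled programs, with the declared number of arguments next to
-- the number of arguments that are currently defined.
SELECT p.owner,
       p.program_name,
       p.number_of_arguments,
       COUNT(a.argument_position) AS defined_arguments
  FROM dba_scheduler_programs p
  LEFT JOIN dba_scheduler_program_args a
    ON a.owner = p.owner
   AND a.program_name = p.program_name
 WHERE p.enabled = 'FALSE'
 GROUP BY p.owner, p.program_name, p.number_of_arguments;
```

A row where defined_arguments is lower than number_of_arguments points to a missing argument definition.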
See "Creating and Managing Programs to Define Jobs" for more information regarding programs.
A window can fail to take effect for the following reasons:
A window becomes disabled when it is at the end of its schedule
A window that points to a schedule that no longer exists is disabled
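To check for either condition, you could list the disabled windows with their schedule and end date, as in this sketch. It assumes access to the DBA_SCHEDULER_WINDOWS view.

```sql
-- Disabled windows, with the schedule they point to and the date
-- after which they no longer open.
SELECT window_name,
       enabled,
       active,
       schedule_name,
       schedule_type,
       end_date,
       next_start_date
  FROM dba_scheduler_windows
 WHERE enabled = 'FALSE';
```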
See "Managing Job Scheduling and Job Priorities with Windows" for more information regarding windows.