Introduction
This example uses a simple job chain which starts shell jobs to demonstrate the different behaviors that can be configured for JobScheduler if an error occurs in one of the jobs.
In particular, the effect of the stop_on_error and on_error parameters is demonstrated, along with the use of suspended orders and setbacks to retry running a job.
Downloads
- shell_error.zip - configuration files
Instructions
Behavior with stop_on_error="no"
- Unzip all files in the download into the ./config/live folder of your JobScheduler installation.
- Open the JobScheduler Operating Center, JOC, in your browser using http://scheduler_host:scheduler_port
- Open the JOB CHAINS tab and enable Show orders.
- Find the job chain samples/shell_error/simple_error_chain.
- Find the order simple_error_order, open the order menu and choose Start order now.
...
The error can also be blamed on the job, which will be described in the next section.
Behavior with stop_on_error="yes"
- Edit the job configuration file simple_chained_job2.job.xml
- If you have changed the exit code (which caused the error) to exit 0, change it back to exit 5 to simulate an error again
- Change stop_on_error="no" to stop_on_error="yes" and save
- Run the order again
- Look at the order history
...
This example used stop_on_error="yes" to blame the error on the job.
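For orientation, the edited job file might look roughly like the sketch below. Only the stop_on_error attribute and the exit 5 line come from the steps above; the order="yes" attribute, the echo line and the overall layout are assumptions about what the download contains, so compare the sketch with your own copy of simple_chained_job2.job.xml rather than pasting it in.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Sketch of simple_chained_job2.job.xml. Everything except stop_on_error
     and the exit code is assumed, not copied from the download. -->
<job order="yes" stop_on_error="yes">
    <script language="shell"><![CDATA[
echo "simulating an error in the second job"
exit 5
    ]]></script>
    <run_time/>
</job>
```

Switching the attribute back to stop_on_error="no" in the same file restores the behavior of the previous section.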
Suspending Orders
Another option in the event of an error is to suspend the order:
- First of all, ensure that stop_on_error is set to "no" for both jobs
- Then edit the job chain configuration file simple_error_chain.job_chain.xml:
  - On the next job_chain_node add a new on_error="suspend" attribute and save
- Run the order again
- When the error now occurs, the order will be put back into the order queue of the second job but it will be suspended. This means that the order will not run again until somebody manually chooses "resume" from the order menu.
- Fix the job, i.e. change exit 5 to exit 0
- Choose "resume" from JOC's order menu
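The job chain edit described in the steps above could look something like the following sketch. The state names and the name of the first job are hypothetical placeholders, and the layout of the real simple_error_chain.job_chain.xml in the download may differ; only the on_error="suspend" attribute and the name of the second job are taken from the text.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Sketch of simple_error_chain.job_chain.xml. State names and the first
     job name are hypothetical; only on_error="suspend" comes from the text. -->
<job_chain>
    <job_chain_node state="first"  job="simple_chained_job1" next_state="second"  error_state="error"/>
    <!-- Node for the second job: a failing order stays in this node's queue and is suspended -->
    <job_chain_node state="second" job="simple_chained_job2" next_state="success" error_state="error" on_error="suspend"/>
    <job_chain_node state="success"/>
    <job_chain_node state="error"/>
</job_chain>
```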
Retry using "setback"
Alternative Example:
Note that we also have a dedicated example, showing the use of setbacks: How to use setbacks to make a job retry in the event an error
...
If the job is fixed during the retries, the order will go to the next_state.
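As a rough illustration of a setback-based retry, a job definition along the following lines is one possible configuration. It is a sketch under assumptions and is not taken from the download: the retry count of 3 and the 10 second delay are illustrative values, and it presumes that the corresponding job_chain_node carries on_error="setback" in place of on_error="suspend".

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Sketch only: assumes the job chain node for this job uses on_error="setback".
     The delay and the maximum number of setbacks are illustrative values. -->
<job order="yes" stop_on_error="no">
    <script language="shell"><![CDATA[
exit 5
    ]]></script>
    <!-- From the first setback onwards, wait 10 seconds before the order is retried -->
    <delay_order_after_setback setback_count="1" is_maximum="no" delay="10"/>
    <!-- Give up after the third setback -->
    <delay_order_after_setback setback_count="3" is_maximum="yes"/>
    <run_time/>
</job>
```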
How it works
The main "switch" for controlling error handling of shell jobs is the stop_on_error attribute of a job. If stop_on_error is set to yes, the job is blamed for the error and is stopped. If stop_on_error is set to no, the order is blamed for the error. For more information on stop_on_error see http://www.sos-berlin.com/doc/en/scheduler.doc/xml/job.xml#attribute_stop_on_error
...