...

Now use the Job Menu in JOC's Job tab to unstop the job, which will then take on the status pending. The next scheduled start for the order will be shown in green in the Job Chain tab.

...

This example has used stop_on_error="yes" to blame the error on the job. The error can also be blamed on the order, as described in the next section.

Behavior with stop_on_error="no"

...

stop_on_error="no" is the default setting for jobs created with JOE. It has the advantage that a job is not blocked for all orders if one order fails due to a configuration error.
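
As a minimal sketch, a shell job with this setting might look as follows. The script body and the exit code are illustrative assumptions; only the stop_on_error attribute is taken from the text above.

```xml
<!-- Illustrative job file; the script content is a placeholder. -->
<!-- A non-zero exit code signals an error. With stop_on_error="no" -->
<!-- the order is blamed and the job remains available for other orders. -->
<job order="yes" stop_on_error="no">
    <script language="shell"><![CDATA[
echo "processing order"
exit 5
    ]]></script>
    <run_time/>
</job>
```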

Suspending Orders

...

Another option in the event of an error is to suspend the order:

  • First of all, ensure that stop_on_error is set to "no" for both jobs
  • Edit the job chain configuration file simple_error_chain.job_chain.xml:
    • On the job_chain_node "next" add a new on_error="suspend" attribute and save
  • Run the order again
  • When the error now occurs, the order will be put back into the order queue of the second job but it will be suspended.
    This means that the order will not run again until somebody manually chooses "resume" from the order menu.
  • Fix the job - i.e. change exit 5 to exit 0
  • Choose "resume" from JOC's order menu
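
The edited job chain file could look roughly like the sketch below. The node state "next" and the job simple_chained_job2 come from this example; the state names "first", "success" and "error" and the job name simple_chained_job1 are assumptions made for illustration and may differ in your configuration.

```xml
<!-- Sketch of simple_error_chain.job_chain.xml -->
<!-- Hypothetical names: simple_chained_job1, first, success, error -->
<job_chain>
    <job_chain_node state="first" job="simple_chained_job1"
                    next_state="next" error_state="error"/>
    <!-- on_error="suspend" keeps a failed order at this node, suspended -->
    <job_chain_node state="next" job="simple_chained_job2"
                    next_state="success" error_state="error"
                    on_error="suspend"/>
    <job_chain_node state="success"/>
    <job_chain_node state="error"/>
</job_chain>
```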

Retry using "setback"

Another option is to configure automatic retries using "setback":

  • First of all, ensure that stop_on_error is set to "no" for both jobs
  • Edit the job configuration file simple_chained_job2.job.xml
    • Put exit 5 into the job again
  • Add the following lines after the script element:

    <delay_order_after_setback setback_count="1" delay="20"/>
    <delay_order_after_setback setback_count="3" delay="60"/>
    <delay_order_after_setback setback_count="6" is_maximum="yes"/>
  • Save simple_chained_job2.job.xml
  • Edit the job chain configuration file simple_error_chain.job_chain.xml
  • On the job_chain_node "next" (the node for the simple_chained_job2 job) set the on_error attribute to "setback" and save
  • Run the order again
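
For orientation, the job file after these steps might look like the sketch below. The job attributes and the script body are assumptions; the three delay_order_after_setback lines are the ones given above. As far as the configuration reads, the order would be retried after 20 seconds for the first setbacks, after 60 seconds from the third setback on, and would not be set back more than six times.

```xml
<!-- Sketch of simple_chained_job2.job.xml; script content is a placeholder -->
<job order="yes" stop_on_error="no">
    <script language="shell"><![CDATA[
exit 5
    ]]></script>
    <!-- setback delays are placed after the script element -->
    <delay_order_after_setback setback_count="1" delay="20"/>
    <delay_order_after_setback setback_count="3" delay="60"/>
    <delay_order_after_setback setback_count="6" is_maximum="yes"/>
    <run_time/>
</job>
```

The corresponding job_chain_node "next" in simple_error_chain.job_chain.xml then carries on_error="setback" in place of any other on_error value.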

...

The main "switch" for controlling error handling of shell jobs is the stop_on_error attribute of a job. If stop_on_error is set to yes, the job is blamed for the error and is stopped. If stop_on_error is set to no, the order is blamed for the error. For more information on stop_on_error see http://www.sos-berlin.com/doc/en/scheduler.doc/xml/job.xml#attribute_stop_on_error

By default, if an order is blamed for an error - i.e. if stop_on_error is set to no - the order is moved to the error_state. This behavior can be changed at the job chain node with the on_error attribute. This can be set to "suspend" or "setback" and will cause the order to be either suspended or set back in the event of an error.

  • Note that this will only work for shell jobs when stop_on_error="no" is set for the job.

Jobs which use the JobScheduler API Interface may implement more sophisticated methods to choose whether an error is blamed on the job or on the order and how to handle errors that occur in orders.