Scope
- Documentation of the dependency handling
- Usage and limitations
Starting Point
- JobScheduler implements the chaining of jobs by a look-ahead mechanism:
  - Each job node in a job chain has two successors:
    - one for successful execution and
    - one for failed execution (based on the checking of exit codes).
  - The point in time when a job chain should start is triggered by an order that determines the run-time.
This look-ahead mechanism is fine for customers who are able to re-design their job chains in a way that fits the checking of forward dependencies.
This mechanism is not sufficient when it comes to the checking of backward dependencies and dependencies across job chains.
- SOS provides a solution for dependency handling that is aimed at customers who start from legacy job scheduling products and have to implement backward dependencies.
What does it do?
Job Network Dependencies
- Dependency handling implements a view on a sequence of job chains as a job network for a specific period.
- Dependencies include predecessors (backward dependency) and successors (forward dependency).
- The job network is implemented by orders, the underlying job chains remain untouched.
- Each job chain in a job network requires an order.
- Such orders create backward and forward dependencies to job chains.
- Orders with the same identification are part of the same job network.
- Multiple job networks are feasible, each using orders that share one identification across multiple job chains.
- Job networks are considered for the period of one day, i.e. dependencies are calculated from the beginning of a period.
- Job networks will temporarily skip the execution of individual job chains that are deactivated, i.e. stopped.
- The start of job network nodes is determined by timeslots, by dependencies and by start times within periods of the respective orders.
  - First Precedence: Timeslots
    - The timeslot determines the slot in which an order might be executed, e.g. during specific hours per workday.
    - Timeslots block the execution of orders outside of the given slot.
  - Second Precedence: Dependencies
    - A backward dependency takes precedence over the start time, i.e. an order will be executed only if its predecessors within the same period have completed successfully.
    - A forward dependency takes precedence over the start time, i.e. successor orders will be started immediately if enqueued for the same period.
  - Third Precedence: Start Times
    - The weakest precedence is specified by the point in time that the execution of an order is scheduled for.
    - The start time becomes effective if no predecessor dependency exists or if predecessor dependencies have been resolved.
    - Orders without start time will be executed based on predecessor dependencies only.
- Different impacts of dependencies are to be distinguished:
  - Deep & Shallow Dependencies
    - deep: if a predecessor job chain is stopped then its predecessor dependencies will be added to the dependencies of the current order.
    - shallow: if a predecessor job chain is stopped then it is removed from the dependencies of the current order.
  - Repeated Execution
    - Execution at Job Network Level
      - disabled: the job network can be executed once per period.
      - enabled: the job network can be executed multiple times per period.
    - Execution at Order Level
      - disabled: repeated execution of individual orders will be suppressed.
      - enabled: repeated execution of individual orders is enabled should their predecessor dependencies have been resolved once in a period.
- The current implementation of job network dependencies considers shallow dependencies with repeated execution being enabled for the job network, not for orders.
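The difference between deep and shallow dependencies can be illustrated with a small model. The following is a minimal sketch, not the product's implementation: the function name `effective_predecessors`, the dictionary layout and the job chain names are hypothetical, and the job network is assumed to be free of cycles.

```python
from typing import Dict, List, Set


def effective_predecessors(
    chain: str,
    predecessors: Dict[str, List[str]],
    stopped: Set[str],
    deep: bool,
) -> List[str]:
    """Return the predecessor job chains the order of `chain` has to wait for.

    Hypothetical model: `predecessors` maps a job chain to its direct
    predecessors, `stopped` holds the job chains that are deactivated.
    """
    result: List[str] = []
    for pred in predecessors.get(chain, []):
        if pred not in stopped:
            # Active predecessors are always a dependency.
            if pred not in result:
                result.append(pred)
        elif deep:
            # deep: the dependencies of the stopped predecessor are inherited.
            for inherited in effective_predecessors(pred, predecessors, stopped, deep):
                if inherited not in result:
                    result.append(inherited)
        # shallow: the stopped predecessor is dropped without replacement.
    return result


# Hypothetical network: job_chain_A -> job_chain_B -> job_chain_C
network = {
    "job_chain_C": ["job_chain_B"],
    "job_chain_B": ["job_chain_A"],
    "job_chain_A": [],
}
shallow = effective_predecessors("job_chain_C", network, {"job_chain_B"}, deep=False)
deep = effective_predecessors("job_chain_C", network, {"job_chain_B"}, deep=True)
```

With `job_chain_B` stopped, the shallow variant leaves `job_chain_C` without dependencies, whereas the deep variant makes it wait for `job_chain_A`.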
Flow
How does it work?
- Dependency checking is done at the beginning and at the end of a job chain.
- At the beginning of a job chain the pre-requisites are checked to decide whether the job chain should be executed.
- At the end of a job chain possible successor job chains are identified and will be started if applicable.
- These dependency checks are available as standard jobs, see Dependency Handling - Components for details:
  - jobnet_check_predecessor: at the beginning of a job chain.
  - jobnet_check_successor: at the end of a job chain.
- The same jobs can be re-used for all job chains in an environment. There is no need for individual implementation of these jobs.
Predecessor Handling
- Predecessors are any job chains that are executed as a pre-requisite before the current job chain is allowed to execute.
- This translates to the fact that the current job chain depends on its predecessors, i.e. it should not be executed if one of its predecessors has not previously successfully completed its execution.
Job: jobnet_check_predecessor
- This job is added to the first job step in a job chain.
- The corresponding job node state has to be assigned the value jobnet_start.
- This job uses parameters from the order that triggers a job chain. Without parameters no dependency checking is applied and the job chain executes unconditionally.
- A job node state jobnet_skip is required as an end state in the job chain. The predecessor checking will move an order to that end state to prevent execution of the current job chain should the pre-requisites not be fulfilled.
Order: jobnet_order
Definition
For each job chain an order is required.
An order is required even if a job chain is never started autonomously but is always triggered by its predecessor job chains.
Should no autonomous execution of a job chain be intended then the order is not assigned a run-time, i.e. it will never start autonomously.
The name of the order should be jobnet_order, the title for the order can be chosen freely.
Parameterization
To enable dependency checking the following parameter is added to the order:
Name: jobnet_predecessor
Value: a comma separated list of job chains that are acting as predecessors of the current job chain.
Explanation:
- Job chains in this list include the full path to the respective job chain starting from the live folder.
- Should job chains be renamed or moved to a different location then this parameter value has to be adjusted accordingly.
Sample:
- jobnet/examples/JobChain_B/job_chain_B specifies a job chain with the name job_chain_B that is located in the folder jobnet/examples/JobChain_B.
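To make the structure of the parameter value concrete, the following minimal sketch splits such a comma separated list into folder and job chain name. It is an illustration only: `JobChainRef` and `parse_job_chain_list` are hypothetical names, and tolerating whitespace around the commas is an assumption, not documented behaviour of the standard job.

```python
from typing import List, NamedTuple


class JobChainRef(NamedTuple):
    folder: str  # path below the live folder, empty for the top level
    name: str    # name of the job chain


def parse_job_chain_list(value: str) -> List[JobChainRef]:
    """Split a comma separated list of job chain paths (hypothetical helper)."""
    refs: List[JobChainRef] = []
    for item in value.split(","):
        path = item.strip()  # assumption: surrounding whitespace is ignored
        if not path:
            continue  # skip empty entries, e.g. caused by a trailing comma
        # Everything before the last "/" is the folder, the rest is the name.
        folder, _, name = path.rpartition("/")
        refs.append(JobChainRef(folder, name))
    return refs


refs = parse_job_chain_list("jobnet/examples/JobChain_B/job_chain_B")
```

For the sample value above this yields one entry with the folder jobnet/examples/JobChain_B and the name job_chain_B.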
Run-Time Behaviour
- At run-time the following checks are performed in the given precedence for predecessor job chains that are specified by the order parameter jobnet_predecessor:
  - Check for active predecessor job chains that might prevent the execution of the current job chain.
    - If a predecessor job chain is not active, i.e. stopped, then that job chain is not considered a pre-requisite for execution of the current job chain.
    - Only active predecessor job chains are considered a pre-requisite.
  - Check if orders of predecessor job chains have been executed earlier the same day and were completed successfully or synchronized.
    - If an order for a predecessor job chain has been executed earlier the same day and was successfully completed or suspended due to synchronization then this is considered a positive fulfillment of pre-requisites.
      - An order of a predecessor job chain that previously skipped execution has to be associated with the job node state jobnet_skip.
      - An order of a predecessor job chain that currently is being synchronized with other job chains has to be associated with the job node state jobnet_sync.
    - The current job chain might implement a synchronization node with a predecessor job chain and therefore its processing will be continued to complete the synchronization.
  - Check if running orders of predecessor job chains are available that indicate a negative fulfillment of pre-requisites.
    - Running orders exist e.g. if an order is currently being executed, if it is suspended or if it runs in a setback loop. In any of these cases the order has not successfully completed, therefore running orders are considered a negative fulfillment of pre-requisites.
  - Check if enqueued orders of predecessor job chains are available for today that are considered a negative fulfillment of pre-requisites.
    - Should later execution of an order be planned then the current job chain should not be executed before completion of that order.
- As a result of that checking the order for the current job chain
  - proceeds if no negative fulfillment of pre-requisites was detected,
  - will be moved to the job node state jobnet_skip at the end of the job chain if any negative fulfillment of pre-requisites was detected.
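The sequence of checks can be summarized in a small model. This is a minimal sketch under stated assumptions, not the code of the standard job: the classes `Order` and `PredecessorChain`, the phase names and the function `predecessors_fulfilled` are hypothetical, each order is evaluated individually, and the `force_start` argument stands in for the setting jobnet.predecessor.force_start_for_orders_in_period.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Order:
    # Hypothetical phases: "completed", "running", "suspended", "setback", "enqueued"
    phase: str
    node_state: str = ""  # current job node state, e.g. "jobnet_sync"


@dataclass
class PredecessorChain:
    stopped: bool = False
    orders_today: List[Order] = field(default_factory=list)


def predecessors_fulfilled(
    predecessors: List[str],
    chains: Dict[str, PredecessorChain],
    force_start: bool = False,
) -> bool:
    """Return True if the current order may proceed, False if it should
    be moved to the job node state jobnet_skip."""
    for path in predecessors:
        chain = chains[path]
        if chain.stopped:
            continue  # check 1: stopped job chains are no pre-requisite
        for order in chain.orders_today:
            # check 2: completed or synchronizing orders count as positive
            if order.phase == "completed" or order.node_state == "jobnet_sync":
                continue
            # check 3: running orders count as negative
            if order.phase in ("running", "suspended", "setback"):
                return False
            # check 4: orders enqueued for today count as negative,
            # unless execution is forced by the setting
            if order.phase == "enqueued" and not force_start:
                return False
    return True


chains = {
    "a": PredecessorChain(orders_today=[Order("completed")]),
    "b": PredecessorChain(stopped=True, orders_today=[Order("running")]),
    "c": PredecessorChain(orders_today=[Order("enqueued")]),
    "d": PredecessorChain(orders_today=[Order("suspended", "jobnet_sync")]),
}
```

In this model `predecessors_fulfilled(["a", "b", "d"], chains)` lets the order proceed, while adding `"c"` to the list sends it to jobnet_skip unless `force_start` is set.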
Successor Handling
- Successors are any job chains that should be executed after the current job chain.
- This translates to the fact that the successor job chains depend on the current job chain, i.e. the successor job chains should be executed after the current job chain successfully completed its execution.
Job: jobnet_check_successor
- This job is added to the last job step in a job chain.
- The corresponding job node state has to be assigned the value jobnet_end.
- This job uses parameters from the order that triggers a job chain. Without parameters no dependency checking is applied and no successor job chains will be executed.
- See the screenshots above for the job jobnet_check_predecessor that show the position of this job in a job chain.
Order: jobnet_order
Definition
The same order of the current job chain as for predecessor checking is used for successor checking.
The name of the order should be jobnet_order, the title for the order can be chosen freely.
Parameterization
To enable dependency checking the following parameter is added to the order:
Name: jobnet_successor
Value: a comma separated list of job chains that are acting as successors of the current job chain.
Explanation:
- Job chains in this list include the full path to the respective job chain starting from the live folder.
- When job chains are renamed or moved to a different location then this parameter value has to be adjusted accordingly.
- See the screenshot above for the handling of orders for predecessor checking that shows a parameterization sample.
Sample:
- jobnet/examples/JobChain_I/job_chain_I specifies a job chain with the name job_chain_I that is located in the folder jobnet/examples/JobChain_I.
Run-Time Behaviour
- At run-time the following checks are performed in the given precedence for successor job chains that are specified by the order parameter jobnet_successor:
  - Check dependencies for active successor job chains that should be started.
    - If a successor job chain is not active, i.e. stopped, then that job chain is not considered a candidate for execution.
    - Only active successor job chains are considered a candidate.
  - Check for running orders of successor job chains that are considered a negative pre-requisite for starting successor job chains.
    - If an order is currently being executed by a successor job chain then no additional order should be launched.
    - Running orders include orders that are executed, suspended or are running in a setback loop, e.g. for synchronization.
  - Check for enqueued orders within the current period that are considered safe for anticipated execution.
    - Such orders will be started immediately.
  - Check for orders that have been scheduled for an earlier time in the current period and that have been skipped and are therefore safe for execution.
    - Should an order of a successor job chain have been skipped, e.g. due to its predecessor checking of the current job chain, then this order will be started immediately.
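The successor checks can be modelled in the same spirit. Again this is a minimal sketch and not the code of the standard job: `SuccessorChain`, the phase names and `successors_to_start` are hypothetical, `force_start` stands in for the setting jobnet.successor.force_start_for_orders_in_period, and it is an assumption that this setting affects enqueued orders only and not skipped ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical phase names for orders that have not yet completed.
RUNNING_PHASES = {"running", "suspended", "setback"}


@dataclass
class SuccessorChain:
    stopped: bool = False
    # Phases of the orders of the current period, e.g. "enqueued" or "skipped"
    order_phases: List[str] = field(default_factory=list)


def successors_to_start(
    successors: List[str],
    chains: Dict[str, SuccessorChain],
    force_start: bool = True,
) -> List[str]:
    """Return the successor job chains whose orders should be started now."""
    to_start: List[str] = []
    for path in successors:
        chain = chains[path]
        if chain.stopped:
            continue  # check 1: stopped job chains are no candidates
        phases = set(chain.order_phases)
        if phases & RUNNING_PHASES:
            continue  # check 2: no additional order while one is running
        # check 3: enqueued orders are started ahead of schedule (if allowed)
        # check 4: skipped orders are started immediately
        if "skipped" in phases or ("enqueued" in phases and force_start):
            to_start.append(path)
    return to_start


chains = {
    "f": SuccessorChain(order_phases=["enqueued"]),
    "g": SuccessorChain(order_phases=["skipped"]),
    "h": SuccessorChain(order_phases=["running", "enqueued"]),
    "i": SuccessorChain(stopped=True, order_phases=["enqueued"]),
}
```

In this model `successors_to_start(["f", "g", "h", "i"], chains)` selects `f` and `g`. With `force_start=False` only `g` remains.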
Usage for Synchronization
Applicable Jobs
- Jobs that are based on the JITL class com.sos.jitl.sync.JobSchedulerSynchronizeJobChainsJSAdapterClass are used to synchronize orders across multiple job chains.
- Such jobs should
  - use the Monitor Script jobnet_check_successor.js in order to work consistently for dependency checking.
  - associate the job node state jobnet_sync to the synchronization job.
Sample Synchronization Job
A synchronization job should add a reference to the Monitor Script for successor checking like this:
Run-Time Behaviour
- A synchronized order is processed by the synchronization job in its job chain. That order remains suspended as long as no complementary orders from other job chains are available for synchronization jobs with the same job name.
- Successor checking is performed by the Monitor Script and is executed for all synchronized orders. As a result, successor checking may start orders for successor job chains that are required to complete the synchronization of the current order.
Components
For a detailed explanation see Dependency Handling - Components
Settings
- Dependency checking can be configured by settings.
- Such settings can be applied for all job chains or for individual jobs and orders.
Where to configure settings?
- Such settings can be applied at the following levels with ascending precedence:
The configuration in $SCHEDULER_DATA/config/scheduler.xml is effected by adding a parameter like this:

Configuration in file scheduler.xml

    <spooler>
      <config>
        <param name="jobnet.predecessor.force_start_for_orders_in_period" value="false"/>
        <param name="jobnet.successor.force_start_for_orders_in_period" value="true"/>
        ...

The configuration at job level is effected by adding a parameter like this:

Configuration by job parameter

    <job order="yes">
      <params>
        <param name="jobnet.predecessor.force_start_for_orders_in_period" value="true"/>
        <param name="jobnet.successor.force_start_for_orders_in_period" value="true"/>
      </params>
      ...

The configuration at order level is effected by adding a parameter like this:

Configuration by order parameter

    <order>
      <params>
        <param name="jobnet.predecessor.force_start_for_orders_in_period" value="true"/>
        <param name="jobnet.successor.force_start_for_orders_in_period" value="true"/>
      </params>
      ...
What settings can be configured?
Setting | Default Value | Scope | Description |
---|---|---|---|
jobnet.predecessor.force_start_for_orders_in_period | false | predecessor checking | A value true will force the current order to execute even if orders are found in a predecessor job chain that are scheduled for the current period. |
jobnet.successor.force_start_for_orders_in_period | true | successor checking | A value false will suppress the execution of orders in successor job chains even if those orders were scheduled for the current period. |
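The ascending precedence of the configuration levels (scheduler.xml, then job, then order) together with the default values from the table can be sketched as follows. This is an illustration only: `resolve_setting` and the dictionary based representation of the parameters are hypothetical and do not reflect how JobScheduler reads its configuration.

```python
from typing import Mapping, Optional

# Default values as listed in the settings table.
DEFAULTS = {
    "jobnet.predecessor.force_start_for_orders_in_period": False,
    "jobnet.successor.force_start_for_orders_in_period": True,
}


def resolve_setting(
    name: str,
    scheduler_params: Optional[Mapping[str, str]] = None,
    job_params: Optional[Mapping[str, str]] = None,
    order_params: Optional[Mapping[str, str]] = None,
) -> bool:
    """Resolve a setting, later levels overriding earlier ones (hypothetical helper)."""
    value = DEFAULTS[name]
    # Ascending precedence: scheduler.xml < job parameter < order parameter.
    for level in (scheduler_params, job_params, order_params):
        if level is not None and name in level:
            # Parameter values are given as the strings "true" or "false".
            value = level[name].strip().lower() == "true"
    return value


name = "jobnet.predecessor.force_start_for_orders_in_period"
from_default = resolve_setting(name)
from_order = resolve_setting(
    name,
    scheduler_params={name: "false"},
    job_params={name: "false"},
    order_params={name: "true"},
)
```

Here `from_default` is the table default `False`, whereas `from_order` is `True` because the order parameter overrides both the job parameter and scheduler.xml.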
What is missing?
- Testing
  - The described behaviour is currently being tested.
- Exceeded Periods
  - Currently a period spans one day.
  - Due to manual intervention or due to operational problems a period could last longer than one day, e.g. if jobs were delayed and would be executed over midnight or started after midnight.
- Multiple Job Networks
  - Awareness for multiple job networks is required.
- Repeated Execution
  - Repeated execution should be enabled/disabled by configuration.
- Active Cluster Support
  - This feature relies on the fact that job chains can be stopped to prevent them from being considered as active predecessors. Currently this is not feasible for cluster operation with distributed job chains. The following issues address this limitation and should be resolved with version 1.8:
- Orders without Run-Time
  - Such orders are required to enable the start of an order by the dependency checking Monitor Scripts.
  - Currently orders without run-time are not considered. Instead, such orders should be identified as empty orders that can be used by the Monitor Scripts to start successor job chains.