Example: Multiple parallel processes in a job chain
The goal:
- Write a job chain
...
- Define some nodes at the beginning of the chain.
- Where you want to parallelize execution, add a node with the splitter job.
- Define the next node for the splitter. This can be the sync node or any other node in the chain.
- Define the number of orders that should run in parallel and the nodes at which these orders should start. Use a list such as 100;200;300, which creates three additional orders that all run in parallel.
- Define the sync node with the JITL sync job. All orders must reach this node. When all orders have arrived at this node, execution proceeds with the original order. The new orders end at the sync node.
- Note that each set of parallel processes needs its own unique sync job.
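The order bookkeeping described in the steps above can be sketched in a few lines of Python. This is only an illustrative model, not JobScheduler code: the function names `parse_start_nodes` and `orders_awaited_at_sync` are hypothetical, and the only detail taken from the text is the semicolon-separated list such as 100;200;300.

```python
def parse_start_nodes(value: str) -> list[str]:
    """Split a semicolon-separated node list such as "100;200;300"."""
    return [name.strip() for name in value.split(";") if name.strip()]


def orders_awaited_at_sync(value: str) -> int:
    """Count the orders the sync node waits for.

    Assumption (taken from the steps above): one additional order is
    created per listed node, and the original order has to reach the
    sync node as well.
    """
    return len(parse_start_nodes(value)) + 1


print(parse_start_nodes("100;200;300"))        # ['100', '200', '300']
print(orders_awaited_at_sync("100;200;300"))   # 4
```

With three start nodes the sync node therefore waits for four orders in this model: the three new ones plus the original.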
You can download an example here: Splitter.zip
Best Practices for job chains with parallel processing
Example: Multiple parallel processing in a single job chain
Case: We want to create a job chain that first runs six insert table jobs in parallel.
Once all six jobs have finished, a create index job is to be started.
Several test jobs are then to run on the data that has been created. The test jobs are again to be processed in parallel.
Diagram of the job chain, created with an experimental version of JOE (JobScheduler Object Editor) that is to be delivered with a future release.
The next picture shows the jobs of the job chain as listed in JOE (Steps/Nodes).
- Use a split job to start parallel processing. There has to be a corresponding sync job for each split job.
- Use a unique sync job (i.e. one with a unique name) for each set of parallel processes within one JobScheduler instance. This sync job must not be used in any other job chain of that instance. One sync job evaluates the information from all splitter jobs where this sync job is used.
- To identify the sync jobs in the JOE job list and to keep their names unique, we strongly recommend using the name of the job chain as a prefix in the name of the sync job.
Example: insert_to_core_parallel.sync1
- For the state_names parameter of the splitter job, we recommend naming the nodes that are processed in parallel with the syntax "splitter node name" ":" "job name".
Example: split:partition_1
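The sync job naming recommendation can be checked mechanically. The following is a minimal sketch, assuming you keep your own inventory of which sync jobs each job chain uses; `sync_job_name`, `shared_sync_jobs` and the chain names in the inventory are hypothetical and not part of JobScheduler.

```python
from collections import Counter


def sync_job_name(job_chain: str, suffix: str) -> str:
    """Prefix the sync job with the job chain name, e.g. chain.sync1."""
    return f"{job_chain}.{suffix}"


def shared_sync_jobs(sync_jobs_by_chain: dict[str, list[str]]) -> list[str]:
    """Return sync job names that occur more than once across all chains."""
    counts = Counter(
        name for names in sync_jobs_by_chain.values() for name in names
    )
    return sorted(name for name, n in counts.items() if n > 1)


print(sync_job_name("insert_to_core_parallel", "sync1"))
# insert_to_core_parallel.sync1

# Hypothetical inventory: two chains that wrongly share a generic "sync" job.
inventory = {
    "chain_a": ["sync"],
    "chain_b": ["sync", sync_job_name("chain_b", "sync2")],
}
print(shared_sync_jobs(inventory))  # ['sync']
```

Any name reported by `shared_sync_jobs` is a candidate for renaming with the job chain prefix.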
Display detail from JOE showing the parameters for the first splitter job split
Display detail from JOE showing the parameters for the second splitter job split_4_test
We also recommend using the start job /sos/jitl/JobChainStart and the end job /sos/jitl/JobChainEnd in every job chain.
- The job chain starts with the job named truncate_export_table.
- After this job has been completed four jobs named table partition are to be executed in parallel.
- A single job that indexes the new partition tables is then to run.
- Finally, a further four jobs that test the partition tables are to start in parallel.
This job chain is shown schematically in the diagram in the Diamond section below.
Writing the job chain
The following steps have to be followed to achieve a job chain that meets the requirements listed above:
- A "splitter" job has to be included for each "set" of job nodes that are to be processed in parallel. The splitter job starts the parallel jobs as soon as it itself is started.
- In order to do this the splitter job has to "know" the names of the parallel nodes, which are specified in the splitter job's state_names parameter (see How to set and read job and order parameters).
- The parallel processing normally ends at a specific node in the chain; thereafter processing continues serially. This node is the synchronization node and is implemented using the sync job.
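The control flow that the splitter and sync nodes produce (serial, parallel, serial, parallel) can be imitated outside JobScheduler with a thread pool. This is a loose analogy only: `run_job`, `run_parallel` and `run_chain` are hypothetical helpers, and every job name except truncate_export_table is made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(name: str) -> str:
    """Stand-in for a real job; it only reports its own name."""
    return f"{name} done"


def run_parallel(names: list[str]) -> list[str]:
    """Start all jobs at once and return only when every one has finished.

    Collecting all results before returning plays the role of the sync node.
    """
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(run_job, names))


def run_chain() -> list[str]:
    log = [run_job("truncate_export_table")]                       # serial
    log += run_parallel([f"partition_{i}" for i in range(1, 5)])   # split + sync
    log.append(run_job("create_index"))                            # serial
    log += run_parallel([f"test_{i}" for i in range(1, 5)])        # split + sync
    return log


for line in run_chain():
    print(line)
```

The index job in this sketch cannot start before all four partition jobs have returned, which is the guarantee the sync node gives in the job chain.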
Diamond Diagram
The example job chain will look like this (diagram generated with the Sandbox JOE Version):
We refer to the pattern that results from this type of job chain as a "diamond" pattern. Diamonds can occur more than once in a job chain: sequentially (as shown in the diagram above), in parallel, or nested. They can also be combined with other job chain patterns such as emerald or cross-over patterns (see Example showing the synchronization of multiple job chains).
Job chain list view
The next illustration shows a list view of the job chain as produced by JOE:
The "Splitter" job
A generic splitter job is delivered with the JobScheduler JITL Jobs. This job can be found in the ./live/sos/jitl directory.
We recommend that you use the following syntax for the names of job nodes that are processed in parallel:
- "splitter job node name" ":" "job name". In the example diagram above, one of the first nodes would then have the name split_partitions:partition_1.
This syntax allows the diagram algorithm in JOE (which was used to draw the "diamond" diagram shown above) to identify and correctly display the nodes that directly follow the splitter. The algorithm needs this naming syntax because the job chain syntax used by JobScheduler records only successor relationships, not predecessors.
The use of the above syntax is not necessary for the correct functioning of the JobScheduler.
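To see why the naming syntax helps a drawing tool, consider how a predecessor could be recovered from the node name alone. The sketch below is not JOE's actual algorithm, which is not described here; `splitter_of` and `nodes_by_splitter` are hypothetical helpers that only assume the "splitter node name":"job name" convention.

```python
from typing import Optional


def splitter_of(node_name: str) -> Optional[str]:
    """Return the splitter node encoded in a name, or None if there is none."""
    prefix, sep, job = node_name.partition(":")
    return prefix if sep and prefix and job else None


def nodes_by_splitter(node_names: list[str]) -> dict[str, list[str]]:
    """Group parallel nodes under the splitter that precedes them."""
    groups: dict[str, list[str]] = {}
    for name in node_names:
        splitter = splitter_of(name)
        if splitter is not None:
            groups.setdefault(splitter, []).append(name)
    return groups


print(splitter_of("split_partitions:partition_1"))  # split_partitions
print(splitter_of("create_index"))                  # None
```

A node name without the prefix carries no information about its predecessor, which is the gap the convention fills.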
Splitter job parameters
See documentation of job JobChainSplitter.
The state_names parameter
- The splitter job state_names parameter is used to specify the node names of the jobs that are to be started in parallel (see How to set and read job and order parameters).
- The node names are to be separated by semi-colons.
- In job chains with this diamond pattern structure, the parameters are specified for the job chain and referred to as node parameters. Node parameters can be used to specify parameters for more than one splitter in a job chain, independently of one another, as in our example, without creating conflicts.
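As an illustration of the points above, the value of state_names can be assembled from the splitter node name and the job names, and kept separately for each splitter node. This is a sketch only: the dictionary merely models the idea of node parameters and is not the JobScheduler configuration format, and `state_names_value`, split_tests and the job names are hypothetical.

```python
def state_names_value(splitter_node: str, job_names: list[str]) -> str:
    """Join "<splitter node>:<job name>" entries with semicolons."""
    return ";".join(f"{splitter_node}:{job}" for job in job_names)


# Parameters keyed by node name: each splitter has its own state_names,
# so the two values cannot overwrite each other.
node_params = {
    "split_partitions": {
        "state_names": state_names_value(
            "split_partitions", ["partition_1", "partition_2"]
        )
    },
    "split_tests": {
        "state_names": state_names_value("split_tests", ["test_1", "test_2"])
    },
}

print(node_params["split_partitions"]["state_names"])
# split_partitions:partition_1;split_partitions:partition_2
```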
The parameters for the split_partitions splitter job - as shown in JOE - are:
The "Sync" job
A unique sync job is required at the end of every set of processes running in parallel (see Example showing how to set up a sync job) whenever the nodes that follow the sync node are only to be processed after all the parallel jobs (tasks) have been completed without errors.
Each sync job has to be unique within a JobScheduler instance - and within a job chain as long as a cross-over pattern has not been implemented (see Example showing the synchronization of multiple job chains).
For more information see the documentation for the JobSchedulerSynchronizeJobChains job.
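The waiting behaviour of a sync node can be modelled as a gate that opens only once every expected order has arrived. This toy model makes no claim about how the JobSchedulerSynchronizeJobChains job is implemented; the class `SyncGate` and the order ids are hypothetical.

```python
class SyncGate:
    """Toy model of a sync node: it opens once every expected order arrived."""

    def __init__(self, expected_orders: set[str]) -> None:
        self.expected = set(expected_orders)
        self.arrived: set[str] = set()

    def arrive(self, order_id: str) -> bool:
        """Register an order; return True when all expected orders are in."""
        if order_id not in self.expected:
            raise ValueError(f"unexpected order: {order_id}")
        self.arrived.add(order_id)
        return self.arrived == self.expected


gate = SyncGate({"main", "main_partition_1", "main_partition_2"})
print(gate.arrive("main_partition_1"))  # False
print(gate.arrive("main"))              # False
print(gate.arrive("main_partition_2"))  # True
```

If two unrelated sets of parallel orders shared one such gate, their arrivals would be mixed up, which is one way to picture why each sync job has to be unique.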
Best Practice
Use "Start" and "End" nodes:
We recommend that you use our /sos/jitl/JobChainStart start job as the first node in every job chain and our /sos/jitl/JobChainEnd end job as the last full node.
Assign each sync job a unique name:
Assign each sync job a unique name by using the name of the job chain in which the sync job is included as a prefix in the name of the sync job.
- For example: ideal_insert_to_export_table_parallel.sync_partitions
Follow our convention for node naming:
Parallel Nodes
We recommend that you use the following syntax for the names of job nodes that are processed in parallel:
"splitter job node name" ":" "job name".
In the example described above, one of the first nodes would then be split_partitions:export_table_partition_1.
This allows the diagram algorithms in JOE to know and correctly display the nodes that directly follow on from the splitter. This is because the JobScheduler syntax does not recognize predecessor relationships (only successors).
Job Nodes
As far as possible, the names of job nodes should be identical to the job names (possibly without the folder name). If a job is used more than once in a job chain, the node name can be made unique by adding a letter or number as a suffix.
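A minimal sketch of this convention, assuming job paths use "/" as the folder separator and that an underscore plus a running number is an acceptable suffix (the text only asks for a letter or number). The helper `node_names` and the /etl paths are hypothetical.

```python
def node_names(job_paths: list[str]) -> list[str]:
    """Derive node names from job paths, dropping the folder part.

    A job used more than once gets a numeric suffix from its second use on.
    """
    seen: dict[str, int] = {}
    names = []
    for path in job_paths:
        base = path.rsplit("/", 1)[-1]
        seen[base] = seen.get(base, 0) + 1
        names.append(base if seen[base] == 1 else f"{base}_{seen[base]}")
    return names


print(node_names(["/etl/truncate_export_table", "/etl/check", "/etl/check"]))
# ['truncate_export_table', 'check', 'check_2']
```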
Error Nodes
The name of the error node should either contain the job name or be identical to it. In the event of a processing error in the job chain, the point at which the abnormal termination occurred can then be seen immediately in JOC.
In addition, the name should start with an exclamation mark ("!") or another unique special character. This makes it easier to see in the order history in JOC that the job chain has terminated abnormally.
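The error node convention amounts to a simple prefix rule, sketched here with two hypothetical helpers; the job name create_index is made up for the illustration.

```python
def error_node_name(job_name: str, marker: str = "!") -> str:
    """Prefix the job name with a marker character to flag an error node."""
    return f"{marker}{job_name}"


def is_error_node(node_name: str, marker: str = "!") -> bool:
    """True if the node name follows the error node convention."""
    return node_name.startswith(marker)


print(error_node_name("create_index"))  # !create_index
print(is_error_node("!create_index"))   # True
print(is_error_node("create_index"))    # False
```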
See also:
- JobSchedulerSynchronizeJobChains.
- For examples of using sync jobs, see Examples - Jobs, Job Chains and Orders
Downloads
You can download the example described in this FAQ: insert_to_export_table_parallel.zip.
You can download the example here: MultiParallel.zip