Example: Multiple parallel processes in a job chain
The goal:
- Write a job chain that starts with the job named "truncate_export_table".
- After this job has completed, four jobs named "table partition" are to be run in parallel.
- A single job that indexes the new partition tables is then to run.
- Finally, a further four jobs that test the partition tables are to start in parallel.
This job chain is shown schematically in the diagram in the "Diamond" section below.
Writing the job chain
The following steps have to be followed to achieve a job chain that meets the requirements listed above:
- A "splitter" job has to be included for each "set" of job nodes that are to be carried out in parallel. The splitter job starts the parallel jobs as soon as it is itself started.
- To do this, the splitter job has to "know" the names of the parallel nodes, which are specified in the splitter job's state_names parameter. - !! to be done !! - : link to the Node Parameter Definition wiki article.
- The parallel processing normally ends at a specific node in the chain, after which processing continues serially. This node is the synchronisation node and is implemented using the sync job.
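The structure that these steps produce can be sketched as a plain data model. The Python below is only an illustration and is not JobScheduler configuration. Apart from truncate_export_table, split_partitions and split_partitions:partition_1, every node name in it is hypothetical, and placing a separate sync node after each set of parallel nodes is an assumption of this sketch.

```python
# Illustrative model only: a job chain as an ordered list of steps, where a
# splitter step carries the names of the nodes it starts in parallel.
# All names except "truncate_export_table", "split_partitions" and
# "split_partitions:partition_1" are hypothetical.

def parallel_nodes(splitter, jobs):
    """Node names for jobs started by a splitter, written '<splitter>:<job>'."""
    return [f"{splitter}:{job}" for job in jobs]

partitions = parallel_nodes("split_partitions", [f"partition_{i}" for i in range(1, 5)])
tests = parallel_nodes("split_tests", [f"test_{i}" for i in range(1, 5)])

# Each tuple is (node name, nodes that this node starts in parallel).
chain = [
    ("truncate_export_table", []),
    ("split_partitions", partitions),  # splitter: starts four partition jobs
    ("sync_partitions", []),           # sync node: waits for all four (assumed name)
    ("index_partitions", []),          # single indexing job (assumed name)
    ("split_tests", tests),            # second splitter: starts four test jobs
    ("sync_tests", []),                # second sync node (assumed)
]

for node, branches in chain:
    print(node)
    for branch in branches:
        print("    ||", branch)
```

Printing the chain this way makes the two sequential "diamonds" visible: each splitter fans out into four branches, and the node that follows it collects them again.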
"Diamond" diagram
The example job chain will look like this (diagram generated with the Sandbox JOE Version):
We refer to the pattern that results from this type of job chain as a "diamond" pattern. Diamonds can occur more than once in a job chain: sequentially, as shown in the diagram above, in parallel, or nested. They can also be combined with other job chain patterns such as emerald or cross-over patterns.
Job chain list view
The next illustration shows a list view of the job chain as produced by JOE:
The "Splitter" job
A generic splitter job is delivered with the JobScheduler JITL jobs. This job can be found in the "./live/sos/jitl" directory.
We recommend that you use the following syntax for the names of job nodes that are processed in parallel:
- "splitter job node name" ":" "job name" - in the example diagram above, one of the first nodes would be split_partitions:partition_1.
This allows the diagram algorithm to identify and correctly display the nodes that directly follow the splitter. It is needed because the JobScheduler syntax does not recognise predecessor relationships (only successors).
The use of the above syntax is not necessary for the correct functioning of the JobScheduler.
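To show why the naming syntax helps, here is a minimal sketch of how a tool could recover the splitter (the predecessor) from a node name alone. This is not the actual diagram algorithm used by JOE; the function name splitter_of is hypothetical.

```python
def splitter_of(node_name):
    """Return the splitter node encoded in '<splitter>:<job>', or None.

    A node name without a colon carries no predecessor information.
    """
    prefix, separator, _job = node_name.partition(":")
    return prefix if separator else None

print(splitter_of("split_partitions:partition_1"))  # split_partitions
print(splitter_of("truncate_export_table"))         # None
```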
Splitter job parameters
- !! to be done !! - link to jobdoc.
The state_names parameter
- The splitter job state_names parameter is used to specify the node names of the jobs that are to be started in parallel (see Setting_parameters).
- The node names are to be separated by semicolons.
- In chains with this diamond pattern structure, the parameters are specified for the job chain and referred to as node parameters. Node parameters can be used to specify parameters for more than one splitter in a job chain, independently of one another, as in our example, without creating conflicts.
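As an illustration of the points above, the following sketch builds a semicolon-separated state_names value and keeps the parameters of two splitters apart by keying them on the splitter's node name. The helper state_names_value, the node_params dictionary, and all node names other than split_partitions and split_partitions:partition_1 are assumptions made for this example; it does not show how JobScheduler stores node parameters.

```python
def state_names_value(node_names):
    """Join node names with semicolons, as expected by state_names."""
    for name in node_names:
        if not name or ";" in name:
            raise ValueError(f"invalid node name: {name!r}")
    return ";".join(node_names)

partition_nodes = [f"split_partitions:partition_{i}" for i in range(1, 5)]
test_nodes = [f"split_tests:test_{i}" for i in range(1, 5)]  # hypothetical names

# One parameter set per splitter node, so the two splitters do not conflict.
node_params = {
    "split_partitions": {"state_names": state_names_value(partition_nodes)},
    "split_tests": {"state_names": state_names_value(test_nodes)},
}

for node, params in node_params.items():
    print(node, "->", params["state_names"])
```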
The parameters for the split_partitions splitter job - as shown in JOE - are:
The "Sync" job
A unique sync job is required at the end of every set of processes running in parallel (see - !! to be done !! - Setting_up_a_sync_job) when the nodes that follow the sync node in the job chain are only to be processed after all the jobs (tasks) running in parallel have completed without errors.
Each sync job has to be unique within a JobScheduler instance - and within a job chain as long as a - !! to be done !! - cross-over pattern has not been implemented.
For more information see the documentation for the JobSchedulerSynchronizeJobChains job.
Best practices
Use start and end nodes
We recommend using the start job /sos/jitl/JobChainStart in the first node and the end job /sos/jitl/JobChainEnd in the last node of every job chain.
Unique name for the sync job
To give the sync job a unique name, we recommend using the name of the job chain in which the sync job is used as a prefix of the sync job's name.
Example: ideal_insert_to_export_table_parallel.export_table_build_sync
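A minimal sketch of this convention, assuming a dot as the separator as in the example name above. The helper sync_job_name and the second chain name are hypothetical. The check at the end only illustrates that the prefix keeps identically named sync jobs from different chains distinct.

```python
def sync_job_name(job_chain, sync_job):
    """Prefix the sync job's name with the name of the job chain that uses it."""
    return f"{job_chain}.{sync_job}"

chains = {
    "ideal_insert_to_export_table_parallel": ["export_table_build_sync"],
    "another_chain": ["export_table_build_sync"],  # hypothetical second chain
}

names = [sync_job_name(chain, job) for chain, jobs in chains.items() for job in jobs]

# The chain-name prefix makes the two sync job names distinct.
assert len(names) == len(set(names)), "sync job names must be unique"
print(names[0])  # ideal_insert_to_export_table_parallel.export_table_build_sync
```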
Conventions for node names
Splitter nodes
We recommend starting the node name of a splitter job with the string split, for example split_partitions. This lets the algorithm that generates the diagram "know" that the node is a splitter node, so that it can display it correctly. There is no "splitter" node type in the job node syntax.
Parallel nodes
For the node names of the jobs to be processed in parallel we recommend the syntax "splitter job node name" ":" "job name", for example split_partitions:partition_1. This lets the diagram algorithm "know" which nodes are the direct successors of the splitter (that is, which nodes have the splitter as their predecessor), so that it can display them correctly. The JobScheduler syntax has no predecessor relationship, which is why the relationship is expressed through the node name instead.
Job nodes
Wherever possible, the name of a job node should be identical to the job name (if necessary without the folder name). If a job is used more than once in a job chain, the node names can be made unique by appending a digit (or number).
Error nodes
The name of an error node should contain the job name or be identical to it. If an error occurs while the job chain is running, this makes it immediately clear in JOC at which point the job chain ended abnormally.
In addition, the name should begin with a "!" (exclamation mark, or another unambiguous special character). This makes it obvious at first glance in the JOC order history that the job chain ended abnormally.
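The conventions for job and error nodes can be sketched as two small helpers. Both function names and the job paths are hypothetical, and the underscore before the appended number is an assumption of this sketch, not a JobScheduler requirement.

```python
from collections import Counter

def base_name(job_path):
    """Job name without its folder, e.g. '/export/load' -> 'load'."""
    return job_path.rsplit("/", 1)[-1]

def node_names_for(job_paths):
    """Node name = job name; jobs used more than once get a numeric suffix."""
    totals = Counter(base_name(path) for path in job_paths)
    seen = Counter()
    names = []
    for path in job_paths:
        name = base_name(path)
        if totals[name] > 1:
            seen[name] += 1
            name = f"{name}_{seen[name]}"
        names.append(name)
    return names

def error_node_name(job_path):
    """Error node name: '!' followed by the job name."""
    return "!" + base_name(job_path)

jobs = ["/export/check", "/export/load", "/export/check"]  # hypothetical paths
print(node_names_for(jobs))           # ['check_1', 'load', 'check_2']
print(error_node_name("/export/load"))  # !load
```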
See also
Downloads
You can download the example used here: insert_to_export_table_parallel.zip.