Introduction
- Users might be interested in specifying jobs and dependencies from Microsoft Excel® spreadsheets.
- In a situation when business staff is in charge of specifying jobs then Excel® is a frequently used tool.
- As a no-code approach this allows to specify jobs and dependencies by users without detailed know-how of JS7 - Workflows.
- The solution explained from the article creates jobs and workflows from Excel® spreadsheets ready for import into JS7.
- The solution makes use of a PowerShell cmdlet for Linux and Windows and can be operated for PowerShell 5.1, 6, 7.
- The solution works for .xlsx files. Files from formats such as .xls and .csv have to be imported to Excel®.
- The solution is both a tool for simple use cases and a programming example how to create workflows from Excel®. More elaborate approaches for processing of customized Excel® input are subject to SOS professional services.
Prerequisites
The following PowerShell modules are used for this purpose:
- ImportExcel PowerShell Module (3rd party) to read Excel® spreadsheets on Linux and Windows.
- Optionally the JS7 PowerShell Module is used to access the JS7 - REST Web Service API for automated import and deployment.
- If the module is not used then users can manually import resulting workflows using the JOC Cockpit user interface.
Download
The following files are used for the solution:
- Sample Excel® spreadsheet file: jobs.xlsx
- PowerShell cmdlet (verify signed scripts): New-JS7WorkflowFromExcel.ps1
Getting Started
Specifying Jobs and Dependencies with Excel®
Download the jobs.xlsx Excel® spreadsheet file.
- The worksheets included with the spreadsheet file offer a number of jobs and dependencies from the following columns:
NodeID
: A unique identifier for the occurrence of a job. If the same job occurs a number of times then theName
column holds the same job name, however, a unique value has to be specified for theNodeID
column. NodeIDs have to start with a character and must not include spaces.SuccessorID
: Specifies the NodeIDs of jobs that are direct successors of the current job. If more than one job is specified then they will will be executed in parallel branches. A number of NodeIDs in this column are separated by a space.Instruction
: Makes use of the fixed value Job. This column is reserved for later support of further JS7 - Workflow Instructions. This column can be missing or hold empty values.Name
: Specifies the job name. This column is required and cannot hold empty values. Job names have to start with a character and must not include spaces. The same job name can be used in a number of rows indicating that the job is re-used for each occurrence with a uniqueNodeID
.Description
: Optionally specifies a descriptive title for the job. This column can be missing or hold empty values.Argument
: Optionally specifies arguments that are passed to jobs.Agent, Subagent Cluster
: Specifies the name of a Standalone Agent that is configured in your JS7 scheduling environment. If an Agent Cluster is used then its name is specified as theAgent
and the respectiveSubagent Cluster
is specified, see JS7 - Agent Cluster. If the columns are missing or hold empty values then a default Agent name can be specified from the PowerShell cmdlet's-AgentName
parameter, see next chapters.Script
: Specifies the job script that will be executed. Multiline input is allowed. This column is required and cannot hold empty values.Job Resource
: Specifies one or more JS7 - Job Resources separated by a space that will be assigned the job. This column is not required.Job Class
: Specifies one of two values for Shell jobs or JITL jobs from JS7 - JITL Job Templates. If this column is missing, then then job class Shell is assumed.Job Template
: If aJob Class
with the value JITL is specified then this column holds the Java class name of the job template. This column is required for use with JITL jobs.Error Handling
: Optionally specifies one of the following options for JS7 - How to apply error handling. If the column is missing or holds empty values then then STOP error handling will be applied.- STOP: If the job fails then the order will be put to the FAILED state and will remain with the job. Failed orders require user intervention to resume, to suspend or to cancel the respective order.
- IGNORE: Any job error is ignored and order processing continues, see JS7 - Try-Catch Instruction.
- RETRY: The job will be retried in case of error for 3 times with a delay of 60s, see JS7 - Retry Instruction.
- LEAVE: The order will leave the workflow with an unsuccessful History outcome, see JS7 - Finish Instruction.
Fail on stderr
: Optionally specifies that in case that the job will write output to the stderr channel the job will be considered to be failed, see JS7 - Job Instruction. The column holds Boolean values. If the column is missing or holds empty values then the value False will be applied.
- Users can adjust column names, see JS7 - PowerShell Import from Excel Script - New-JS7WorkflowFromExcel.ps1.
Explanation:
- Job dependencies are designed along a Directed acyclic graph (DAG). This includes a few rules for input to the Excel® spreadsheet:
- One or more initial jobs are required as the graph's root node(s). Initial jobs must not be specified as successors in any of the
SuccessorID
column values. - A final job can be specified that does not indicate a successor job in its
SuccessorID
column. - The order of entries to the Excel® spreadsheet is not relevant.
- Not too much error checking of the DAG is performed, however, missing job nodes will be detected.
- One or more initial jobs are required as the graph's root node(s). Initial jobs must not be specified as successors in any of the
- In a simplified way a directed graph represents job dependencies from two columns:
- The
NodeID
column specifies the unique identifier of the given occurrence of a job. - The
SuccessorID
column specifies one or more NodeIDs of successor jobs that are separated by a space.- If more than one NodeID is specified then this will fork processing of the indicated successor jobs to be performed in parallel branches.
- To join parallel branches of jobs the NodeIDs specified from the
SuccessorID
column of the forking job have to hold empty values in their respectiveSuccessorID
column or hold the same NodeID of a later job. - Forking can include any number of parallel branches specified by NodeIDs in a
SuccessorID
column. - Forking can be nested preferably at not more than 15 levels.
- The
Running the PowerShell Cmdlet
Download the PowerShell cmdlet: New-JS7WorkflowFromExcel.ps1
- Basically two arguments have to be provided:
-ExcelPath
: Specifies the path to the Excel® spreadsheet file that holds the specification of jobs and dependencies.- -
OutputDirectory
: Specifies the directory to which workflow *.json files will be stored.
- JS7 workflow files will be added to the
js7-import.zip
archive file in the current working directory.- A different location and file name can be specified using the
-OutputArchive
parameter. - Users can import the archive file into JS7 using the JOC Cockpit's Configuration view that offers the Import button. The archive file does not include a folder hierarchy but holds workflows from its root. When importing archive files with JOC Cockpit users can specify the target folder to which workflows will be added.
- For details see JS7 - Inventory Export and Import.
- A different location and file name can be specified using the
./New-JS7WorkflowFromExcel.ps1 -ExcelPath /home/sos/excel/input/jobs.xlsx -OutputDirectory /home/sos/excel/output/jobs
./New-JS7WorkflowFromExcel.ps1 -ExcelPath "C:\js7\excel\input\jobs.xlsx" -OutputDirectory "C:\js7\excel\output\jobs"
The above examples will create workflows per each worksheet included with the Excel® spreadsheet file. The name of the workflow corresponds to the worksheet. The resulting js7-import.zip
file will be created in the current working directory.
Users who prefer to run the PowerShell cmdlet directly from the Unix Shell can add the call to the PowerShell cmdlet and its parameters to an executable shell script like this:
#!/usr/bin/env pwsh $env:SCRIPT_HOME=$(cd "$(dirname "`$0")" >/dev/null && pwd) Set-Location -Path $env:SCRIPT_HOME ./New-JS7WorkflowFromExcel.ps1 ` -ExcelPath "$($env:SCRIPT_HOME)/input/jobs.xlsx" ` -OutputDirectory "$($env:SCRIPT_HOME)/output/jobs"
Managing Job Arguments
Job arguments optionally are specified from the following formats:
Separator | Quoting | Data Type | Example | Argument Usage | |
---|---|---|---|---|---|
JSON Style | comma | yes | numeric | { "p_calc_id": 365985, "p_calc_ord": 1 } | Default: numeric data type |
comma | no | numeric | { p_calc_id : 365985, p_calc_ord : 1 } | Default: numeric data type | |
comma | yes | string | { "p_calc_id": "AB321", "p_calc_ord": "RollingOrder" } | Default: string data type | |
comma | yes | Boolean | { is_valid: true, is_invalid: false } | Default: Boolean data type | |
HashTable Style | comma | no | string | p_calc_id = AB321, p_calc_ord=RollingOrder | Default separator is the comma |
semicolon | no | numeric | p_calc_id = 365985 ; p_calc_ord=RollingOrders | Requires -ArgumentSeparator ';' | |
newline | no | string | p_calc_id = AB321 p_calc_ord=RollingOrders | Default: arguments separated by newline |
Explanation:
- JSON style arguments are specified using
{ ... }
. Otherwise arguments are considered being specified in HashTable style. - JSON style arguments require the comma to be used as a separator. Quoting of argument names is optional.
- HashTable style arguments by default consider the comma or newline used as a separator.
- The argument separator can be specified using
-ArgumentSeparator '<separator-character>'
. - Data types for string input and numeric input are dynamically assigned.
- The argument separator can be specified using
- HashTable style arguments do not suggest quoting.
- For both JSON style arguments and HashTable style arguments spaces between argument names, values and separators are not relevant.
Automating Import
Users who wish to automate the process of importing workflows to JOC Cockpit can use the JS7 - PowerShell Module that performs operations based on the JS7 - REST Web Service API.
- To establish the connection to JOC Cockpit see JS7 - How to connect to JOC Cockpit using the PowerShell Module.
- The .zip archive file that is created in the previous step from the
New-JS7WorkflowFromExcel.ps1
cmdlet is specified for import into JOC Cockpit. - Users can perform additional operations, for example to deploy imported workflows, see PowerShell CLI 2.0 - Cmdlets - Inventory Management.
Import-Module JS7 Connect-JS7 -Url http://root:root@localhost:4446 -Id controller # connect using the default root account and root password, check to apply more secure connection methods Import-JS7InventoryItem -TargetFolder /imported -FilePath /js7/excel/js7-import.zip -Overwrite # imports the .zip archive to the JOC Cockpit's /imported folder and overwrites existing workflows with the same name Disconnect-JS7
Examples
Examples for advanced parameterization are provided as follows.
Create Workflows for specific Worksheets and Column Names
./New-JS7WorkflowFromExcel.ps1 ` -ExcelPath /home/sos/excel/input/jobs.xlsx ` -WorksheetName Workflow01 ` -WorksheetColumns @{ 'nodeid' = 'NodeID'; 'successorid' = 'SuccessorID'; 'instruction' = 'Instruction'; 'name' = 'Name'; 'description' = 'Description'; 'argument' = 'Argument'; 'agent' = 'Agent'; 'subagentCluster' = 'Subagent Cluster'; 'script' = 'Script'; 'errorHandling' = 'Error Handling'; 'failOnStderr' = 'Fail on stderr' } ` -OutputDirectory /home/sos/excel/output/jobs ` -WorkflowName EOD-Workflow ` -Title "EOD Processing" ` -AgentName primaryAgent
Explanation:
- The
-WorksheetName
parameter is used to limit creation of workflows to the given worksheet within the Excel® spreadsheet file. - The call to the PowerShell cmdlet specifies the
-WorksheetColumns
parameter that holds a hashtable for the mapping of internal names and column names in the given spreadsheet file. This allows to use different column names in a spreadsheet file that will be mapped to internal names. Any additional columns in a spreadsheet are ignored. - The
-WorkflowName
parameter specifies the name of the workflow. If this parameter is omitted then the workflow name will be used from the name of the worksheet. - The
-Title
parameter specifies the title that will be assigned the workflow. - The
-AgentName
parameter specifies the name of the Agent that will be assigned to jobs for which the Excel® spreadsheet file does not specify a value in the related column.
Create Workflows from a number of Excel® Spreadsheet Files
Get-ChildItem *.xlsx | Foreach-Object { ./New-JS7WorkflowFromExcel.ps1 ` -ExcelPath $_.FullName ` -Title $_.BaseName ` -OutputDirectory /home/sos/excel/output/jobs }
Explanation:
- The
Get-ChildItem
Cmdlet will select any Excel® spreadsheet files in the current directory. Each Excel® file is pipelined to the PowerShell cmdlet. - This allows to create workflows from a larger number of Excel® spreadsheet files in a bulk operation.