Introduction

The JITL MonitoringJob template can be used to perform health checks of JS7 JOC Cockpit, Controller and Agents. Health check results can be forwarded, for example by mail.

  • Users can use health status results for integration with their monitoring system.
  • SOS offers a 24/7 Monitoring Service that receives health status results for customers who use a commercial license and subscribe to this support option, see JS7 - License.

The JITL MonitoringJob template can be used as a building block in a monitoring solution to:

  • repeatedly run the MonitoringJob template using a JS7 - Cycle Instruction,
  • forward health check results to a monitoring solution.
    • When used with a user's monitoring solution, this can include forwarding health check report files.
    • This can include sending e-mails containing a notice or an alert to SOS. Such mails do not include any data related to the user's JS7 environment; they only indicate a notice or an alert.
      • Alert mails are kept simple, for example:
      • [Image: sample alert mail]

The job template makes use of the JS7 - REST Web Service API to retrieve information from the JOC Cockpit.

...

Display feature availability
StartingFromRelease2.4.01

Usage

When defining the job either:

  • invoke the Wizard that is available from the job properties tab in the Configuration view and select the JITL MonitoringJob and relevant arguments from the Wizard

...

  • specify the JITL job class and the com.sos.jitl.jobs.monitoring.MonitoringJob Java class name and add the required arguments.

Example

Download (upload .json): pdmMonitoring.workflow.json

Using the Example

It is recommended to use the example as a starting point and to adjust the parameterization:

[Image: example workflow]


Explanation:

  • A JS7 - Cycle Instruction is used in order to repeatedly perform health status checks.
    • Users should adjust cycles to their monitoring needs.
  • A JS7 - Retry Instruction is used in order to retry execution, for example of the included MailJob if e-mail cannot be sent.
  • The MonitoringJob is used to perform the health status check.
  • The MailJob is used to send notices and alerts by mail. This is an option - users might apply other means to forward notices and alerts.


The Cycle Instruction is configured like this:

[Image: Cycle Instruction configuration]


Explanation:

  • A ticking cycle is used in order to perform health status checks precisely at the given hour and minute.
  • The cycle runs in hourly intervals for any days of week.
  • The cycle period starts at midnight and lasts 24 hours.
  • This example results in 24/7 coverage with the health status check being performed every hour.
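The schedule described above can be illustrated with a minimal Python sketch. It does not use any JS7 API: the function name ticking_cycle and its parameters are hypothetical and only model a ticking cycle that fires at fixed points in time within its period.

```python
from datetime import datetime, timedelta


def ticking_cycle(period_start: datetime, period_length: timedelta,
                  interval: timedelta) -> list:
    """Return the fixed points in time at which a ticking cycle fires.

    Hypothetical helper for illustration only, not part of JS7.
    """
    ticks = []
    tick = period_start
    period_end = period_start + period_length
    while tick < period_end:
        ticks.append(tick)
        tick += interval
    return ticks


# Period starts at midnight, lasts 24 hours, ticks every hour.
start = datetime(2022, 8, 17, 0, 0)
ticks = ticking_cycle(start, timedelta(hours=24), timedelta(hours=1))

print(len(ticks))                      # 24
print(ticks[0].strftime("%H:%M"))      # 00:00
print(ticks[-1].strftime("%H:%M"))     # 23:00
```

Because the ticks are derived from the period start and not from the end of the previous run, each check starts on the hour regardless of how long the previous check took.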


The Retry Instruction is configured like this:

[Image: Retry Instruction configuration]


Explanation:

  • If any of the jobs included in the Retry Instruction fails then execution is resumed starting from the first job.
  • Execution is repeated up to 3 times unless successful. The same interval of 1 minute is applied for each retry.
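The retry behavior described above can be sketched in Python as follows. This is an illustration only, not JS7 code: the function run_with_retry and the two sample jobs are hypothetical, and the sketch assumes that "up to 3 times" counts the total number of attempts.

```python
import time
from typing import Callable, Sequence


def run_with_retry(jobs: Sequence[Callable[[], None]], max_tries: int = 3,
                   delay_seconds: float = 60.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Run all jobs in order; if one fails, wait and restart from the first job.

    Returns the number of attempts used. Assumption: max_tries counts
    total attempts, not additional retries.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            for job in jobs:
                job()
            return attempt
        except Exception:
            if attempt == max_tries:
                raise          # give up after the last attempt
            sleep(delay_seconds)  # same interval before each retry


# Hypothetical jobs: the mail job fails twice before it succeeds.
calls = []
failures_left = {"mail": 2}


def monitoring_job() -> None:
    calls.append("monitor")


def mail_job() -> None:
    calls.append("mail")
    if failures_left["mail"] > 0:
        failures_left["mail"] -= 1
        raise RuntimeError("mail server not reachable")


# The sleep function is replaced so that the example does not wait.
tries = run_with_retry([monitoring_job, mail_job], sleep=lambda seconds: None)
print(tries)  # 3
```

Note that the health status check is repeated on each attempt because execution resumes from the first job inside the Retry Instruction.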


The MonitoringJob makes use of arguments that are explained in the section Using the Job Wizard for the MonitoringJob.

The MailJob is explained in the JS7 - JITL MailJob article.

Using the Job Wizard for the MonitoringJob

You can use the job wizard like this:

[Image: job property editor]


Explanation:

  • Add an empty job from the instruction panel.
  • Specify a name and a label for the job.
  • Select an Agent.

In the next step, invoke the job wizard that is located in the upper right corner of the job property editor. The wizard brings up the following popup window:

[Image: job wizard popup window with the list of job templates]


Explanation:

  • From the list of available job templates select the MonitoringJob.

Then hit the "Next" button to make the job wizard display available arguments:

[Image: job wizard showing the available arguments]


Explanation:

  • controller_id: Optionally specifies the identification of the Controller to be checked. By default the current Controller is used.
  • monitor_report_dir: Specifies the directory in which the job will store health status report files (.json). The directory has to exist prior to running the job and has to be accessible to the Agent that runs the job.
    • An absolute or a relative path can be specified.
    • An expression can be used. The example makes use of env('JS7_AGENT_DATA') ++ '/monitor' which translates to use of the JS7_AGENT_DATA environment variable created by the Agent's start script, see JS7 - Job Environment Variables. This environment variable can for example evaluate to /var/sos-berlin.com/js7/agent. The ++ operator indicates concatenation and is followed by the name of a sub-directory. In this example the report directory will be /var/sos-berlin.com/js7/agent/monitor.
  • monitor_report_max_files: The number of report files created will be limited to this value. Older report files will be removed when this value is exceeded.
  • from: Specifies the e-mail address that is used to send mail for notices and alerts. The argument is used by the job to create the subject and body return variables for use with a later MailJob.
  • max_failed_orders: The maximum number of failed orders that are considered acceptable for a health status check. If this number is exceeded then the result return variable will carry a non-zero value indicating a failed health status check.
  • Select the check box provided with each argument if you want this argument to be added to the arguments of the MonitoringJob template.

When hitting the Submit button the wizard adds the required arguments to the job which should look like this:

[Image: job arguments added by the wizard]
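The effect of the monitor_report_dir and monitor_report_max_files arguments can be sketched in Python. This is not the job's implementation: the functions resolve_report_dir and prune_reports are hypothetical, and the sketch assumes that report files match monitor.*.json and that sorting by file name puts the oldest reports first.

```python
import tempfile
from pathlib import Path


def resolve_report_dir(env: dict) -> Path:
    """Mirror the expression env('JS7_AGENT_DATA') ++ '/monitor'."""
    return Path(env["JS7_AGENT_DATA"] + "/monitor")


def prune_reports(report_dir: Path, max_files: int) -> list:
    """Remove the oldest report files so that at most max_files remain.

    Assumption: the timestamp in the file name sorts chronologically.
    Returns the list of removed paths.
    """
    reports = sorted(report_dir.glob("monitor.*.json"), key=lambda p: p.name)
    excess = reports[:max(0, len(reports) - max_files)]
    for path in excess:
        path.unlink()
    return excess


with tempfile.TemporaryDirectory() as agent_data:
    report_dir = resolve_report_dir({"JS7_AGENT_DATA": agent_data})
    report_dir.mkdir()  # the directory has to exist before reports are written
    for day in range(11, 16):
        (report_dir / f"monitor.2022-08-{day}.17-35-36.5.json").write_text("{}")

    removed = [p.name for p in prune_reports(report_dir, max_files=3)]
    remaining = sorted(p.name for p in report_dir.iterdir())

print(removed)
print(remaining)
```

In this sketch the two oldest of five report files are removed when the limit is 3.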

Using the Job Wizard for the MailJob

Instructions are available from the JS7 - JITL MailJob article.

Use of JS7 - Job Resources to specify mail parameterization is encouraged.

Health Status Check

The health status check performed by the MonitoringJob makes use of the JS7 - REST Web Service API:

  • to retrieve status information about JOC Cockpit, Controller and Agents,
  • to write this information to a report file,
  • to evaluate if the information indicates a healthy JS7 environment.

Report File

Find a sample report file for download that indicates an alert: monitor.2022-08-17.09-16-44.9Z.alert.json

Sample report file:
{
  "controllerStatus" : {
    "active" : {
      "id" : 3,
      "surveyDate" : "2022-08-17T08:57:43.000+00:00",
      "controllerId" : "testsuite",
      "title" : "SECONDARY CONTROLLER",
      "host" : "controller-2-0-secondary",
      "url" : "https://controller-2-0-secondary:4443",
      "clusterUrl" : "https://controller-2-0-secondary:4443",
      "role" : "BACKUP",
      "isCoupled" : false,
      "startedAt" : "2022-08-16T18:09:27.000+00:00",
      "version" : "2.5.0-SNAPSHOT+fd0eb39",
      "javaVersion" : "17.0.4+8-alpine-r0",
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM"
    },
    "volatileStatus" : {
      "id" : 2,
      "surveyDate" : "2022-08-17T09:16:45.064+00:00",
      "controllerId" : "testsuite",
      "title" : "PRIMARY CONTROLLER",
      "host" : "controller-2-0-primary",
      "url" : "https://controller-2-0-primary:4443",
      "clusterUrl" : "https://controller-2-0-primary:4443",
      "role" : "PRIMARY",
      "isCoupled" : true,
      "startedAt" : "2022-08-16T18:09:26.004+00:00",
      "version" : "2.5.0-SNAPSHOT+fd0eb39",
      "javaVersion" : "17.0.4+8-alpine-r0",
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM",
      "componentState" : {
        "severity" : 0,
        "_text" : "operational"
      },
      "connectionState" : {
        "severity" : 0,
        "_text" : "established"
      },
      "clusterNodeState" : {
        "severity" : 0,
        "_text" : "active"
      }
    },
    "permanentStatus" : {
      "id" : 2,
      "surveyDate" : "2022-08-16T18:12:47.169+00:00",
      "controllerId" : "testsuite",
      "title" : "PRIMARY CONTROLLER",
      "host" : "controller-2-0-primary",
      "url" : "https://controller-2-0-primary:4443",
      "clusterUrl" : "https://controller-2-0-primary:4443",
      "role" : "PRIMARY",
      "startedAt" : "2022-08-16T18:09:26.004+00:00",
      "version" : "2.5.0-SNAPSHOT+fd0eb39",
      "javaVersion" : "17.0.4+8-alpine-r0",
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      }
    }
  },
  "jocStatus" : {
    "active" : {
      "id" : 2,
      "memberId" : "joc-2-0-primary:97c88ccc3975703ebd0b7277d394ec8768f88b31775e8df038572d2547c240a0",
      "title" : "PRIMARY JOC COCKPIT",
      "current" : true,
      "host" : "joc-2-0-primary",
      "url" : "https://joc-2-0-primary:4443",
      "startedAt" : "2022-08-16T18:10:27.000+00:00",
      "version" : "2.5.0-SNAPSHOT",
      "connectionState" : {
        "severity" : 0,
        "_text" : "established"
      },
      "componentState" : {
        "severity" : 0,
        "_text" : "operational"
      },
      "clusterNodeState" : {
        "severity" : 0,
        "_text" : "active"
      },
      "controllerConnectionStates" : [ {
        "role" : "PRIMARY",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      }, {
        "role" : "BACKUP",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      } ],
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM",
      "lastHeartbeat" : "2022-08-17T09:16:37.000+00:00"
    },
    "passive" : [ {
      "id" : 1,
      "memberId" : "joc-2-0-secondary:97c88ccc3975703ebd0b7277d394ec8768f88b31775e8df038572d2547c240a0",
      "title" : "SECONDARY JOC COCKPIT",
      "current" : false,
      "host" : "joc-2-0-secondary",
      "url" : "https://joc-2-0-secondary.sos:7543",
      "startedAt" : "2022-08-16T18:10:27.000+00:00",
      "version" : "2.5.0-SNAPSHOT",
      "connectionState" : {
        "severity" : 0,
        "_text" : "established"
      },
      "componentState" : {
        "severity" : 0,
        "_text" : "operational"
      },
      "clusterNodeState" : {
        "severity" : 1,
        "_text" : "inactive"
      },
      "controllerConnectionStates" : [ {
        "role" : "PRIMARY",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      }, {
        "role" : "BACKUP",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      } ],
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM",
      "lastHeartbeat" : "2022-08-17T09:16:37.000+00:00"
    } ]
  },
  "agentStatus" : [ {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_001",
    "agentName" : "primaryAgent",
    "url" : "https://agent-2-0-primary:4443",
    "version" : "2.5.0-SNAPSHOT",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 1,
    "isClusterWatcher" : true,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_002",
    "agentName" : "secondaryAgent",
    "url" : "https://agent-2-0-secondary:4443",
    "version" : "2.5.0-SNAPSHOT",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_004",
    "agentName" : "wintestAgent",
    "url" : "http://192.11.0.146:4245",
    "version" : "2.4.0",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_005",
    "agentName" : "apmaccsAgent",
    "url" : "http://192.11.3.3:4449",
    "state" : {
      "severity" : 2,
      "_text" : "UNKNOWN"
    },
    "healthState" : {
      "severity" : 2,
      "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : true
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_006",
    "agentName" : "apmacwinAgent",
    "url" : "http://192.11.2.2:4245",
    "state" : {
      "severity" : 2,
      "_text" : "UNKNOWN"
    },
    "healthState" : {
      "severity" : 2,
      "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : true
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_101",
    "agentName" : "agent17",
    "url" : "http://centostest_primary.sos:7775",
    "version" : "2.4.0-beta.20220714",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_009",
    "agentName" : "oracleAgent",
    "url" : "http://minos.sos:4445",
    "version" : "2.4.0-beta.20220714",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ {
      "isDirector" : "PRIMARY_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "director_primary_001",
      "url" : "https://diragent-2-0-primary:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    }, {
      "isDirector" : "NO_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "subagent_primary_001",
      "url" : "https://subagent-2-0-primary:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    }, {
      "isDirector" : "NO_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "subagent_secondary_001",
      "url" : "https://subagent-2-0-secondary:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    }, {
      "isDirector" : "NO_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "subagent_third_001",
      "url" : "https://subagent-2-0-third:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    } ],
    "controllerId" : "testsuite",
    "agentId" : "agent_cluster_001",
    "agentName" : "AgentCluster001",
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_014",
    "agentName" : "winutf8Agent",
    "url" : "http://192.11.0.146:4445",
    "version" : "2.4.0",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  } ],
  "orderSnapshot" : {
    "pending" : 0,
    "scheduled" : 1262,
    "inProgress" : 0,
    "running" : 1,
    "prompting" : 0,
    "suspended" : 0,
    "waiting" : 770,
    "blocked" : 0,
    "failed" : 0,
    "terminated" : 1
  },
  "orderSummary" : {
    "failed" : 0
  }
}

Health Status Checks

The MonitoringJob performs the following health status checks:

  • Controller
    • In volatileStatus the element connectionState includes severity with a value 0.
    • In volatileStatus the element componentState includes severity with a value 0.
    • If role is present and does not carry the value STANDALONE in volatileStatus then the element clusterNodeState has to have severity with a value 0.
    • If role is present and does not contain the value STANDALONE in volatileStatus then the element isCoupled has to have the value true.
  • Agents
    • In agentStatus the healthState is present and has severity with a value 0.
    • In agentStatus the state is present and has severity with a value 0.
    • For each enabled subagent the state has severity with a value 0.
  • JOC Cockpit
    • The connectionState has severity with a value 0.
    • The componentState has severity with a value 0.
    • If clusterNodeState is present it has severity with a value 0.
    • If controllerConnectionStates is present the state of each entry has severity with a value 0.

The number of failed checks is reported by the result return variable, see section Return Variables.
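The rules listed above can be applied to a report file with a short Python sketch. This is an interpretation under stated assumptions, not the MonitoringJob's implementation: the function names are hypothetical, only the active JOC Cockpit instance is checked, and the Agent rules are applied literally, including to disabled Agents. The actual job may count problems differently.

```python
def has_severity_zero(node: dict, key: str) -> bool:
    """True if node[key] is present and carries severity 0."""
    state = node.get(key)
    return isinstance(state, dict) and state.get("severity") == 0


def find_problems(report: dict) -> list:
    """Return a label for each failed check (hypothetical helper)."""
    problems = []

    def check(ok: bool, label: str) -> None:
        if not ok:
            problems.append(label)

    # Controller checks are based on volatileStatus.
    ctrl = report.get("controllerStatus", {}).get("volatileStatus", {})
    check(has_severity_zero(ctrl, "connectionState"), "controller connectionState")
    check(has_severity_zero(ctrl, "componentState"), "controller componentState")
    role = ctrl.get("role")
    if role is not None and role != "STANDALONE":
        check(has_severity_zero(ctrl, "clusterNodeState"), "controller clusterNodeState")
        check(ctrl.get("isCoupled") is True, "controller isCoupled")

    # Agent checks, including enabled subagents.
    for agent in report.get("agentStatus", []):
        agent_id = agent.get("agentId", "?")
        check(has_severity_zero(agent, "healthState"), f"agent {agent_id} healthState")
        check(has_severity_zero(agent, "state"), f"agent {agent_id} state")
        for sub in agent.get("subagents", []):
            if not sub.get("disabled", False):
                check(has_severity_zero(sub, "state"),
                      f"subagent {sub.get('subagentId', '?')} state")

    # JOC Cockpit checks (assumption: active instance only).
    joc = report.get("jocStatus", {}).get("active", {})
    check(has_severity_zero(joc, "connectionState"), "joc connectionState")
    check(has_severity_zero(joc, "componentState"), "joc componentState")
    if "clusterNodeState" in joc:
        check(has_severity_zero(joc, "clusterNodeState"), "joc clusterNodeState")
    for conn in joc.get("controllerConnectionStates", []):
        check(has_severity_zero(conn, "state"),
              f"joc connection to {conn.get('role', '?')} controller")
    return problems


# Reduced sample report using the element names of the report file.
sample = {
    "controllerStatus": {"volatileStatus": {
        "role": "PRIMARY", "isCoupled": True,
        "connectionState": {"severity": 0},
        "componentState": {"severity": 0},
        "clusterNodeState": {"severity": 0}}},
    "jocStatus": {"active": {
        "connectionState": {"severity": 0},
        "componentState": {"severity": 0},
        "controllerConnectionStates": [
            {"role": "PRIMARY", "state": {"severity": 0}}]}},
    "agentStatus": [
        {"agentId": "agent_001", "subagents": [],
         "state": {"severity": 0}, "healthState": {"severity": 0}},
        {"agentId": "agent_005", "subagents": [],
         "state": {"severity": 2}, "healthState": {"severity": 2}},
    ],
}

problems = find_problems(sample)
print(len(problems))  # 2
print(problems)
```

In this sketch the number of entries in the returned list plays the role of the result return variable.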

Documentation

The Job Documentation including the full list of arguments can be found under: https://www.sos-berlin.com/doc/JS7-JITL/MonitoringJob.xml

Authentication

The Job makes use of the JS7 - REST Web Service API that is available from JOC Cockpit. 

  • The job is executed with an Agent and requires a network connection to JOC Cockpit.
  • The job has to authenticate with JOC Cockpit, for the related configuration see JS7 - JITL Common Authentication.

Arguments

The MonitoringJob class accepts the following arguments:

| Name | Required | Default Value | Purpose | Example |
|---|---|---|---|---|
| controller_id | no | | Optionally specifies the identification of the Controller to be checked. By default the current Controller is used. | controller_prod |
| monitor_report_dir | yes | | Specifies the directory in which the job will store health status report files (.json). This directory has to exist prior to running the job and has to be accessible to the Agent that runs the job. An absolute or relative path can be specified. An expression can be used, for example env('JS7_AGENT_DATA') ++ '/monitor'. | env('JS7_AGENT_DATA') ++ '/monitor'; /var/sos-berlin.com/js7/agent/monitor; C:\ProgramData\sos-berlin.com\js7\agent\monitor |
| monitor_report_max_files | yes | | The number of report files created will be limited to this value. Older report files will be removed when this value is exceeded. | 25 |
| from | yes | | Specifies the e-mail address that is used to send mail for notices and alerts. The argument is used by the job to create the subject and body return variables. | js7@example.com |
| max_failed_orders | no | | The maximum number of failed orders that are considered acceptable for a health status check. If this number is exceeded then the result return variable will carry a non-zero value indicating a failed health status check. By default the number of failed orders is not considered for successful/unsuccessful health status checks. | 3 |

Return Variables

The MonitoringJob class returns the following variables for use by subsequent jobs:

| Name | Data Type | Purpose | Example |
|---|---|---|---|
| monitor_report_date | String | The date and time for which the health status check has been performed. The date format is yyyy-MM-dd.HH-mm-ss.K, for example 2022-07-31.23-12-59.Z indicating UTC time. | 2022-07-31.23-12-59.Z |
| monitor_report_file | String | The path to the report file created for the health status check. | /var/sos-berlin.com/js7/agent/monitor/monitor.2022-08-15.17-35-36.5.json |
| subject | String | The subject of an e-mail for use with a later MailJob. | JS7 Monitor: Notice from: js7@sos-berlin.com at: 2022-08-15.17-35-36.5 |
| body | String | The body of an e-mail for use with a later MailJob. By default the value is the same as for the subject. | JS7 Monitor: Notice from: js7@sos-berlin.com at: 2022-08-15.17-35-36.5 |
| result | Number | The number of problems identified during the health status check. A value 0 indicates absence of problems, other values indicate existence of problems. | 0 |

Further Resources