Introduction
This article describes the individual configuration parameters and provides examples of their use with System Monitors such as op5 and Zabbix, and with the mail and JMS interfaces.
Send notifications
Notify on error
SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_error is configured.
SystemMonitorNotification / Notification / NotificationObjects / JobChain
Error messages
- will be sent:
  - when, at the time of the sos / notification / SystemNotifier run, an order is in a job chain state (step) that has ended with an error
- will not be sent:
  - when an error has occurred in a job chain state (step) after the last run of sos / notification / SystemNotifier, but at the time of the current sos / notification / SystemNotifier run the order is in another/next job chain state (step)
    - this kind of error is ignored because the order has continued to run
  - when an error has reoccurred in the same job chain state (step) for which a notification has already been sent
    - this order state is considered as notified and no new notification will be sent
    - e.g. a job chain state (step) has been restarted manually or by a setback
    - this behaviour has been changed with JITL-534, providing support for repeatedly failed executions
  - when the first step of the specific order has been removed from the notification tables by sos / notification / CleanupNotifications
    - this behaviour has been changed with JITL-516, providing support for long running orders
  - when a notification maximum has been reached
  - when a job chain state (step) has been configured as excluded
  - when the @step_from or @step_to settings have been configured and the job chain state (step) is out of the configured range
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job chain state (step) is out of the configured range
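The restrictions listed above correspond to attributes of the JobChain element. The following is a minimal sketch, assuming the attribute names used in the JobChain examples later in this article (notifications, step_from, step_to, return_code_from, return_code_to, excluded_steps). The job chain names and all values are hypothetical, and reading the notifications attribute as the notification maximum is an assumption.

```xml
<NotificationObjects>
    <!-- assumption: "notifications" is the notification maximum for the same error -->
    <JobChain name="test/chain_a" notifications="2" />
    <!-- steps outside the configured range 200 to 500 do not trigger notifications -->
    <JobChain name="test/chain_b" step_from="200" step_to="500" />
    <!-- return codes outside the configured range 5 to 10 do not trigger notifications -->
    <JobChain name="test/chain_c" return_code_from="5" return_code_to="10" />
    <!-- steps 200 and 300 are configured as excluded -->
    <JobChain name="test/chain_d" excluded_steps="200;300" />
</NotificationObjects>
```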
Recovery messages
- will be automatically sent using the same service name and message as the relevant error message:
  - when the error message of a job chain state (step) has already been sent and, at the time of the current sos / notification / SystemNotifier run, the order is in another/next state (step)
    - e.g. the rerun of the failed state (step) has been successful and the order has been moved to the next job chain state (step)
    - Note: use the ${SERVICE_STATUS} and ${SERVICE_MESSAGE_PREFIX} variables to differentiate between recovery and error messages
- will not be sent:
  - when a job chain state (step) has recovered after the last run of sos / notification / SystemNotifier but, at the time of the current sos / notification / SystemNotifier run, a new error has occurred in the other/next step
SystemMonitorNotification / Notification / NotificationObjects / Job
Error messages
- will be sent:
  - when a job chain state (step) or standalone job (JobScheduler versions from 1.12) ends with an error
- will not be sent:
  - when a notification maximum has been reached
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job is out of the configured range
Notify on success
SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_success is configured.
SystemMonitorNotification / Notification / NotificationObjects / JobChain
Success messages
- will be sent:
  - when an order is completed and the last job chain state (step) has no error
- will not be sent:
  - when a notification maximum has been reached
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job chain state (step) is out of the configured range
SystemMonitorNotification / Notification / NotificationObjects / Job
Success messages
- will be sent:
  - when a job chain state (step) or standalone job (JobScheduler versions from 1.12) ends without error
- will not be sent:
  - when a notification maximum has been reached
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job is out of the configured range
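As a rough illustration of how the two attributes that enable error and success notifications might sit together, the following sketch configures both service names on one NotificationMonitor. The service names and the job chain name are taken from examples elsewhere in this article; the echo command is only a placeholder for a real notification client.

```xml
<Notification>
    <!-- service_name_on_error enables error/recovery messages, service_name_on_success enables success messages -->
    <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors"
                         service_name_on_success="JobScheduler Monitoring Success">
        <!-- placeholder command: prints the message instead of calling a notification client -->
        <NotificationCommand><![CDATA[echo ${SERVICE_NAME}: ${SERVICE_MESSAGE_PREFIX} job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID})]]></NotificationCommand>
    </NotificationMonitor>
    <NotificationObjects>
        <JobChain name="test/my_jobchain" />
    </NotificationObjects>
</Notification>
```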
Configuration Editor
We recommend using the XML Editor to generate monitoring configuration objects. This editor automatically uses an XSD Schema to generate configuration suggestions and to validate configurations, which is intended to significantly reduce the time required to develop and test a configuration.
XSD Schema locations
- <scheduler_data>/config/notification/SystemMonitorNotification_v1.0.xsd
- https://www.sos-berlin.com/schema/jobscheduler/SystemMonitorNotification_v1.0.xsd
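When a generic XML editor is used instead, a configuration file can point to the schema itself. This is a sketch only: it assumes that the schema defines no target namespace (the examples in this article use un-prefixed, un-namespaced elements) and that the XSD file is located in the same directory as the configuration file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- assumption: schema without target namespace, XSD stored next to this file -->
<SystemMonitorNotification
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="SystemMonitorNotification_v1.0.xsd"
    system_id="op5">
    <!-- Notification and Timer elements go here -->
</SystemMonitorNotification>
```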
Configuration
JobScheduler
Activation of Monitoring Interface
- JobScheduler version 1.9.x, 1.10.x
- JobScheduler version 1.11.x, 1.12.x
- Set param sos.use_notification true (config/scheduler.xml)
- see JobScheduler - Job Chains
Note: The SystemMonitorNotification_<MonitorSystem>.xml file(s) (see below) must be configured before activation.
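The parameter setting could look like the following fragment of config/scheduler.xml. The surrounding spooler / config / params elements reflect the usual layout of that file and are an assumption here; other content of the file is omitted.

```xml
<spooler>
    <config>
        <params>
            <!-- activates the Monitoring Interface -->
            <param name="sos.use_notification" value="true" />
        </params>
    </config>
</spooler>
```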
SystemMonitorNotification files
Location: <scheduler_data>/config/notification
File | Description |
---|---|
SystemMonitorNotification_v1.0.xsd | The XML Schema file defines which values are allowed in your XML files for the JobScheduler monitoring. That means that to configure the JobScheduler objects you want to monitor and the System Monitor you just have to modify your |
SystemMonitorNotification_<MonitorSystem>.xml | Configuration file for each System Monitor. |
| Configuration file for all System Monitors. This file is optional and contains the definitions of the |
SystemMonitorNotification Elements
The configuration element descriptions are organized into the following major categories:
Element | Element description | Description |
---|---|---|
SystemMonitorNotification | Top Level Element | Configuration for notifications to be sent to a system monitor. |
Notification | Required, multiple use allowed inside the SystemMonitorNotification element | Specifies a system monitor notification that includes a command line invocation and the JobScheduler objects. |
Timer | Optional or multiple use allowed inside the SystemMonitorNotification element | Performance measurement definition. |
SystemMonitorNotification
...
```xml
<SystemMonitorNotification system_id="op5">
    ...
```
SystemMonitorNotification / Notification
Notification supports the following attributes:
...
Element | Element description | Description |
---|---|---|
NotificationMonitor | Required, only once inside the Notification element | Specifies the System Monitor interface that is being used for messages: either by a Plug-in Interface or by command line invocation |
NotificationObjects | Required, only once inside the Notification element | Specifies the Job Chain and the Timer definitions |
SystemMonitorNotification / Notification / NotificationMonitor
The JobScheduler Interface Monitor can be used to monitor the messages for the 3 use cases:
...
Element | Element description | Description |
---|---|---|
NotificationInterface | Optional or only once inside the NotificationMonitor element | NSCA plug-in Interface to be executed for System Monitor notification |
NotificationCommand | Optional or only once inside the NotificationMonitor element | Command line to be executed for System Monitor notification |
NotificationMail | Optional or only once inside the NotificationMonitor element | Mail interface to be executed for System Monitor notification |
NotificationJMS | Optional or only once inside the NotificationMonitor element | JMS interface to be executed for System Monitor notification |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationInterface
NSCA plug-in Interface to be executed for System Monitor notification.
...
Attribute | Usage | Description |
---|---|---|
monitor_host | Required | This setting specifies the host name or IP address of the System Monitor host. |
monitor_port | Required | This setting specifies the TCP port that the System Monitor listens on. |
monitor_password | Optional | This setting specifies the password |
monitor_connection_timeout | Optional | This setting specifies the connection timeout in ms. Default: |
monitor_response_timeout | Optional | This setting specifies the response timeout in ms. |
monitor_encryption | Optional | This setting specifies that the communication with the System Monitor is encrypted. By default no encryption is used. |
service_host | Required | This setting specifies the name of the host that executes the passive check. The name must match the corresponding setting in the System Monitor. |
plugin | Optional | Default: |
```xml
...
<NotificationInterface monitor_host="monitor_host"
                       monitor_port="5667"
                       monitor_encryption="XOR"
                       service_host="service_host"><![CDATA[
    scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step =${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}
]]></NotificationInterface>
...
```
Note: If you are using Opsview as the monitoring tool, the plugin used in … Instead, you should use the XML element …
SystemMonitorNotification / Notification / NotificationMonitor / NotificationCommand
Command line to be executed for System Monitor notification.
...
Attribute | Usage | Description |
---|---|---|
plugin | Optional | Default: |
```xml
...
<NotificationCommand><![CDATA[
    echo scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step =${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT} > D://errors.txt
]]></NotificationCommand>
...
```
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail
Jira issue: JS-1388
...
Element | Element description | Description |
---|---|---|
From | Optional or only once inside of the NotificationMail element | E-mail address of the account that sends e-mail. |
To | Optional or only once inside of the NotificationMail element | E-mail address of the recipient(s) of a notification e-mail. |
CC | Optional or only once inside of the NotificationMail element | E-mail address of the recipient(s) of a carbon copy notification e-mail. |
BCC | Optional or only once inside of the NotificationMail element | E-mail address of recipient(s) of a blind carbon copy notification e-mail. |
Subject | Required, only once inside of the NotificationMail element | Subject of an e-mail notification. |
Body | Required, only once inside of the NotificationMail element | Body of an e-mail notification. |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / From
E-mail address of the account that sends the e-mail.
The mail notification interface uses the value of the log_mail_from entry (configuration file config/factory.ini) when this element is not set.
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / To
E-mail address of the recipient(s) of a notification e-mail.
...
- is not set
  - log_mail_to will be used
- is set
  - log_mail_to, log_mail_cc, log_mail_bcc are not used
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / CC
E-mail address of the recipient(s) of a carbon copy notification e-mail.
...
- is not set
  - log_mail_cc will be used (if the NotificationMail/To element is not defined - see above)
- is set
  - log_mail_cc, log_mail_bcc are not used
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / BCC
E-mail address of recipient(s) of a blind carbon copy notification e-mail.
...
- is not set
  - log_mail_bcc will be used (if the NotificationMail/To or NotificationMail/CC elements are not defined - see above)
- is set
  - log_mail_bcc is not used
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / Subject
Subject of an e-mail notification.
The Subject can contain the JobScheduler Monitoring Interface variables.

```xml
...
<Subject><![CDATA[JobScheduler notification: ${SERVICE_MESSAGE_PREFIX}, job executed with errors: ${MON_N_JOB_NAME}]]></Subject>
...
```
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / Body
Body of an e-mail notification.
The Body can contain the JobScheduler Monitoring Interface variables.

```xml
...
<Body><![CDATA[<style type="text/css">
    .tg {border-collapse:collapse;border-spacing:0;border-color:#bbb;}
    .tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#594F4F;background-color:#E0FFEB;}
    .tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#493F3F;background-color:#9DE0AD;}
</style>
<table class="tg">
    <tr>
        <th colspan="4">Error</th>
    </tr>
    <tr>
        <td>Code:</td><td>${MON_N_ERROR_CODE}</td>
        <td>Message</td><td>${MON_N_ERROR_TEXT}</td>
    </tr>
    <tr>
        <th colspan="4">JobScheduler</th>
    </tr>
    <tr>
        <td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
        <td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
    </tr>
    <tr>
        <th colspan="4">Order</th>
    </tr>
    <tr>
        <td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
        <td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
    </tr>
    <tr>
        <td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
        <td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
    </tr>
    <tr>
        <td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
        <td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
    </tr>
    <tr>
        <th colspan="4">Task History</th>
    </tr>
    <tr>
        <td>Task ID</td><td>${MON_N_TASK_ID}</td>
        <td>Time elapsed</td><td>${MON_N_TASK_TIME_ELAPSED}</td>
    </tr>
    <tr>
        <td>Start Time</td><td>${MON_N_TASK_START_TIME}</td>
        <td>End Time</td><td>${MON_N_TASK_END_TIME}</td>
    </tr>
</table>]]></Body>
...
```
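Putting the child elements of this section together, a complete NotificationMail element might look like the sketch below. The e-mail addresses are hypothetical, the child element order follows the element table above, and writing the addresses as CDATA text is an assumption. Any attributes of NotificationMail itself are not shown.

```xml
<NotificationMail>
    <!-- hypothetical addresses; if From/To are omitted the log_mail_* entries are used -->
    <From><![CDATA[jobscheduler@example.com]]></From>
    <To><![CDATA[operations@example.com]]></To>
    <CC><![CDATA[teamlead@example.com]]></CC>
    <Subject><![CDATA[JobScheduler notification: ${SERVICE_MESSAGE_PREFIX}, job: ${MON_N_JOB_NAME}]]></Subject>
    <Body><![CDATA[job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), error=${MON_N_ERROR_TEXT}]]></Body>
</NotificationMail>
```

Because To and CC are set in this sketch, the log_mail_to, log_mail_cc and log_mail_bcc entries would not be used, as described above.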
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS
Jira issue: JITL-280
...
Element | Element description | Description |
---|---|---|
Message | Required, only once inside of NotificationJMS element | Body of a JMS notification |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionFactory
Specifies use of a JMS ConnectionFactory implementation.
...
Element | Element description | Description |
---|---|---|
ConstructorArguments | Optional or only once inside of ConnectionFactory element |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionFactory / ConstructorArguments
The following elements can be nested inside a ConstructorArguments element:
...
```xml
...
<ConnectionFactory java_class="org.apache.activemq.ActiveMQConnectionFactory">
    <ConstructorArguments>
        <Argument type="java.lang.String"><![CDATA[my_user_name]]></Argument>
        <Argument type="java.lang.String"><![CDATA[my_password]]></Argument>
        <Argument type="java.lang.String"><![CDATA[tcp://localhost:61616]]></Argument>
    </ConstructorArguments>
</ConnectionFactory>
...
```
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionFactory / ConstructorArguments / Argument
Argument supports the following attributes:
...
```xml
...
<Argument type="java.lang.String"><![CDATA[tcp://localhost:61616]]></Argument>
...
```
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionJNDI
Specifies use of a JNDI properties file to create a JNDI InitialContextFactory.
...
```
java.naming.factory.initial=org.apache.activemq.jndi.ActiveMQInitialContextFactory
java.naming.provider.url=tcp://localhost:61616
```
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / Message
Body of a JMS notification.
SystemMonitorNotification / Notification / NotificationObjects
One of the following elements must be nested inside a NotificationObjects element:
...
```xml
<SystemMonitorNotification system_id="op5">
    <Notification>
        <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
            ...
        </NotificationMonitor>
        <NotificationObjects>
            <!-- Send the job error, occurring in the "test/my_job" order job, to the "JobScheduler Monitoring Errors" service. -->
            <Job name="test/my_job" />
            <!-- Send the job chain error, occurring in the "test/my_jobchain" job chain, to the "JobScheduler Monitoring Errors" service. -->
            <JobChain name="test/my_jobchain" />
        </NotificationObjects>
    </Notification>
</SystemMonitorNotification>
```
SystemMonitorNotification / Notification / NotificationObjects / Job
This element specifies the order-controlled or standalone jobs for which notifications are being sent to a system monitor.
...
```xml
...
<Job notifications="2" name="test/my_job"/>
...
<Job scheduler_id="scheduler_4444" />
...
<Job scheduler_id="scheduler_4444" name="test/my_.*" />
...
<Job name="test/my_job" return_code_from="5"/>
...
<Job name="test/my_job" return_code_to="10"/>
...
<Job name="test/my_job" return_code_from="5" return_code_to="5"/>
...
```
SystemMonitorNotification / Notification / NotificationObjects / JobChain
Specifies the job chains for which notifications are being sent to a system monitor.
The element can be used repeatedly to specify a number of job chains.
...
```xml
...
<JobChain notifications="2" name="test/my_jobchain"/>
...
<JobChain scheduler_id="scheduler_4444" />
...
<JobChain scheduler_id="scheduler_4444" name="test/my_.*" />
...
<JobChain name="test/my_jobchain" return_code_from="5"/>
...
<JobChain name="test/my_jobchain" return_code_to="10"/>
...
<JobChain name="test/my_jobchain" return_code_from="5" return_code_to="5"/>
...
<JobChain name="test/my_jobchain" step_from="200"/>
...
<JobChain name="test/my_jobchain" step_to="500"/>
...
<JobChain name="test/my_jobchain" step_from="300" step_to="300"/>
...
<JobChain name="test/my_jobchain" excluded_steps="200;300"/>
...
<JobChain name="test/my_jobchain">
    <NotifyRepeatedError />
</JobChain>
...
```
SystemMonitorNotification / Notification / NotificationObjects / JobChain / NotifyRepeatedError
...
```xml
...
<JobChain name="test/my_jobchain">
    <NotifyRepeatedError>
        <NotifyByIntervention />
    </NotifyRepeatedError>
</JobChain>
...
<JobChain name="test/my_jobchain">
    <NotifyRepeatedError>
        <NotifyByPeriod period="5h 30m" />
    </NotifyRepeatedError>
</JobChain>
...
<JobChain name="test/my_jobchain">
    <NotifyRepeatedError>
        <NotifyByIntervention />
        <NotifyByPeriod period="2h" />
    </NotifyRepeatedError>
</JobChain>
...
```
SystemMonitorNotification / Notification / NotificationObjects / JobChain / NotifyRepeatedError / NotifyByIntervention
Send notifications for errors that occur due to repeatedly failed executions if the restart was caused by manual intervention.
SystemMonitorNotification / Notification / NotificationObjects / JobChain / NotifyRepeatedError / NotifyByPeriod
Send notifications for errors that occur due to repeatedly failed executions if a configurable period of time is exceeded.
...
Attribute | Usage | Description |
---|---|---|
period | Required | The period between notifications is calculated from the time of the last failed execution for which a notification has been sent and the time of the current failed execution. Possible values: |
SystemMonitorNotification / Notification / NotificationObjects / TimerRef
TimerRef supports the following attributes:
...
```xml
<SystemMonitorNotification system_id="op5">
    <Notification>
        <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
            ...
        </NotificationMonitor>
        <NotificationObjects>
            <!-- Send the job chain error, occurring in the "test/my_jobchain" job chain, to the "JobScheduler Monitoring Errors" service. -->
            <JobChain name="test/my_jobchain" />
        </NotificationObjects>
    </Notification>
    <Notification>
        <NotificationMonitor service_name_on_error="JobScheduler Monitoring Performance">
            ...
        </NotificationMonitor>
        <NotificationObjects>
            <!-- Send the performance check error, occurring in the "test/my_jobchain" job chain, to the "JobScheduler Monitoring Performance" service.
                 The performance check error is not sent when the "test/my_jobchain" job chain has ended with an error (default notify_on_error = false). -->
            <TimerRef ref="my_timer" />
        </NotificationObjects>
    </Notification>
    <Timer name="my_timer">
        <TimerJobChain name="test/my_jobchain" />
    </Timer>
</SystemMonitorNotification>
```
SystemMonitorNotification / Notification / NotificationObjects / MasterMessage
Jira issue: JS-1837
...
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expressions can be used. |
| Optional Integer | Specifies the number of times the same notification is transferred to a System Monitor. Default: |
SystemMonitorNotification / Notification / NotificationObjects / TaskWarning
Jira issue: JS-1837
...
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expressions can be used. |
| Optional Integer | Specifies the number of times the same notification is transferred to a System Monitor. Default: |
SystemMonitorNotification / Notification / NotificationObjects / TaskIfLongerThan
Jira issue: JITL-522
...
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expressions can be used. |
| Optional Integer | Specifies the number of times the same notification is transferred to a System Monitor. Default: |
SystemMonitorNotification / Notification / NotificationObjects / TaskIfShorterThan
Jira issue: JITL-522
...
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expressions can be used. |
| Optional Integer | Specifies the number of times the same notification is transferred to a System Monitor. Default: |
SystemMonitorNotification / Timer
The following elements must be nested inside a Timer element:
...
```xml
...
<Timer name="my_timer">
...
```
SystemMonitorNotification / Timer / TimerJob
Jira issue: JITL-401
...
```xml
...
<TimerJob scheduler_id="scheduler_4444" />
...
<TimerJob scheduler_id="scheduler_4444" name="test/my_.*" />
...
<TimerJob name="test/my_job"/>
...
```
SystemMonitorNotification / Timer / TimerJobChain
TimerJobChain supports the following attributes:
...
```xml
...
<TimerJobChain scheduler_id="scheduler_4444" />
...
<TimerJobChain scheduler_id="scheduler_4444" name="test/my_.*" />
...
<TimerJobChain name="test/my_jobchain" step_from="200"/>
...
<TimerJobChain name="test/my_jobchain" step_to="500"/>
...
<TimerJobChain name="test/my_jobchain" step_from="300" step_to="300"/>
...
```
SystemMonitorNotification / Timer / Minimum
The following elements must be nested inside a Minimum element:
...
```xml
...
<Timer name="my_timer">
    ...
    <Minimum><Script language="javascript"><![CDATA[1000]]></Script></Minimum>
</Timer>
...
```
SystemMonitorNotification / Timer / Maximum
The following elements must be nested inside a Maximum element:
...
```xml
...
<Timer name="my_timer">
    ...
    <Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
...
```
SystemMonitorNotification / Timer / Minimum|Maximum / Script
Script supports the following attributes:
...
- a fixed value
- a calculation based on the job/order parameters
Fixed value
A fixed value is the time allowed in seconds for the specific Minimum or Maximum definition.
```xml
...
<Script language="javascript"><![CDATA[1000]]></Script>
...
```
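Combining the Timer elements described so far, a complete timer with fixed values might look like the following sketch. The timer name and the limits (10 and 1000 seconds) are hypothetical, and placing TimerJobChain before Minimum and Maximum is an assumption based on the examples above.

```xml
<Timer name="my_timer">
    <!-- measure the steps 200 to 500 of the job chain -->
    <TimerJobChain name="test/my_jobchain" step_from="200" step_to="500" />
    <!-- fixed values: time allowed in seconds -->
    <Minimum><Script language="javascript"><![CDATA[10]]></Script></Minimum>
    <Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
```

A Notification would then refer to this timer with a TimerRef element whose ref attribute holds the timer name, as in the TimerRef example above.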
Calculation
The calculation must result in the time in seconds for the specific Minimum or Maximum definition.
...
```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<job title="Sample Job with Store Result Monitor" order="yes" stop_on_error="no" tasks="1">
    <params>
        <!-- set the scheduler_notification_result_parameters parameter -->
        <param name="scheduler_notification_result_parameters" value="file_size"/>
    </params>

    <!-- calculate and create the new order parameter if necessary -->
    <script language="java:javascript"><![CDATA[
function spooler_process(){
    var order = spooler_task.order;
    var params = spooler.create_variable_set();
    params.merge(spooler_task.params);
    params.merge(order.params);

    // parameter scheduler_file_path was set in the previous job chain step
    var file = new java.io.File(params.value("scheduler_file_path"));
    var fileSize = file.length()/1024;
    order.params.set_var("file_size",fileSize.toString());
    return true;
}]]>
    </script>

    <!-- set the StoreResultsJobJSAdapterClass as a monitor -->
    <monitor name="notification_monitor" ordering="1">
        <!-- JobScheduler version 1.9.x, 1.10.x
        <script java_class="com.sos.scheduler.notification.jobs.result.StoreResultsJobJSAdapterClass" language="java"/>
        -->
        <!-- JobScheduler version 1.11.x, 1.12.x -->
        <script java_class="com.sos.jitl.notification.jobs.result.StoreResultsJobJSAdapterClass" language="java"/>
    </monitor>
    <run_time />
</job>
```
Message
Usage
The Message can be configured on the following parent nodes as a CDATA element:
...
Example: <![CDATA[ scheduler id = ${MON_N_SCHEDULER_ID} ]]>
Variables
All variables (except OS environment variables) must be defined by using of the
...
- Table variables.
- Service variables.
- JOC Cockpit variables.
- OS environment variables.
Table variables
Table of the history of steps of processed orders / jobs.
...
```
timer name = ${MON_C_NAME}, text = ${MON_C_CHECK_TEXT}
```
Service variables
```
service name = ${SERVICE_NAME}
```
JOC Cockpit variables
Jira issue: JS-1388
...
```xml
<a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a>
<a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a>
<a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a>
```
OS environment variables
All existing OS environment variables can be used in a message using the syntax %<variable name>% (Windows) or $<variable name> (Unix).
```
%TEMP%/test.exe
```
Notification environment variables
The default SystemNotifierProcessBuilderPlugin plugin used by the SystemMonitorNotification / Notification / NotificationCommand element sets the following variables as environment variables:
...
These variables can be used when the NotificationCommand does not call the notification client directly but via a shell script that implements the logic for sending the notification messages.
Table variables
All table variables (see
e.g.:
Service variables
...
```
1) configured command in the SystemMonitorNotification_<MonitorSystem>.xml file
<NotificationCommand><![CDATA[C:/Temp/command.cmd]]></NotificationCommand>

2) content of the C:/Temp/command.cmd file
rem Note: "> C:/Temp/command_output.txt" is used to simulate the starting of the notification client
rem
echo %SCHEDULER_MON_SERVICE_NAME%:%SCHEDULER_MON_SERVICE_STATUS%:%SCHEDULER_MON_SERVICE_MESSAGE_PREFIX% history id = %SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID% > C:/Temp/command_output.txt
```
Examples
Message on error
```
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}
```
Message on success
```
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), steps(${MON_SN_STEP_FROM} to ${MON_SN_STEP_TO}), order time elapsed = ${MON_N_ORDER_TIME_ELAPSED}s
```
Message on performance check (Timer)
```
name = ${MON_C_NAME}, scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), steps(${MON_C_STEP_FROM} to ${MON_C_STEP_TO}), check = ${MON_C_CHECK_TEXT}
```
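To preview how such a message template reads after substitution, the ${...} placeholders can be rendered with Python's `string.Template`. This is a sketch for trying out templates only, not the substitution code used by JobScheduler; the sample values are the ones used in the examples of this article.

```python
from string import Template

ERROR_MESSAGE = Template(
    "scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, "
    "job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), "
    "step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}"
)


def render(template, values):
    """Substitute known variables; unknown ${...} placeholders stay in place."""
    return template.safe_substitute(values)


sample = {
    "MON_N_SCHEDULER_ID": "scheduler_4444",
    "MON_N_ORDER_HISTORY_ID": "123",
    "MON_N_JOB_CHAIN_NAME": "test/my_jobchain",
    "MON_N_ORDER_ID": "order_id",
    "MON_N_ORDER_STEP_STATE": "100",
    "MON_N_ERROR_TEXT": "error occurred",
}
print(render(ERROR_MESSAGE, sample))
```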
Examples System Monitoring
NotificationInterface ( Nagios / OP5 )
The following is an excerpt from an XML file used to notify a specific System Monitor (op5 Monitor) via the NotificationInterface:
```xml
...
<!--
    monitor_host          The hostname or IP address of the System Monitor host
    monitor_port          The TCP port that the System Monitor listens on
    monitor_encryption    Encryption algorithm
    service_host          The host that executes the passive check.
                          The name must match the corresponding setting in the System Monitor
    {MON_N_SCHEDULER_ID}  See explanation "Table variables"
    ...
-->
<NotificationInterface monitor_host="monitor_host" monitor_port="5667" monitor_encryption="XOR" service_host="service_host"><![CDATA[
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}
]]></NotificationInterface>
...
```
NotificationCommand ( Nagios / OP5 )
The following is an excerpt from an XML file used to notify a specific System Monitor (op5 Monitor) via the NotificationCommand:
```xml
...
<!--
    service_host              The host that executes the passive check.
                              The name must match the corresponding setting in the System Monitor.
    monitor_host              The hostname or IP address of the System Monitor host.
    {SERVICE_NAME}            See explanation "Service variables"
    {SERVICE_STATUS}          See explanation "Service variables"
    {SERVICE_MESSAGE_PREFIX}  See explanation "Service variables"
    {MON_N_SCHEDULER_ID}      See explanation "Table variables"
    ...

    NotificationCommand after substitution (error case):
    <![CDATA[echo service_host:JobScheduler Monitoring Errors:2:ERROR scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>

    NotificationCommand after substitution (recovery case):
    <![CDATA[echo service_host:JobScheduler Monitoring Errors:0:RECOVERED scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>

    NotificationCommand after substitution (success case):
    <![CDATA[echo service_host:JobScheduler Monitoring Success:0:SUCCESS scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error= | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
-->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors"
                     service_name_on_success="JobScheduler Monitoring Success"
                     service_status_on_error="2"
                     service_status_on_success="0">
    <NotificationCommand><![CDATA[echo service_host:${SERVICE_NAME}:${SERVICE_STATUS}:${SERVICE_MESSAGE_PREFIX} scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT} | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
    </NotificationCommand>
</NotificationMonitor>
...
```
NotificationCommand ( Nagios / Opsview )
The following is an excerpt from an XML file used to notify a specific System Monitor (Opsview Monitor) via the NotificationCommand on Unix:
```xml
...
<!--
    service_host              The host that executes the passive check.
                              The name must match the corresponding setting in the System Monitor,
                              e.g. localhost
    monitor_host              The hostname or IP address of the System Monitor host.
    {SERVICE_NAME}            See explanation "Service variables"
    {SERVICE_STATUS}          See explanation "Service variables"
    {SERVICE_MESSAGE_PREFIX}  See explanation "Service variables"
    {MON_N_SCHEDULER_ID}      See explanation "Table variables"
    ...

    NotificationCommand after substitution (error case):
    <![CDATA[echo -e "localhost\tJobScheduler Monitoring Errors\t2\tERROR scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>

    NotificationCommand after substitution (recovery case):
    <![CDATA[echo -e "localhost\tJobScheduler Monitoring Errors\t0\tRECOVERED scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>

    NotificationCommand after substitution (success case):
    <![CDATA[echo -e "localhost\tJobScheduler Monitoring Success\t0\tSUCCESS scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>
-->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors"
                     service_name_on_success="JobScheduler Monitoring Success"
                     service_status_on_error="2"
                     service_status_on_success="0">
    <NotificationCommand><![CDATA[echo -e "service_host\t${SERVICE_NAME}\t${SERVICE_STATUS}\t${SERVICE_MESSAGE_PREFIX} scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>
    </NotificationCommand>
</NotificationMonitor>
...
```
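The passive check line that is piped into send_nsca in the examples above has four fields: host, service name, status and message. The following Python sketch builds that line; the function name `nsca_line` and its parameters are hypothetical, and the default tab delimiter mirrors the Unix example while ":" mirrors the Windows example that passes -d : to send_nsca.

```python
def nsca_line(service_host, service_name, status, prefix, message, delimiter="\t"):
    """Return one passive check result line, terminated by a newline.

    The field order host, service, status, message follows the examples above.
    """
    fields = [service_host, service_name, str(status), f"{prefix} {message}"]
    return delimiter.join(fields) + "\n"


# Error case of the Unix example: status 2 and prefix ERROR.
line = nsca_line(
    "localhost",
    "JobScheduler Monitoring Errors",
    2,
    "ERROR",
    "scheduler id=scheduler_4444, history id=123",
)
print(repr(line))
```

When ":" is used as the delimiter, a service name or host that itself contains a colon would shift the fields, which is one reason the tab delimiter is the safer choice where the shell supports it.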
NotificationCommand ( Zabbix )
The following is an excerpt from an XML file used to notify a specific System Monitor (Zabbix) via the NotificationCommand:
```xml
...
<!--
    zabbix_sender        Zabbix sender installed on the JobScheduler host
    localhost            Hostname of the Zabbix server
    zabbix_server        JobScheduler Agent name (host name) as registered in Zabbix
    samples.job1         Zabbix item key (replace "/" with "." in JOB_NAME)
    ${MON_N_ERROR_TEXT}  See explanation "Table variables"
-->
<NotificationCommand>
    <![CDATA[zabbix_sender -z localhost -s zabbix_server -k samples.job1 -o ${MON_N_ERROR_TEXT}]]>
</NotificationCommand>
...
```
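The item key rule from the comment above (replace "/" with "." in the job name) and the resulting zabbix_sender call can be sketched in Python. The helper names are hypothetical, and stripping a leading "/" from the job path is an assumption of this sketch. The options -z, -s, -k and -o are the ones used in the example above.

```python
def zabbix_item_key(job_name):
    """Derive a Zabbix item key from a job path, e.g. samples/job1 -> samples.job1."""
    # Assumption: a leading or trailing "/" is not part of the key.
    return job_name.strip("/").replace("/", ".")


def zabbix_sender_args(server, host, job_name, value):
    """Build the argument list for zabbix_sender without invoking it."""
    return [
        "zabbix_sender",
        "-z", server,                     # Zabbix server host name
        "-s", host,                       # host name as registered in Zabbix
        "-k", zabbix_item_key(job_name),  # item key
        "-o", value,                      # value to send, here the error text
    ]


print(zabbix_sender_args("localhost", "zabbix_server", "samples/job1", "error occurred"))
```

Passing the arguments as a list, for example to `subprocess.run`, keeps an error text that contains spaces in a single -o value.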
Examples Mail
NotificationMail content_type="text/html"
```xml
...
<NotificationMail content_type="text/html" charset="ISO-8859-1" encoding="7bit" priority="Normal">
    <From><![CDATA[jobscheduler@sos-berlin.com]]></From>
    <To><![CDATA[spam@sos-berlin.com]]></To>
    <Subject><![CDATA[JobScheduler notification: ${SERVICE_MESSAGE_PREFIX}, job chain executed with errors: ${MON_N_JOB_CHAIN_NAME}]]></Subject>
    <Body><![CDATA[<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#bbb;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#594F4F;background-color:#E0FFEB;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#493F3F;background-color:#9DE0AD;}
</style>
<table class="tg">
  <tr><th colspan="4">Error</th></tr>
  <tr>
    <td>Code:</td><td>${MON_N_ERROR_CODE}</td>
    <td>Message</td><td>${MON_N_ERROR_TEXT}</td>
  </tr>
  <tr><th colspan="4">JobScheduler</th></tr>
  <tr>
    <td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
    <td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
  </tr>
  <tr><th colspan="4">Order</th></tr>
  <tr>
    <td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
    <td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
  </tr>
  <tr>
    <td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
    <td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
  </tr>
  <tr>
    <td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
    <td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
  </tr>
  <tr><th colspan="4">Order History</th></tr>
  <tr>
    <td>Time elapsed</td><td>${MON_N_ORDER_TIME_ELAPSED}</td><td> </td><td> </td>
  </tr>
  <tr>
    <td>Start Time</td><td>${MON_N_ORDER_START_TIME}</td>
    <td>End Time</td><td>${MON_N_ORDER_END_TIME}</td>
  </tr>
  <tr><th colspan="4">Order Step History</th></tr>
  <tr>
    <td>State</td><td>${MON_N_ORDER_STEP_STATE}</td>
    <td>Time elapsed</td><td>${MON_N_ORDER_STEP_TIME_ELAPSED}</td>
  </tr>
  <tr>
    <td>Start Time</td><td>${MON_N_ORDER_STEP_START_TIME}</td>
    <td>End Time</td><td>${MON_N_ORDER_STEP_END_TIME}</td>
  </tr>
</table>]]></Body>
</NotificationMail>
...
```
...
```xml
...
<NotificationMail content_type="text/html" charset="ISO-8859-1" encoding="7bit" priority="Normal">
    <From><![CDATA[jobscheduler@sos-berlin.com]]></From>
    <To><![CDATA[spam@sos-berlin.com]]></To>
    <Subject><![CDATA[JobScheduler notification: job successfully completed: ${MON_N_JOB_NAME}]]></Subject>
    <Body><![CDATA[<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#aaa;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#333;background-color:#fff;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#fff;background-color:#f38630;}
</style>
<table class="tg">
  <tr><th colspan="4">JobScheduler</th></tr>
  <tr>
    <td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
    <td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
  </tr>
  <tr><th colspan="4">Order</th></tr>
  <tr>
    <td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
    <td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
  </tr>
  <tr>
    <td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
    <td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
  </tr>
  <tr>
    <td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
    <td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
  </tr>
  <tr><th colspan="4">Task History</th></tr>
  <tr>
    <td>Task ID</td><td>${MON_N_TASK_ID}</td>
    <td>Time elapsed</td><td>${MON_N_TASK_TIME_ELAPSED}</td>
  </tr>
  <tr>
    <td>Start Time</td><td>${MON_N_TASK_START_TIME}</td>
    <td>End Time</td><td>${MON_N_TASK_END_TIME}</td>
  </tr>
</table>]]></Body>
</NotificationMail>
...
```
JobScheduler - Store parameters to database
The Monitoring Interface provides functionality to store the job/order parameters of specific jobs in the database (table SCHEDULER_MON_RESULTS).
See explanation: Calculation
JobScheduler - Job Chains
The following job chains are provided and should be configured accordingly:
sos / notification / CheckHistory (JobScheduler version 1.9.x, 1.10.x)
See <scheduler_install>/jobs/JobSchedulerNotificationCheckHistoryJob.xml
- This is the main job. It analyzes the JobScheduler history tables and writes the results into the notification tables.
  - The job reads all history entries for the job chains that are configured in the SystemMonitorNotification XML files.
  - The job executes the performance checks for the defined Timers.
- Order Check
  - Configure a repeat interval for the order run time, e.g. every two minutes.
sos / notification / CheckHistory (JobScheduler version 1.11.x, 1.12.x)
- The job chain has been removed.
Set the parameter sos.use_notification to true in config/scheduler.xml:
```xml
...
<spooler>
    <config ...>
        <params>
            ...
            <param name="sos.use_notification" value="true"/>
            ...
        </params>
        ...
    </config>
</spooler>
```
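Whether the parameter is set can be checked with a short Python sketch that parses the configuration and looks for the param element. The function name `notification_enabled` is hypothetical, and the sketch assumes a well-formed XML document without the "..." elisions used in the excerpt above.

```python
import xml.etree.ElementTree as ET


def notification_enabled(scheduler_xml):
    """Return True if <param name="sos.use_notification" value="true"/> is present."""
    root = ET.fromstring(scheduler_xml)
    for param in root.iter("param"):
        if param.get("name") == "sos.use_notification":
            return param.get("value", "").lower() == "true"
    return False


sample = (
    "<spooler><config><params>"
    '<param name="sos.use_notification" value="true"/>'
    "</params></config></spooler>"
)
print(notification_enabled(sample))
```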
sos / notification / SystemNotifier
See <scheduler_install>/jobs/JobSchedulerNotificationSystemNotifierJob.xml
- Sends notifications to a specific System Monitor.
- Order MonitorSystem
  - JobScheduler version 1.9.x, 1.10.x
    - Configure a repeat interval for the order run time that is not less than the interval chosen for triggering the job chain sos/notification/CheckHistory.
sos / notification / CleanupNotifications
See <scheduler_install>/jobs/JobSchedulerNotificationCleanupNotificationsJob.xml
- Removes notifications that have expired.
- Order Cleanup
  - Configure a start time for the order run time, e.g. 24:00
sos / notification / ResetNotifications
See <scheduler_install>/jobs/JobSchedulerNotificationResetNotificationsJob.xml
- Some System Monitors provide an "acknowledge" operation that signals that a problem is known.
- If an "acknowledge" operation has been performed for a specific service in the System Monitor, the job chain ResetNotifications stops JobScheduler from sending notifications for that service for errors that have already occurred.
- Do not configure an order run time for this job chain, as the job chain is triggered by the System Monitor's "acknowledge" operation via the add_order XML command.
Examples
Example ResetNotifications <add_order> XML command
The following example shows the XML command sent from a monitoring system to the JobScheduler to call the sos/notification/ResetNotifications job chain and set the relevant service name as acknowledged.
...
Element | Attribute | Value | Description
---|---|---|---
add_order | | | XML command that adds a new order to the specified job chain on the JobScheduler.
add_order | job_chain | sos/notification/ResetNotifications | The job chain path must correspond to the path of the ResetNotifications job chain installed on the JobScheduler.
add_order | id | | Order identifier.
add_order | title | | Order title.
param | | | The following three parameters must be set:
param | name="service_name" | JobScheduler Monitoring Error | Service name for which all service errors that have already occurred are set to acknowledged in the JobScheduler Monitoring Interface.
param | name="system_id" | op5 | System identification. Corresponds with the system_id attribute of the SystemMonitorNotification element.
param | name="operation" | acknowledge | Fixed value. Name of the operation that executes the acknowledgement in the JobScheduler Monitoring Interface.
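The command described in the table can be assembled programmatically. The Python sketch below builds the add_order element with the three required parameters; the function name `build_add_order` is hypothetical, and the id and title values are an assumption of this sketch, since the table only requires that they are set.

```python
import xml.etree.ElementTree as ET


def build_add_order(service_name, system_id,
                    job_chain="/sos/notification/ResetNotifications"):
    """Return the add_order XML command as a string."""
    label = f"{system_id} {service_name} acknowledgement"  # illustrative id/title
    order = ET.Element("add_order", {"job_chain": job_chain, "id": label, "title": label})
    params = ET.SubElement(order, "params")
    for name, value in (
        ("service_name", service_name),
        ("system_id", system_id),
        ("operation", "acknowledge"),  # fixed value
    ):
        ET.SubElement(params, "param", {"name": name, "value": value})
    return ET.tostring(order, encoding="unicode")


print(build_add_order("JobScheduler Monitoring Error", "op5"))
```

Building the element tree instead of concatenating strings means that special characters in a service name, such as & or a double quote, are escaped automatically.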
Example ResetNotifications <add_order> XML command via Perl script for op5 monitor system
This example shows the integration of a Perl script into the op5 monitor system that automatically sends the above XML command to the JobScheduler sos/notification/ResetNotifications job chain.
...
```perl
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use Getopt::Long;
use vars qw($opt_H $opt_f $opt_s $opt_p $opt_t $opt_h);
use vars qw(%ERRORS &support);

my $host;
my $type;
my $service;
my $port;
my $timeout = 30;

# Exit codes reported back to the monitor system
%ERRORS = (
    'OK'       => 0,
    'CRITICAL' => 2,
    'ERROR'    => 2,
    'UNKNOWN'  => 9,
    'WARNING'  => 1,
);

sub print_help ();
sub print_usage ();

Getopt::Long::Configure('bundling');
GetOptions(
    "h"   => \$opt_h, "help"       => \$opt_h,
    "H=s" => \$opt_H, "hostname=s" => \$opt_H,
    "f=s" => \$opt_f,
    "s=s" => \$opt_s, "service=s"  => \$opt_s,
    "t=i" => \$opt_t, "timeout=i"  => \$opt_t,
    "p=i" => \$opt_p, "port=i"     => \$opt_p
);

if ($opt_h) { print_help(); exit 0; }

if ($opt_H) {
    if ($opt_H =~ /([-.A-Za-z0-9]+)/) { $host = $opt_H; }
    ($host) || print("Invalid host: $opt_H\n");
} else {
    print("Host name/address not specified\n");
}

if ($opt_p) {
    if ($opt_p =~ /([0-9]+)/) { $port = $1; }
    ($port < 0 || $port > 65535) && print("Invalid Port: $opt_p\n");
} else {
    print("Port not specified\n");
}

if ($opt_t) { $timeout = $opt_t; }

if (!$host || !$port) { print_usage(); exit 1; }

#<add_order job_chain ="/sos/notification/ResetNotifications"
#           id        ="op5 JobScheduler Monitoring Error acknowledgement"
#           title     ="op5 JobScheduler Monitoring Error acknowledgement">
#    <params>
#        <param name="service_name" value="JobScheduler Monitoring Error" />
#        <param name="system_id"    value="op5"/>
#        <param name="operation"    value="acknowledge" />
#    </params>
#</add_order>
my $message = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
    . "<add_order job_chain=\"/sos/notification/ResetNotifications\""
    . " id=\"op5 ".$opt_s." Acknowledgement\""
    . " title=\"op5 ".$opt_s." Acknowledgement\">"
    . "<params>"
    . "<param name=\"system_id\" value=\"MonitorSystem\"/>"
    . "<param name=\"service_name\" value=\"".$opt_s."\"/>"
    . "<param name=\"operation\" value=\"acknowledge\"/>"
    . "</params></add_order>";

# Only acknowledgements are forwarded to the JobScheduler
if (defined($opt_f) && $opt_f =~ m/ACKNOWLEDGEMENT/) {
    send_request($message);
} else {
    print("Please set notification type to ACKNOWLEDGEMENT\n");
}

# Sends the XML command via HTTP POST and reports the result
sub send_request {
    my $message = shift;
    my $userAgent = LWP::UserAgent->new(agent => 'perl post');
    $userAgent->timeout($timeout);
    my $response = $userAgent->request(POST 'http://'.$host.':'.$port, Content_Type => 'text/xml', Content => $message);
    if ($response->is_success) {
        _report('OK', "OK: Service name: ".$opt_s."\nNotification type: ".$opt_f."\nRequest: ".$message."\n\nAnswer:\n".$response->as_string."\n");
    } else {
        _report('ERROR', "ERROR: Service name: ".$opt_s."\nNotification type: ".$opt_f."\nRequest: ".$message."\n\nAnswer:\n".$response->error_as_HTML."\n");
    }
}

sub get_attribute_value {
    my ($attr_name, $elem_xml) = @_;
    $elem_xml =~ s/.*$attr_name\s*=\s*\"(.*?)\".*/$1/s;
    return $elem_xml;
}

sub get_state_elem {
    my $xml = shift;
    $xml =~ s/.*<spooler.*?>\s*<answer.*?>\s*(<state.*?>).*/$1/s;
    return $xml;
}

sub print_help () {
    print $0. "\n";
    print "Copyright (c) 2015 SOS GmbH, info\@sos-berlin.com

This script tries to connect to given Job Scheduler

";
    print_usage();
    print "
-H, --hostname=HOST
   Name or IP address of host to check
-p, --port=INTEGER
   Port at host to check
-t, --timeout=INTEGER
   Timeout for HTTP connection
-f =STRING
   Notification type, e.g. ACKNOWLEDGEMENT
-s, --service=STRING
   Service name, e.g. JobScheduler Errors
-h, --help
   This help
";
}

sub print_usage () {
    print "Usage: $0 -H <host> -p <port> -f ACKNOWLEDGEMENT -s <service name> [-t <timeout>]\n";
}

# Prints the message and exits with the code mapped in %ERRORS
sub _report {
    print $_[1];
    if (defined($ERRORS{$_[0]})) {
        exit $ERRORS{$_[0]};
    } else {
        exit 0;
    }
}
```
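For environments where Perl is not available, the HTTP step of the script above can be sketched in Python with the standard library only. This is an illustrative equivalent, not a supported client: the function names `build_request` and `send_command` are hypothetical, and the host and port in the usage line are placeholders.

```python
import urllib.request


def build_request(host, port, xml_command):
    """Create the POST request that carries the XML command as text/xml."""
    return urllib.request.Request(
        f"http://{host}:{port}",
        data=xml_command.encode("iso-8859-1"),  # same encoding as in the Perl script
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def send_command(host, port, xml_command, timeout=30):
    """Send the command and return the answer as text.

    Raises urllib.error.URLError if the JobScheduler cannot be reached.
    """
    request = build_request(host, port, xml_command)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("iso-8859-1", errors="replace")


# Placeholder host and port; nothing is sent until send_command() is called.
request = build_request("localhost", 4444, "<add_order/>")
print(request.get_method(), request.full_url)
```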
JobScheduler - Job Chains customization
The default name of the monitor system used in the configuration files and stored in the JobScheduler database is "MonitorSystem".
...
- <scheduler_install>/config/notification/SystemMonitorNotification_MonitorSystem.xml
  - rename this file to SystemMonitorNotification_op5.xml
  - set the system_id attribute to op5, e.g. <SystemMonitorNotification system_id="op5">
- <scheduler_install>/config/live/sos/notification/SystemNotifier,MonitorSystem.order.xml
  - rename this file to SystemNotifier,op5.order.xml
  - set the system_configuration_file parameter to SystemMonitorNotification_op5.xml, e.g. <param name="system_configuration_file" value="config/notification/SystemMonitorNotification_op5.xml"/>
- <scheduler_install>/config/live/sos/notification/ResetNotifications,AcknowledgeMonitorSystem.order.xml
  - rename this file to ResetNotifications,Acknowledgeop5.order.xml
  - set the system_id parameter to op5, e.g. <param name="system_id" value="op5"/>
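The renaming scheme above can be summarized in a small Python sketch that lists the old and new file names for a given system identifier. The function name `customization_plan` is hypothetical, the paths are relative to <scheduler_install>, and the sketch only computes names. It does not rename files or edit the system_id and system_configuration_file values inside them.

```python
def customization_plan(system_id, default="MonitorSystem"):
    """Map each default file name to its name for the given system_id."""
    templates = [
        "config/notification/SystemMonitorNotification_{}.xml",
        "config/live/sos/notification/SystemNotifier,{}.order.xml",
        "config/live/sos/notification/ResetNotifications,Acknowledge{}.order.xml",
    ]
    return {t.format(default): t.format(system_id) for t in templates}


for old, new in customization_plan("op5").items():
    print(f"{old} -> {new}")
```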
JobScheduler - Cluster
For cluster operation, modify the job_chain element definition in all notification job chain files.
...
CheckHistory.job_chain.xml
CleanupNotifications.job_chain.xml
ResetNotifications.job_chain.xml
SystemNotifier.job_chain.xml
Use Cases
Workflow Execution takes too long
Initial Situation
A Job Chain is triggered but does not complete: it hangs in a step and takes longer than expected.
Problem
Execution time was too long
Handling
A timer is set for this Job Chain and the System Monitor is notified when the timer expires. The expiration times for the Job Chains are configured to leave enough time for processing. This is typically used for cases where the Job Chain could hang in a specific step.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Timer
- Configure SystemMonitorNotification / Notification / NotificationObjects / TimerRef
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
System Monitor
- Services in the System Monitor have to be configured and named the same way as in service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
SFTP connection refused
Initial Situation
Consider a Job Chain that uses SFTP for transferring files. You have a setback configured in this step of the Job Chain, so that if the connection to the SFTP server fails, this step is retried after a specified time.
Problem
The SFTP server is not available anymore.
Handling
The System Monitor is notified on the service related to the Job Chain with the error message. However, repeated notifications for the Job Chain are not wanted when an external factor, here the connection to the SFTP server, is causing the error.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
System Monitor
- Services in the System Monitor have to be configured and named the same way as in service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
Thresholds
Initial Situation
Consider the situation where a workflow has to be executed successfully a specific number of times before a specific point in time. This means that a specific value has to be monitored in order to determine whether this quota has been reached.
Handling
A new History service is configured, so that the workflow executions (Job Chains in the JobScheduler vocabulary) send the information that they have been successfully executed to the System Monitor.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_success (SystemMonitorNotification / Notification / NotificationMonitor)
System Monitor
- Services in the System Monitor have to be configured and named the same way as in service_name_on_success (SystemMonitorNotification / Notification / NotificationMonitor) above.
Acknowledgment
Initial Situation
An alert for a Service has been sent to the System Monitor, which has sent a Mail to the Service Desk (Support Team) notifying them about the alert.
Handling
The problem is known to the Service Desk and they "acknowledge" the problem. The acknowledgment will cause the JobScheduler to be notified not to send any more notifications for this Service to the System Monitor until the Service has been recovered.
Configuration
System Monitor
- The JobScheduler is notified about the acknowledgment in the System Monitor by the execution of a script. See sos / notification / ResetNotifications
Recoverable Errors
Initial Situation
You have a setback configured in one of the steps of the Job Chain, so that if the step execution fails, this step is retried after a specified time.
Problem
The step has ended with an error, but recovered after a setback.
Handling
If an error message has been sent to the System Monitor and the error is later recovered, JobScheduler automatically sends a recovery message to the same service with the same error message and the prefix RECOVERED.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
System Monitor
- Services in the System Monitor have to be configured and named the same way as in service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
Change Management References
...