Introduction
This solution covers the monitoring of JobScheduler and its objects such as Jobs, Job Chains and Orders. This article gives an overview of how JobScheduler monitoring works. The feature is available starting from general availability release 1.8.
The architecture of this solution has the following features:
- JobScheduler: The architecture separates two tasks:
  - Detecting errors: A Job Chain analyses the JobScheduler logging and checks whether the monitored JobScheduler objects had errors or warnings.
  - Sending alerts: Another Job Chain is responsible for sending the alerts to the corresponding System Monitor. Not all alerts are incidents: some are events, i.e. occurrences, for example the alert that a specific Job Chain was executed and which result it ended with.
- JobScheduler: This architecture allows the Log History of more than one JobScheduler to be analysed.
- System Monitor: JobScheduler is able to connect to more than one System Monitor at the same time.
Definitions
Definition | Description |
---|---|
System Monitor | A System Monitor is an instrument that informs the Service Desk (1st Level Support) about incidents in IT systems. It does not serve to analyse incidents, but merely to provide information about them so that this information can be forwarded and escalated. |
Passive Checks | Checks that are sent to the System Monitor from an external host (from the point of view of the System Monitor). In contrast, checks that are carried out periodically by the System Monitor itself are called active checks. |
Alerting | An alert is an alarm, i.e. the message about an event. An alert does not provide all relevant information about an event, but it reports that the event exists. An alert can be either positive or negative. |
Notification | The notification of a specific alert. Not every alert results in a notification, only those alerts that are configured accordingly. Notifications are therefore a subset of the alerts and can likewise be either positive or negative. |
Acknowledgement | The confirmation of an alert, meaning that the alert has been seen and/or is well known and that recovery of the incident is in progress. An acknowledgement is always executed manually: someone (usually the Service Desk or 1st Level Support) has noticed a Critical service and acknowledges it. It is never an automated step. |
Benefits
The benefits of the new solution are:
- No changes to your JobScheduler configuration (Jobs, Job Chains, etc.) are required to get this solution working. You have to add the corresponding Job Chains for the monitoring but do not have to modify your existing ones.
- The whole architecture resides on the JobScheduler side, so the solution is independent of the monitor that the alerts are sent to. The solution works for every monitor that can receive passive checks.
- Processing of Jobs and Job Chains in JobScheduler is not affected or modified by the monitoring, neither in terms of performance nor in terms of stability.
- The level of detail in a message of a Service in the System Monitor is much higher with this solution. JobScheduler logs exactly what the error is about and this information is sent as a passive check to the specific Service, which shows the log message that JobScheduler logged.
- The criticality of an error is immediately recognized in the System Monitor. JobScheduler has all information about errors, and this information is sorted and sent to different Services in the System Monitor for every specific case. This allows the Service Desk to immediately prioritise the recovery of errors. For example, recovering from a Performance error (low) does not have the same criticality as recovering from an error where Documents could not be generated (high).
Functionality
Functionality | Description |
---|---|
Job Chain and Order Monitoring | Job Chains in JobScheduler can be monitored with the new solution. More precisely, the elements that are monitored are the Orders that trigger these Job Chains. |
History Notifications | Not only critical alerts are monitored, but also the positive ones. The history of a specific service is also monitored, to see exactly if a specific workflow was executed or not and what result it ended up with. |
Performance measurement (Timer) | There are also Timers that measure the performance of a Job Chain. In case it takes too long for a Job Chain to end, a critical alert will be sent to a System Monitor. |
Acknowledgment | Once a service in the System Monitor is critical, there is the possibility to acknowledge this service. That action will add an Order to the JobScheduler, so that JobScheduler does not send more notifications to the System Monitor for this service. |
Installation
See https://kb.sos-berlin.com/x/fYEm
Configuration
JobScheduler - SystemMonitorNotification files
Location: <scheduler_install>/config/notification
File | Description |
---|---|
SystemMonitorNotification_v1.0.xsd | XML Schema file that define which values are allowed in your XML files for the JobScheduler monitoring. That means, you just have to modify your |
SystemMonitorNotification_<MonitorSystem>.xml | Configuration file for each System Monitor.
|
| Configuration file for all System Monitors.
This file is optional and must contain only the definitions of the |
SystemMonitorNotification Elements
The configuration element descriptions are organized into the following major categories:
Element | Element description | Description |
---|---|---|
SystemMonitorNotification | Top Level Element | Configuration for Notifications to a System Monitor |
Notification | Once or more inside a SystemMonitorNotification element | Specifies a System Monitor notification that includes a command line invocation and the JobScheduler objects |
Timer | Optional, once or more inside a SystemMonitorNotification element | Performance measurement definition |
SystemMonitorNotification
SystemMonitorNotification supports the following attributes:
Note:
- attribute system_id: in the case of the SystemMonitorNotificationTimers.xml file the value of this attribute is not important and can be set to any value, e.g. timers
Attribute | Usage | Description |
---|---|---|
system_id | required | System Monitor identifier. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="OP5">
...
|
SystemMonitorNotification / Notification
The following elements may be nested inside a Notification element:
Element | Element description | Description |
---|---|---|
NotificationMonitor | Once inside a Notification element | Specifies the System Monitor interface that is being used for messages: either by a Plugin Interface or by command line invocation |
NotificationObjects | Once inside a Notification element | Specifies the JobChains and the Timers definitions |
SystemMonitorNotification / Notification / NotificationMonitor
NotificationMonitor supports the following attributes:
Note:
- attributes service_name_on_error and service_name_on_success
  - at least one of these attributes must be configured
  - both attributes can be configured together
Attribute | Usage | Description |
---|---|---|
service_name_on_error | Optional | This setting specifies the service that is configured in the System Monitor for messages of job runs with errors and for job recovery messages. The service name must match the corresponding setting in the System Monitor. |
service_name_on_success | Optional | This setting specifies the service that is configured in the System Monitor for receiving informational messages on successful job runs. The service name must match the corresponding setting in the System Monitor. |
service_status_on_error | Optional | This setting specifies the service status code for error messages. Default: |
service_status_on_success | Optional | This setting specifies the service status code for success messages Default: |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<!--
Example OP5
NSCA Status:
0 - OK
1 - WARNING
2 - CRITICAL
3 - UNKNOWN
-->
...
<!--
Sending occurred errors as CRITICAL (default)
-->
<NotificationMonitor service_name_on_error="JobScheduler Errors">
...
<!--
Sending occurred errors as WARNING
-->
<NotificationMonitor service_name_on_error="JobScheduler Errors" service_status_on_error="1">
... |
One of the following elements must be nested inside a NotificationMonitor element:
Element | Element description | Description |
---|---|---|
NotificationInterface | Optional, once inside of NotificationMonitor element | Plugin Interface to be executed for System Monitor notification |
NotificationCommand | Optional, once inside of NotificationMonitor element | Command line to be executed for System Monitor notification |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationInterface
NotificationInterface supports the following attributes:
...
This setting specifies the password configured in the nsca.cfg file used by NSCA.
...
This setting specifies the connection timeout in ms.
Default: 5000
...
This setting specifies that the communication with the System Monitor is encrypted. By default no encryption is used.
- NONE: no encryption
- XOR: XOR encryption
- TRIPLE_DES: use of the Triple DES algorithm for encryption
...
article describes individual configuration parameters and provides examples of their use with monitors such as op5 and Zabbix and of the use of the mail and JMS interfaces.
Download: notification.xml
Send notifications
Notify on error
SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_error is configured.
SystemMonitorNotification / Notification / NotificationObjects / JobChain
Error messages
- will be sent:
  - when, at the time of the sos / notification / SystemNotifier run, an order is in a job chain state (step) that has ended with an error
- will not be sent:
  - when, after the last run of sos / notification / SystemNotifier, an error has occurred in a job chain state (step) but at the time of the current sos / notification / SystemNotifier run the order is in another/next job chain state (step)
    - this kind of error is ignored because the order has continued to run
  - when an error has reoccurred in the same job chain state (step) for which a notification has already been sent
    - this order state is considered as notified and no new notification will be sent
    - e.g. a job chain state (step) has been restarted manually or by a setback
    - this behaviour has been changed with JITL-534, providing support for repeatedly failed executions
  - when the first step of the specific order has been removed from the notification tables by sos / notification / CleanupNotifications
    - this behaviour has been changed with JITL-516, providing support for long running orders
  - when a notification maximum has been reached
  - when a job chain state (step) has been configured as excluded
  - when the @step_from or @step_to settings have been configured and the job chain state (step) is out of the configured range
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job chain state (step) is out of the configured range
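The restrictions listed above can be combined on a single JobChain element. The following is a minimal sketch rather than a verified configuration: the service name, job chain path, step names and return code values are placeholders, and it assumes that return_code_from / return_code_to are set as attributes of JobChain in the same way as step_from / step_to.

```xml
<SystemMonitorNotification system_id="op5">
    <Notification>
        <NotificationMonitor service_name_on_error="JobScheduler Errors">
            <NotificationCommand><![CDATA[
                echo job_chain=${MON_N_JOB_CHAIN_NAME}, step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}
            ]]></NotificationCommand>
        </NotificationMonitor>
        <NotificationObjects>
            <!-- Placeholder values: at most 2 notifications, steps 200 to 500 only,
                 step 300 excluded, return codes 1 to 10 only (assumed attribute usage) -->
            <JobChain name="test/my_jobchain"
                      notifications="2"
                      step_from="200" step_to="500"
                      excluded_steps="300"
                      return_code_from="1" return_code_to="10"/>
        </NotificationObjects>
    </Notification>
</SystemMonitorNotification>
```

With a configuration along these lines, an error in step 300, or a third error for the same order state, would fall under the "will not be sent" cases above.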
Recovery messages
- will be automatically sent using the same service name and message as the relevant error message:
  - when the error message of a job chain state (step) has already been sent and the order at the time of the current sos / notification / SystemNotifier run is in another/next state (step)
    - e.g. the rerun of the error state (step) has been successful and the order has been moved to the next job chain state (step)
    - Note: use the ${SERVICE_STATUS} and ${SERVICE_MESSAGE_PREFIX} variables to differentiate between recovery and error messages
- will not be sent:
  - when a job chain state (step) has recovered after the last run of sos / notification / SystemNotifier but at the time of the current sos / notification / SystemNotifier run a new error has occurred in the other/next step
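Because recovery messages reuse the service name and message of the error message, the message text itself has to carry the distinction. A minimal sketch of a command that includes both variables mentioned in the note above; the service name and the echo command are placeholders, and the values the variables expand to are not specified here.

```xml
<NotificationMonitor service_name_on_error="JobScheduler Errors">
    <!-- ${SERVICE_STATUS} and ${SERVICE_MESSAGE_PREFIX} let the receiver
         tell an error message from the matching recovery message -->
    <NotificationCommand><![CDATA[
        echo status=${SERVICE_STATUS}, ${SERVICE_MESSAGE_PREFIX} job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}
    ]]></NotificationCommand>
</NotificationMonitor>
```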
SystemMonitorNotification / Notification / NotificationObjects / Job
Error messages
- will be sent:
  - when a job chain state (step) or standalone job (JobScheduler releases from 1.12) ends with an error
- will not be sent:
  - when a notification maximum has been reached
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job is out of the configured range
Notify on success
SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_success is configured.
SystemMonitorNotification / Notification / NotificationObjects / JobChain
Success messages
- will be sent:
  - when an order is completed and the last job chain state (step) has no error
- will not be sent:
  - when a notification maximum has been reached
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job chain state (step) is out of the configured range
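As an illustration of the success case, the following sketch defines a notification that only reports successful completion of one job chain. The service name, the job chain path and the echo command are placeholders chosen for this example.

```xml
<Notification>
    <!-- Only service_name_on_success is set, so only success messages are sent -->
    <NotificationMonitor service_name_on_success="JobScheduler Success">
        <NotificationCommand><![CDATA[
            echo job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}) completed
        ]]></NotificationCommand>
    </NotificationMonitor>
    <NotificationObjects>
        <!-- notifications="1" sets the notification maximum to one -->
        <JobChain name="test/my_jobchain" notifications="1"/>
    </NotificationObjects>
</Notification>
```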
SystemMonitorNotification / Notification / NotificationObjects / Job
Success messages
- will be sent:
  - when a job chain state (step) or standalone job (JobScheduler releases from 1.12) ends without error
- will not be sent:
  - when a notification maximum has been reached
  - when the @return_code_from or @return_code_to settings have been configured and the return code of the job is out of the configured range
Configuration Editor
We recommend that the XML Editor is used to generate monitoring configuration objects. This editor automatically uses an XSD Schema to generate configuration suggestions and to validate configurations. Its use is intended to significantly reduce the time required to develop and test a configuration.
XSD Schema locations
- https://www.sos-berlin.com/schema/jobscheduler/SystemMonitorNotification_v1.0.xsd
- JobScheduler releases before 1.13.1
  - <scheduler_data>/config/notification/SystemMonitorNotification_v1.0.xsd
- JobScheduler releases starting from 1.13.1
  - <scheduler_data>/config/live/sos/.configuration/notification/SystemMonitorNotification_v1.0.xsd
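When a generic XML tool is used instead of the XML Editor, the schema can usually be referenced from the configuration file itself. The sketch below uses the standard XML Schema instance attribute and assumes that the XSD declares no target namespace and lies in the same directory as the configuration file; neither assumption is confirmed by this article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: schema without target namespace, stored next to this file -->
<SystemMonitorNotification system_id="op5"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="SystemMonitorNotification_v1.0.xsd">
    <!-- Notification and Timer elements go here -->
</SystemMonitorNotification>
```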
Configuration
JobScheduler
Activation of Monitoring Interface
- JobScheduler releases before 1.11
- JobScheduler releases starting from 1.11
  - Set param sos.use_notification true (config/scheduler.xml)
  - see JobScheduler - Job Chains
Note:
- JobScheduler releases before 1.13.1
  - SystemMonitorNotification_<MonitorSystem>.xml file(s) (see below) must be configured before activation.
- JobScheduler releases starting from 1.13.1
  - SystemMonitorNotification_<MonitorSystem>.xml file(s) (see below) must be configured before activation or
  - the NOTIFICATION configuration must have been forwarded by the JOC Cockpit to the respective JobScheduler Master.
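For releases starting from 1.11, the activation parameter might be set in config/scheduler.xml roughly as sketched below. This assumes the usual params / param structure of that file; all other elements and attributes of an existing scheduler.xml are omitted here and must be kept.

```xml
<spooler>
    <config>
        <params>
            <!-- Activates the monitoring interface (parameter name taken from this article) -->
            <param name="sos.use_notification" value="true"/>
        </params>
    </config>
</spooler>
```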
SystemMonitorNotification files
JobScheduler releases before 1.13.1
Location: <scheduler_data>/config/notification
File | Description |
---|---|
SystemMonitorNotification_v1.0.xsd | The XML Schema file defines which values are allowed in your XML files for the JobScheduler monitoring. That means that to configure the JobScheduler objects you want to monitor and the System Monitor you just have to modify your |
SystemMonitorNotification_<MonitorSystem>.xml | Configuration file for each System Monitor.
|
| Configuration file for all System Monitors.
This file is optional and contains the definitions of the |
JobScheduler releases starting from 1.13.1
Note: Usage of the configuration files in the <scheduler_data>/config/notification folder has been deprecated.
Location: <scheduler_data>/config/live/sos/.configuration/notification
File | Description |
---|---|
SystemMonitorNotification_v1.0.xsd | The XML Schema file defines which values are allowed in your XML files for the JobScheduler monitoring. |
notification.xml | Configuration file for System Monitors:
|
SystemMonitorNotification Elements
The configuration element descriptions are organized into the following major categories:
Element | Element description | Description |
---|---|---|
SystemMonitorNotification | Top Level Element | Configuration for notifications to be sent to a system monitor. |
Notification | Required, multiple use allowed inside the SystemMonitorNotification element | Specifies a system monitor notification that includes a command line invocation and the JobScheduler objects. |
Timer | Optional or multiple use allowed inside the SystemMonitorNotification element | Performance measurement definition. |
SystemMonitorNotification
SystemMonitorNotification supports the following attributes:
Attribute | JobScheduler release | Usage | Description |
---|---|---|---|
system_id | before 1.13.1 | required | System Monitor identifier. See JobScheduler - Job Chains customization Note:
|
starting from 1.13.1 | required |
|
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="op5">
...
|
SystemMonitorNotification / Notification
Notification supports the following attributes:
Attribute | Usage | Description |
---|---|---|
name | optional | Notification description |
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationInterface monitor_host="monitor_host" monitor_port="5667" monitor_encryption="XOR" service_host="service_host"><![CDATA[
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step =%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT%
]]></NotificationInterface>
... |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationCommand
NotificationCommand supports the following attributes:
Attribute | Usage | Description |
---|---|---|
plugin | Optional | Default: com.sos.scheduler.notification.plugins.notifier.SystemNotifierProcessBuilderPlugin |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationCommand><![CDATA[
echo scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step =%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT% > D://errors.txt
]]></NotificationCommand>
...
|
SystemMonitorNotification / Notification / NotificationObjects
One of the following elements must be nested inside a NotificationObjects element:
Element | Element description | Description |
---|---|---|
JobChain | Optional, once or more inside of NotificationObjects element | Restricts notifications for job chains |
Timer | Optional, once or more inside of NotificationObjects element | Restricts notifications for performance checks (Timer) |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="OP5">
<Notification>
<NotificationMonitor service_name_on_error="Errors">
...
</NotificationMonitor>
<NotificationObjects>
<!--
Send job chain errors that occur in the "test/my_jobchain" job chain to the "Errors" service.
-->
<JobChain name="test/my_jobchain" />
</NotificationObjects>
</Notification>
</SystemMonitorNotification> |
SystemMonitorNotification / Notification / NotificationObjects / JobChain
JobChain supports the following attributes:
Attribute | Usage | Description |
---|---|---|
notifications | Optional Integer | Specifies the number of notifications that are sent to a System Monitor. Default: |
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that would log into the same database. Regular expression can be used. |
name | Optional | Job chain name including possible folder names. Regular expression can be used. |
step_from | Optional | Restricts notifications for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
step_to | Optional | Restricts notifications for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
excluded_steps | Optional | Specifies the steps that will be excluded from the analysis (separated by semicolons) |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<JobChain notifications="2" name="test/my_jobchain"/>
...
<JobChain scheduler_id="scheduler_4444" />
...
<JobChain scheduler_id="scheduler_4444" name="^(test/my)" />
...
<JobChain name="test/my_jobchain" step_from="200"/>
...
<JobChain name="test/my_jobchain" step_to="500"/>
...
<JobChain name="test/my_jobchain" step_from="300" step_to="300"/>
...
<JobChain name="test/my_jobchain" excluded_steps="200;300"/>
... |
SystemMonitorNotification / Notification / NotificationObjects / Timer
Timer supports the following attributes:
...
Optional
Integer
...
Specifies the number of notifications that are sent to a System Monitor.
Default: 1
...
Mail: on failed job">
...
|
The following elements may be nested inside a Notification element:
Element | Element description | Description |
---|---|---|
NotificationMonitor | Required, only once inside the Notification element | Specifies the System Monitor interface that is being used for messages: either by a Plug-in Interface or by command line invocation |
NotificationObjects | Required, only once inside the Notification element | Specifies the Job Chain and the Timer definitions |
SystemMonitorNotification / Notification / NotificationMonitor
The JobScheduler Interface Monitor can be used to monitor the messages for the following three use cases:
- error case
  - an error has occurred / been recovered during a job chain / job execution
  - the service_name_on_error setting is responsible for this monitoring case
- success case
  - a job chain / job ends successfully
  - the service_name_on_success setting is responsible for this monitoring case
- performance check (see Timer)
  - usually the service_name_on_error setting is responsible for this monitoring case but the performance check will also work if only the service_name_on_success setting has been defined.
In addition, the service_name_on_error / service_name_on_success attributes have the following meaning:
- NotificationInterface: The setting must match the corresponding service name in the System Monitor such as Nagios or op5.
- NotificationCommand: Freely selectable, has no further meaning than to identify a notification.
- NotificationMail: Freely selectable, has no further meaning than to identify a notification.
- NotificationJMS: The setting must match the corresponding queue/topic name in the JMS Server.
Note:
- attributes service_name_on_error and service_name_on_success
  - at least one of these attributes must be configured
  - both attributes can be configured together
  - the use of these settings must be unique within one SystemNotification
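A sketch of a NotificationMonitor that configures both attributes together, so that error/recovery messages and success messages are reported to separate services. The service names and the echo command are placeholders; the status codes follow the NSCA convention quoted in this article (0 = OK, 2 = CRITICAL).

```xml
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors"
                     service_name_on_success="JobScheduler Monitoring Success"
                     service_status_on_error="2"
                     service_status_on_success="0">
    <NotificationCommand><![CDATA[
        echo status=${SERVICE_STATUS}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID})
    ]]></NotificationCommand>
</NotificationMonitor>
```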
NotificationMonitor supports the following attributes:
Attribute | Usage | Description |
---|---|---|
service_name_on_error | Optional | See explanation above. |
service_name_on_success | Optional | See explanation above. |
service_status_on_error | Optional | This setting specifies the service status code for error messages. Default: |
service_status_on_success | Optional | This setting specifies the service status code for success messages Default: |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<!-- Example
op5 NSCA Status:
0 - OK
1 - WARNING
2 - CRITICAL
3 - UNKNOWN -->
...
<!-- Sending errors as CRITICAL (default) -->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
...
<!-- Sending occurred errors as WARNING -->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors" service_status_on_error="1">
... |
One of the following elements must be nested inside a NotificationMonitor element:
Element | Element description | Description |
---|---|---|
NotificationInterface | Optional or only once inside the NotificationMonitor element | NSCA plug-in Interface to be executed for System Monitor notification |
NotificationCommand | Optional or only once inside the NotificationMonitor element | Command line to be executed for System Monitor notification |
NotificationMail | Optional or only once inside the NotificationMonitor element | Mail interface to be executed for System Monitor notification |
NotificationJMS | Optional or only once inside the NotificationMonitor element | JMS interface to be executed for System Monitor notification |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationInterface
NSCA plug-in Interface to be executed for System Monitor notification.
NotificationInterface supports the following attributes:
Attribute | Usage | Description |
---|---|---|
monitor_host | Required | This setting specifies the host name or IP address of the System Monitor host. |
monitor_port | Required | This setting specifies the TCP port that the System Monitor will listen to. |
monitor_password | Optional | This setting specifies the password
|
monitor_connection_timeout | Optional | This setting specifies the connection timeout in ms. Default: |
monitor_response_timeout | Optional | This setting specifies the response timeout in ms. |
monitor_encryption | Optional | This setting specifies that the communication with the System Monitor is encrypted. By default no encryption is used.
|
service_host | Required | This setting specifies the name of the host that executes the passive check. The name must match the corresponding setting in the System Monitor. |
plugin | Optional | Default:
|
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationInterface monitor_host="monitor_host" monitor_port="5667" monitor_encryption="XOR" service_host="service_host"><![CDATA[
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step =${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}
]]></NotificationInterface>
... |
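A variant of the example above that also sets the optional attributes might look as follows. This is a sketch only: the password and both timeout values are placeholders, and the password has to match the one configured on the NSCA side.

```xml
<NotificationInterface monitor_host="monitor_host"
                       monitor_port="5667"
                       monitor_password="secret"
                       monitor_connection_timeout="5000"
                       monitor_response_timeout="5000"
                       monitor_encryption="TRIPLE_DES"
                       service_host="service_host"><![CDATA[
    scheduler id=${MON_N_SCHEDULER_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), error=${MON_N_ERROR_TEXT}
]]></NotificationInterface>
```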
Note | ||
---|---|---|
| ||
In case you are using OpsView as the monitoring tool, the plugin used in Instead, you should use the XML element |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationCommand
Command line to be executed for System Monitor notification.
NotificationCommand supports the following attributes:
Attribute | Usage | Description |
---|---|---|
plugin | Optional | Default:
|
...
Optional
Boolean
Send timer check notification when the configured job chain contains the error notifications.
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="OP5">
<Notification>
<NotificationMonitor service_name_on_error="Errors">
...
</NotificationMonitor>
<NotificationObjects>
<!--
Send job chain errors that occur in the "test/my_jobchain" job chain to the "Errors" service.
-->
<JobChain name="test/my_jobchain" />
</NotificationObjects>
</Notification>
<Notification>
<NotificationMonitor service_name_on_error="Performance">
...
</NotificationMonitor>
<NotificationObjects>
<!--
Send performance check errors that occur in the "test/my_jobchain" job chain to the "Performance" service.
Performance check errors are not sent to the "Performance" service when the "test/my_jobchain" job chain has a job chain error (default notify_on_error = false).
-->
<Timer name="my_timer" />
</NotificationObjects>
</Notification>
<Timer name="my_timer">
<JobChain name="test/my_jobchain" />
</Timer>
</SystemMonitorNotification> |
SystemMonitorNotification / Timer
The following elements must be nested inside a Timer element:
Element | Element description | Description |
---|---|---|
JobChain | Once or more inside of Timer element | Restricts notifications for job chains |
Minimum | Optional or once inside of Timer element | Minimum required time consumption for job or job chain execution. Allows script code to be executed that returns the minimum execution time required in seconds. |
Maximum | Optional or once inside of Timer element | Maximum allowed time consumption for job or job chain execution. Allows script code to be executed that returns the maximum execution time required in seconds. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="OP5">
...
<Timer name="my_timer_1">
<JobChain name="test/my_jobchain_1" />
<Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
<Timer name="my_timer_2">
<JobChain name="test/my_jobchain_2" />
<JobChain name="test/my_jobchain_3" />
<Minimum><Script language="javascript"><![CDATA[500]]></Script></Minimum>
<Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
</SystemMonitorNotification> |
Timer supports the following attributes:
Attribute | Usage | Description |
---|---|---|
name | Required | Correspondence to Timer used in the The name must be unique across all timers definitions. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Timer name="my_timer">
... |
SystemMonitorNotification / Timer / JobChain
JobChain supports the following attributes:
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that would log into the same database. Regular expression can be used. |
name | Optional | Job chain name including possible folder names. Regular expression can be used. |
step_from | Optional | Restricts checks for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
step_to | Optional | Restricts checks for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<JobChain scheduler_id="scheduler_4444" />
...
<JobChain scheduler_id="scheduler_4444" name="^(test/my)" />
...
<JobChain name="test/my_jobchain" step_from="200"/>
...
<JobChain name="test/my_jobchain" step_to="500"/>
...
<JobChain name="test/my_jobchain" step_from="300" step_to="300"/>
... |
SystemMonitorNotification / Timer / Minimum
The following elements must be nested inside a Minimum element:
Element | Element description | Description |
---|---|---|
Script | Once inside of Minimum element | Script code in one of the supported languages |
...
<NotificationCommand><![CDATA[
echo scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step =${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT} > D://errors.txt
]]></NotificationCommand>
...
|
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail
Jira issue: JS-1388
The Mail interface to be executed for System Monitor notification.
The Mail interface reads the following values from the configuration files:
- config/factory.ini
  - Section spooler
    - log_mail_from
    - log_mail_to
    - log_mail_cc
    - log_mail_bcc
    - smtp
    - mail_queue_dir
    - mail_queue_only
  - Section smtp
    - mail.smtp.user
    - mail.smtp.password
    - mail.smtp.port
    - mail.smtp.connectiontimeout
    - mail.smtp.timeout
- config/private/private.conf
  - joc.url
NotificationMail
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
content_type | Optional | Content type of the e-mail. Possible values:
Default: |
charset | Optional | Charset of the e-mail. Default: |
encoding | Optional | Encoding of the e-mail. Possible values:
Default: |
priority | Optional | Priority of the e-mail. Possible values:
Default: |
plugin | Optional | Java class of the plugin implementation (extends Default: |
The following elements can be nested inside a NotificationMail
element:
Element | Element description | Description |
---|---|---|
From | Optional or only once inside of the NotificationMail element | E-mail address of the account that sends e-mail. |
To | Optional or only once inside of the NotificationMail element | E-mail address of the recipient(s) of a notification e-mail. |
CC | Optional or only once inside of the NotificationMail element | E-mail address of the recipient(s) of a carbon copy notification e-mail. |
BCC | Optional or only once inside of the NotificationMail element | E-mail address of recipient(s) of a blind carbon copy notification e-mail. |
Subject | Required, only once inside of the NotificationMail element | Subject of an e-mail notification. |
Body | Required, only once inside of the NotificationMail element | Body of an e-mail notification. |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / From
E-mail address of the account that sends the e-mail.
The mail notification interface uses the value of the log_mail_from entry (configuration file config/factory.ini) when this element is not set.
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / To
E-mail address of the recipient(s) of a notification e-mail.
When this element
- is not set: log_mail_to will be used
- is set: log_mail_to, log_mail_cc, log_mail_bcc are not used
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / CC
E-mail address of the recipient(s) of a carbon copy notification e-mail.
When this element
- is not set: log_mail_cc will be used (if the NotificationMail/To element is not defined - see above)
- is set: log_mail_cc, log_mail_bcc are not used
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / BCC
E-mail address of recipient(s) of a blind carbon copy notification e-mail.
When this element
- is not set: log_mail_bcc will be used (if the NotificationMail/To or NotificationMail/CC elements are not defined - see above)
- is set: log_mail_bcc is not used
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / Subject
Subject of an e-mail notification.
The Subject can contain the JobScheduler Monitoring Interface variables.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Subject><![CDATA[JobScheduler notification: ${SERVICE_MESSAGE_PREFIX}, job executed with errors: ${MON_N_JOB_NAME}]]></Subject>
... |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationMail / Body
Body of an e-mail notification.
The Body can contain the JobScheduler Monitoring Interface variables.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<style>.tg {border-collapse:collapse;border-spacing:0;border-color:#aaa;}.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#333;background-color:#fff;}.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:bold;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#fff;background-color:#f38630;}</style>
<title>JobScheduler Notification</title>
</head>
<body>
<table class="tg">
<tr>
<th colspan="4">Error</th>
</tr>
<tr>
<td>Code:</td><td>${MON_N_ERROR_CODE}</td>
<td>Message</td><td>${MON_N_ERROR_TEXT}</td>
</tr>
<tr>
<th colspan="4">JobScheduler</th>
</tr>
<tr>
<td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
<td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
</tr>
<tr>
<th colspan="4">Order</th>
</tr>
<tr>
<td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
<td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
</tr>
<tr>
<td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
<td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
</tr>
<tr>
<td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
<td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
</tr>
<tr>
<th colspan="4">Task History</th>
</tr>
<tr>
<td>Task ID</td><td>${MON_N_TASK_ID}</td>
<td>Time elapsed</td><td>${MON_N_TASK_TIME_ELAPSED}</td>
</tr>
<tr>
<td>Start Time</td><td>${MON_N_TASK_START_TIME}</td>
<td>End Time</td><td>${MON_N_TASK_END_TIME}</td>
</tr>
</table>
</body>
</html> |
SystemMonitorNotification / Timer / Maximum
The following elements must be nested inside a Maximum
element:
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Timer name="my_timer">
...
<Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
... |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Timer name="my_timer">
...
<Minimum><Script language="javascript"><![CDATA[1000]]></Script></Minimum>
</Timer>
... |
SystemMonitorNotification / Timer / Minimum|Maximum / Script
Script supports the following attributes:
Attribute | Usage | Description |
---|---|---|
language | Required | Script language name Supported languages:
|
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Script language="javascript"><![CDATA[1000]]></Script>
...
<Script language="javascript"><![CDATA[
function calculate(){
    var fileSize = new java.lang.Double(%file_size%);
    var timerExpiryFactor = 0.0025;
    var timerExpiryTolerance = timerExpiryFactor*0.1;
    var timerExpiry = new java.lang.Double(timerExpiryFactor+timerExpiryTolerance);
    timerExpiry = timerExpiry*fileSize;
    return timerExpiry;
}
calculate();
]]></Script>
... |
Message
Usage
The Message can be configured as a CDATA element on the following parent nodes:
SystemMonitorNotification / Notification / NotificationCommand
SystemMonitorNotification / Notification / NotificationInterface
The Message can contain:
- fixed values
- variables
Example: <![CDATA[ scheduler id = %MON_N_SCHEDULER_ID% ]]>
Variables
All variables must be referenced using the %<variable name>% syntax.
Variable values are substituted in the following order:
- Table variables.
- Service variables.
- OS environment variables.
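The substitution order above can be pictured with a short Python sketch. It is not the JobScheduler implementation: the function name `substitute`, the dictionaries passed in and the rule that unknown names are left unchanged are assumptions made for illustration only. Each source is applied in turn, so a name that exists as a table variable is resolved before a service or OS environment variable of the same name is considered.

```python
import os
import re

# %NAME% placeholders, as used in the Message examples of this page.
PLACEHOLDER = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def substitute(message, table_vars, service_vars, env_vars=None):
    """Replace %NAME% placeholders, trying the three sources in order.

    Hypothetical helper: table variables first, then service variables,
    then OS environment variables.
    """
    if env_vars is None:
        env_vars = os.environ
    for source in (table_vars, service_vars, env_vars):
        def replace(match, source=source):
            name = match.group(1)
            # Assumption: names unknown to this source stay untouched,
            # so that a later source can still resolve them.
            return str(source[name]) if name in source else match.group(0)
        message = PLACEHOLDER.sub(replace, message)
    return message


if __name__ == "__main__":
    print(substitute(
        "scheduler id = %MON_N_SCHEDULER_ID%, service name = %SERVICE_NAME%",
        {"MON_N_SCHEDULER_ID": "scheduler_4444"},
        {"SERVICE_NAME": "JobScheduler Errors"},
    ))
```

Passing the environment as an optional argument keeps the sketch easy to try out without touching the real process environment.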
Table variables
Table of the history of steps of processed orders.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
scheduler id = %MON_N_SCHEDULER_ID%, history id = %MON_N_ORDER_HISTORY_ID%, job_chain = %MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), error = %MON_N_ERROR_TEXT% |
Table of the history of notifications sent to the System Monitor.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
step from = %MON_SN_STEP_FROM%, step to = %MON_SN_STEP_TO%, notification = %MON_SN_NOTIFICATIONS% (of %MON_SN_MAX_NOTIFICATIONS%) |
Table of the history of executed checks (Timer).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
timer name = %MON_C_NAME%, text = %MON_C_CHECK_TEXT% |
Service variables
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
service name = %SERVICE_NAME% |
OS environment variables
All existing OS environment variables can be used in a message with the %<variable name>% syntax (Windows and Unix).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
%TEMP%/test.exe |
Examples
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step=%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT% |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), steps(%MON_SN_STEP_FROM% to %MON_SN_STEP_TO%), order time elapsed = %MON_N_ORDER_TIME_ELAPSED%s |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
name = %MON_C_NAME%, scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), steps(%MON_C_STEP_FROM% to %MON_C_STEP_TO%), check = %MON_C_CHECK_TEXT% |
Notification environment variables
The default com.sos.scheduler.notification.plugins.notifier.SystemNotifierProcessBuilderPlugin
plugin used by the SystemMonitorNotification / Notification / NotificationCommand
element sets the following variables as environment variables:
- Service variables
- Table variables
These variables can be used when the NotificationCommand does not call the notification client directly, but calls a shell script that implements the logic for sending the notification messages.
Service variables
Table variables
All table variables (see Table variables above), e.g.:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
1) configured command in the SystemMonitorNotification_<MonitorSystem>.xml file
<NotificationCommand><![CDATA[/tmp/command.sh]]></NotificationCommand>
2) content of the /tmp/command.sh file
#! /bin/sh
# Note: "> /tmp/command_output.txt" used to simulate the starting of the notification client
#
echo $SCHEDULER_MON_SERVICE_NAME:$SCHEDULER_MON_SERVICE_STATUS:$SCHEDULER_MON_SERVICE_MESSAGE_PREFIX history id = $SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID > /tmp/command_output.txt
|
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
1) configured command in the SystemMonitorNotification_<MonitorSystem>.xml file
<NotificationCommand><![CDATA[C:/Temp/command.cmd]]></NotificationCommand>
2) content of the C:/Temp/command.cmd file
rem Note: "> C:/Temp/command_output.txt" used to simulate the starting of the notification client
rem
echo %SCHEDULER_MON_SERVICE_NAME%:%SCHEDULER_MON_SERVICE_STATUS%:%SCHEDULER_MON_SERVICE_MESSAGE_PREFIX% history id = %SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID% > C:/Temp/command_output.txt
|
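The same idea can be sketched in Python for the case where the NotificationCommand calls a script instead of the notification client. This is only an illustration: the function name `build_message` and the choice to treat missing variables as empty strings are assumptions, while the environment variable names are the ones used in the shell examples above.

```python
import os


def build_message(env=None):
    """Compose the notification text from the environment variables
    that the notification plugin sets for the called command."""
    env = os.environ if env is None else env

    def value(name):
        # Assumption: a variable that is not set is treated as empty.
        return env.get(name, "")

    return "{}:{}:{} history id = {}".format(
        value("SCHEDULER_MON_SERVICE_NAME"),
        value("SCHEDULER_MON_SERVICE_STATUS"),
        value("SCHEDULER_MON_SERVICE_MESSAGE_PREFIX"),
        value("SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID"),
    )


if __name__ == "__main__":
    # A real script would hand this text to the notification client;
    # printing it mirrors the redirect used in the shell examples.
    print(build_message())
```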
Examples
Examples OP5
NotificationInterface
Here is an excerpt of an XML file used for notifying a specific System Monitor (OP5 Monitor) using NotificationInterface:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
monitor_host The hostname or ip address of System Monitor host
monitor_port The TCP port that the System Monitor would listen to
monitor_encryption Encryption algorithm
service_host The host that executes the passive check. The name must match the corresponding setting in the System Monitor
%MON_N_SCHEDULER_ID% See explanation "Table variables"
...
-->
<NotificationInterface monitor_host="monitor_host" monitor_port="5667" monitor_encryption="XOR" service_host="service_host"><![CDATA[
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step =%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT%
]]></NotificationInterface>
... |
NotificationCommand
Here is an excerpt of an XML file used for notifying a specific System Monitor (OP5 Monitor) using NotificationCommand on Windows:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
service_host The host that executes the passive check. The name must match the corresponding setting in the System Monitor.
monitor_host The hostname or ip address of System Monitor host.
%SERVICE_NAME% See explanation "Service variables"
%SERVICE_STATUS% See explanation "Service variables"
%SERVICE_MESSAGE_PREFIX% See explanation "Service variables"
%MON_N_SCHEDULER_ID% See explanation "Table variables"
...
NotificationCommand after substitution (error case):
<![CDATA[echo service_host:JobScheduler Errors:2:ERROR scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
NotificationCommand after substitution (recovery case):
<![CDATA[echo service_host:JobScheduler Errors:0:RECOVERED scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
NotificationCommand after substitution (success case):
<![CDATA[echo service_host:JobScheduler Success:0:scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error= | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
-->
<NotificationMonitor service_name_on_error="JobScheduler Errors" service_name_on_success="JobScheduler Success">
<NotificationCommand><![CDATA[echo service_host:%SERVICE_NAME%:%SERVICE_STATUS%:%SERVICE_MESSAGE_PREFIX%scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step=%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT% | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
</NotificationCommand>
</NotificationMonitor>
... |
Examples Zabbix
NotificationCommand
Here is an excerpt of an XML file used for notifying a specific System Monitor (Zabbix Monitor) using NotificationCommand:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
zabbix_sender Zabbix sender installed on the JobScheduler host
localhost Hostname of the Zabbix server
zabbix_server JobScheduler Agent name (host name) that is registered on Zabbix
samples.job1 Item key in Zabbix (replace "/" with "." in JOB_NAME)
%MON_N_ERROR_TEXT% See explanation "Table variables"
-->
<NotificationCommand>
<![CDATA[zabbix_sender -z localhost -s zabbix_server -k samples.job1 -o %MON_N_ERROR_TEXT%]]>
</NotificationCommand>
... |
JobScheduler - Job Chains
Job Chains for this solution have to be placed under \live\notification. Four Job Chains were implemented for this solution and they have the following functions:
- CheckHistory: reads the JobScheduler database tables where the logging is placed, analyses them and writes the results into other tables, the Notification tables.
- CleanupNotifications: deletes entries in the Notification tables. Currently this takes place once every day.
- ResetNotifications: sets the Status for Notifications in the Notification tables (e.g. Acknowledge).
- SystemNotifier: responsible for notifying the System Monitor about the current notifications. Moreover, this Job Chain is responsible for updating the Notification tables after having notified the System Monitor.
System Monitor
- The System Monitor receives just passive checks, that means, there are no active checks for monitoring JobScheduler. The only configuration here is the capability to receive passive checks from a remote host.
- The services in the System Monitor have to be consistent with the JobScheduler configuration. Passive checks (services) have to be configured and named following the convention used in the XML files described above for the JobScheduler (CheckHistoryConfiguration.xml and SystemMonitorNotification_op5.xml).
Use Cases
Recoverable Errors
Initial Situation: A Job Chain is triggered by directory monitoring. That is, when a certain file arrives in a monitored folder, the Job Chain starts.
Problem: The Job Chain ended with an error.
Handling: The System Monitor will be notified with the error message for the service related to the Job Chain. A new execution of the Job Chain for a new file that ends without errors does not mean that the error is recovered, since the file that has been processed is a different one. That is, the error message at the System Monitor will stay until the same file is placed in the monitored directory again and the Job Chain ends without errors.
Configuration:
- XML CheckConfigurationHistory.xml: Indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
- XML SystemMonitorNotification.xml: Specify the name of the Service (in the System Monitor) and specify that it is a service_name_on_error, since you want to have control when the Job Chain ends in an error.
- System Monitor: Services in the System Monitor have to be configured and named the same way as in the XML file SystemMonitorNotification.xml above.
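The recovery rule described under Handling can be illustrated with a small Python sketch. It is a simplified model, not the JobScheduler logic: the function name `open_errors` and the assumption that an execution is identified by the name of the processed file are made up for this example.

```python
def open_errors(history):
    """Return the files whose error has not been recovered yet.

    history: iterable of (file_name, ended_with_error) tuples in the
    order in which the Job Chain executions took place (hypothetical shape).
    """
    errors = set()
    for file_name, ended_with_error in history:
        if ended_with_error:
            errors.add(file_name)
        else:
            # Only a successful run for the same file recovers the error.
            errors.discard(file_name)
    return errors


if __name__ == "__main__":
    runs = [("a.csv", True), ("b.csv", False)]
    print(open_errors(runs))                       # a.csv is still open
    print(open_errors(runs + [("a.csv", False)]))  # nothing left open
```

A successful run for another file leaves the error untouched, which matches the statement that the message stays at the System Monitor until the same file is processed without errors.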
Workflow Execution takes too long
Initial Situation: A Job Chain is triggered but cannot end: it hangs in a step and therefore takes longer than expected.
Problem: The execution time was too long.
Handling: A timer is set for this Job Chain and the System Monitor will be notified about it. The expiration times for the Job Chains are configured with enough time for processing, which means that this is usually used for cases where the Job Chain hangs in a specific step.
Configuration:
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS
Jira issue: JITL-280
JMS Interface to be executed for System Monitor notification.
Note: the provider-specific queue or topic name will be defined with the service_name_on_error / service_name_on_success attribute of the parent SystemMonitorNotification / Notification / NotificationMonitor element.
NotificationJMS
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
client_id | Optional | The client identifier for this connection. |
destination | Optional | A Possible values:
See: Destination Default: |
acknowledge_mode | Optional | Session acknowledgment mode. Possible values:
See: Session Default: |
delivery_mode | Optional | Delivery mode. Possible values:
See: Default: |
priority | Optional | The producer's default priority. See: MessageProducer.setPriority Default: |
time_to_live | Optional | Sets the default length of time in milliseconds from its dispatch time that a produced message should be retained by the message system. See: MessageProducer.setTimeToLive Possible values:
Default: |
plugin | Optional | Java class of the plugin implementation (extends Default: |
One of the following elements must be nested inside a NotificationJMS
element:
Element | Element description | Description |
---|---|---|
ConnectionFactory | Optional or only once inside the NotificationJMS element | Specifies use of a JMS ConnectionFactory implementation |
ConnectionJNDI | Optional or only once inside the NotificationJMS element | Specifies use of a JNDI properties file to create a JNDI InitialContextFactory |
JMS message:
Element | Element description | Description |
---|---|---|
Message | Required, only once inside of NotificationJMS element | Body of a JMS notification |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionFactory
Specifies use of a JMS ConnectionFactory implementation.
ConnectionFactory
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
java_class | Required | Java class of the JMS ConnectionFactory e.g.: |
user_name | Optional | The caller's user name |
password | Optional | The caller's password |
The following element can be nested inside a ConnectionFactory
element:
Element | Element description | Description |
---|---|---|
ConstructorArguments | Optional or only once inside of ConnectionFactory element |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionFactory / ConstructorArguments
The following elements can be nested inside a ConstructorArguments
element:
Element | Element description | Description |
---|---|---|
Argument | Required, multiple use allowed inside the ConstructorArguments element | JMS ConnectionFactory constructor argument |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<ConnectionFactory java_class="org.apache.activemq.ActiveMQConnectionFactory">
<ConstructorArguments>
<Argument type="java.lang.String"><![CDATA[tcp://localhost:61616]]></Argument>
</ConstructorArguments>
</ConnectionFactory>
...
|
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<ConnectionFactory java_class="org.apache.activemq.ActiveMQConnectionFactory">
<ConstructorArguments>
<Argument type="java.lang.String"><![CDATA[my_user_name]]></Argument>
<Argument type="java.lang.String"><![CDATA[my_password]]></Argument>
<Argument type="java.lang.String"><![CDATA[tcp://localhost:61616]]></Argument>
</ConstructorArguments>
</ConnectionFactory>
...
|
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionFactory / ConstructorArguments / Argument
Argument
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
type | Required | Java type of a constructor argument. Possible values:
Default: |
The value of the constructor argument is stored as a CDATA node.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Argument type="java.lang.String"><![CDATA[tcp://localhost:61616]]></Argument>
...
|
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / ConnectionJNDI
Specifies use of a JNDI properties file to create a JNDI InitialContextFactory.
See: Connecting to the JMS Server by Using JNDI
ConnectionJNDI
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
file | Required | Path of the JNDI properties file |
lookup_name | Optional | Name to lookup JMS connection factory objects Default: |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
java.naming.factory.initial=org.apache.activemq.jndi.ActiveMQInitialContextFactory
java.naming.provider.url=tcp://localhost:61616 |
SystemMonitorNotification / Notification / NotificationMonitor / NotificationJMS / Message
Body of a JMS notification.
SystemMonitorNotification / Notification / NotificationObjects
One of the following elements must be nested inside a NotificationObjects
element:
Element | Element description | Description |
---|---|---|
Job | Optional or multiple use allowed inside the NotificationObjects element | Restricts notifications for order or standalone jobs |
JobChain | Optional or multiple use allowed inside the NotificationObjects element | Restricts notifications for job chains |
TimerRef | Optional or multiple use allowed inside the NotificationObjects element | Restricts notifications for performance checks (Timer) |
MasterMessage | Optional or only once inside the NotificationObjects element | Includes problems detected by a JobScheduler Master, e.g. database connection lost. |
TaskWarning | Optional or only once inside the NotificationObjects element | Includes job execution warning messages. |
TaskIfLongerThan | Optional or only once inside the NotificationObjects element | Includes the feature to send a notification in case the execution of a job takes longer than expected. |
TaskIfShorterThan | Optional or only once inside the NotificationObjects element | Includes the feature to send a notification in case the execution of a job takes less time than expected. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="op5">
<Notification>
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
...
</NotificationMonitor>
<NotificationObjects>
<!-- Send the job error, occurring in the "test/my_job" order job, to the "JobScheduler Monitoring Errors" service. -->
<Job name="test/my_job" />
<!-- Send the job chain error, occurring in the "test/my_jobchain" job chain, to the "JobScheduler Monitoring Errors" service. -->
<JobChain name="test/my_jobchain" />
</NotificationObjects>
</Notification>
</SystemMonitorNotification> |
SystemMonitorNotification / Notification / NotificationObjects / Job
This element specifies the order-controlled or standalone jobs for which notifications are being sent to a System Monitor.
Support for standalone
jobs starting with
Jira | ||||||||
---|---|---|---|---|---|---|---|---|
|
Job
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
notifications | Optional Integer | Specifies how many times the same notification is transferred to a System Monitor. Default: |
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expressions can be used. |
name | Optional | Job name including possible folder names. Regular expressions can be used. |
return_code_from | Optional | Restricts notifications for jobs for a particular return code range. |
return_code_to | Optional | Restricts notifications for jobs for a particular return code range. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Job notifications="2" name="test/my_job"/>
...
<Job scheduler_id="scheduler_4444" />
...
<Job scheduler_id="scheduler_4444" name="test/my_.*" />
...
<Job name="test/my_job" return_code_from="5"/>
...
<Job name="test/my_job" return_code_to="10"/>
...
<Job name="test/my_job" return_code_from="5" return_code_to="5"/>
...
|
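How the Job attributes narrow down the notifications can be sketched in Python. This is an assumption-laden illustration rather than the product's matching code: the function name `job_matches`, the dictionary shape of `job`, the use of `re.search` for the regular expressions and the inclusive return code range are all choices made for this example.

```python
import re


def job_matches(job, scheduler_id=None, name=None,
                return_code_from=None, return_code_to=None):
    """Check a finished job against the attributes of a Job element.

    job: dict with 'scheduler_id', 'name' and 'return_code' keys
    (hypothetical shape). Attributes that are None do not restrict.
    """
    if scheduler_id is not None and not re.search(scheduler_id, job["scheduler_id"]):
        return False
    if name is not None and not re.search(name, job["name"]):
        return False
    # Assumption: both ends of the return code range are inclusive.
    if return_code_from is not None and job["return_code"] < return_code_from:
        return False
    if return_code_to is not None and job["return_code"] > return_code_to:
        return False
    return True


if __name__ == "__main__":
    job = {"scheduler_id": "scheduler_4444", "name": "test/my_job", "return_code": 5}
    print(job_matches(job, name="test/my_.*", return_code_from=5, return_code_to=5))
```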
SystemMonitorNotification / Notification / NotificationObjects / JobChain
Specifies the job chains for which notifications are being sent to a system monitor.
The element can be used repeatedly to specify a number of job chains.
Default behaviour for repeatedly failed job chain steps: when an error recurs in the same job node for which a notification has already been sent, this order state is considered as previously notified and no new notification will be sent.
See child element NotifyRepeatedError
.
JobChain
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
notifications | Optional Integer | Specifies how many times the same notification is transferred to a System Monitor. Default: |
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expressions can be used. |
name | Optional | Job chain name including possible folder names. Regular expressions can be used. |
return_code_from | Optional | Restricts notifications for job chains for a particular return code range. |
return_code_to | Optional | Restricts notifications for job chains for a particular return code range. |
step_from | Optional | Restricts notifications for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
step_to | Optional | Restricts notifications for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
excluded_steps | Optional | Specifies the steps that will be excluded from the analysis (separated by semicolon) |
The following element can be nested inside a JobChain
element:
Element | Element description | Description |
---|---|---|
NotifyRepeatedError | Optional or only once inside the JobChain element | Send notifications for all errors that occur, do not suppress errors for repeatedly failed executions. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<JobChain notifications="2" name="test/my_jobchain"/>
...
<JobChain scheduler_id="scheduler_4444" />
...
<JobChain scheduler_id="scheduler_4444" name="test/my_.*" />
...
<JobChain name="test/my_jobchain" return_code_from="5"/>
...
<JobChain name="test/my_jobchain" return_code_to="10"/>
...
<JobChain name="test/my_jobchain" return_code_from="5" return_code_to="5"/>
...
<JobChain name="test/my_jobchain" step_from="200"/>
...
<JobChain name="test/my_jobchain" step_to="500"/>
...
<JobChain name="test/my_jobchain" step_from="300" step_to="300"/>
...
<JobChain name="test/my_jobchain" excluded_steps="200;300"/>
...
<JobChain name="test/my_jobchain">
<NotifyRepeatedError />
</JobChain>
... |
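The effect of step_from, step_to and excluded_steps can be pictured with the following Python sketch. It assumes that the job nodes of a chain are available as an ordered list of state names and that both ends of the range are included; the function name `steps_to_check` and these assumptions are illustrative and not taken from the product.

```python
def steps_to_check(chain_steps, step_from=None, step_to=None, excluded_steps=""):
    """Return the job nodes that remain relevant for notifications.

    chain_steps: ordered list of node states of the job chain (hypothetical input).
    excluded_steps: semicolon-separated states, as in the attribute value.
    """
    first = chain_steps.index(step_from) if step_from is not None else 0
    last = chain_steps.index(step_to) if step_to is not None else len(chain_steps) - 1
    excluded = {step for step in excluded_steps.split(";") if step}
    # Assumption: step_from and step_to themselves belong to the range.
    return [step for step in chain_steps[first:last + 1] if step not in excluded]


if __name__ == "__main__":
    steps = ["100", "200", "300", "400", "500", "600"]
    print(steps_to_check(steps, step_from="200"))
    print(steps_to_check(steps, excluded_steps="200;300"))
```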
SystemMonitorNotification / Notification / NotificationObjects / JobChain / NotifyRepeatedError
Send notifications for all errors that occur, do not suppress errors for repeatedly failed executions.
One of the following elements can be nested inside a NotifyRepeatedError
element:
Element | Element description | Description |
---|---|---|
NotifyByIntervention | Optional or only once inside the NotifyRepeatedError element | Send notifications for errors that occur due to repeated failed executions if the restart was caused by manual intervention. |
NotifyByPeriod | Optional or only once inside the NotifyRepeatedError element | Send notifications for errors that occur due to repeatedly failed executions if a configurable period of time is exceeded. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<JobChain name="test/my_jobchain">
<NotifyRepeatedError>
<NotifyByIntervention />
</NotifyRepeatedError>
</JobChain>
...
<JobChain name="test/my_jobchain">
<NotifyRepeatedError>
<NotifyByPeriod period="5h 30m" />
</NotifyRepeatedError>
</JobChain>
...
<JobChain name="test/my_jobchain">
<NotifyRepeatedError>
<NotifyByIntervention />
<NotifyByPeriod period="2h" />
</NotifyRepeatedError>
</JobChain>
... |
SystemMonitorNotification / Notification / NotificationObjects / JobChain / NotifyRepeatedError / NotifyByIntervention
Send notifications for errors that occur due to repeated failed executions if the restart was caused by manual intervention.
SystemMonitorNotification / Notification / NotificationObjects / JobChain / NotifyRepeatedError / NotifyByPeriod
Send notifications for errors that occur due to repeatedly failed executions if a configurable period of time is exceeded.
NotifyByPeriod
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
period | Required | The period between notifications is calculated from the time of the last failed execution for which a notification has been sent and the time of the current failed execution. Possible values:
|
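The period check can be sketched in Python as follows. The list of possible period values is not reproduced here, so the sketch only handles the "h" and "m" units that appear in the examples above; the function names `parse_period` and `should_notify_again` are hypothetical, and treating "exceeded" as strictly greater is an assumption.

```python
import re
from datetime import datetime, timedelta

# Only the units seen in the examples on this page are handled (assumption).
_UNIT_KEYWORDS = {"h": "hours", "m": "minutes"}


def parse_period(text):
    """Turn a period such as "5h 30m" into a timedelta."""
    parts = text.split()
    if not parts:
        raise ValueError("empty period")
    total = timedelta()
    for part in parts:
        match = re.fullmatch(r"(\d+)([hm])", part)
        if match is None:
            raise ValueError("unsupported period part: " + part)
        amount, unit = int(match.group(1)), match.group(2)
        total += timedelta(**{_UNIT_KEYWORDS[unit]: amount})
    return total


def should_notify_again(last_notified_failure, current_failure, period):
    """True if the time between the last notified failed execution and
    the current failed execution exceeds the configured period."""
    return current_failure - last_notified_failure > parse_period(period)


if __name__ == "__main__":
    last = datetime(2020, 1, 1, 0, 0)
    print(should_notify_again(last, datetime(2020, 1, 1, 6, 0), "5h 30m"))
```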
SystemMonitorNotification / Notification / NotificationObjects / TimerRef
TimerRef
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
notifications | Optional Integer | Specifies how many times the same notification is transferred to a System Monitor. Default: |
ref | Required | Corresponds to the Timer name setting defined in the SystemMonitorNotification / Timer element |
notify_on_error | Optional Boolean | Send the timer check notification even if error notifications exist for the configured job chain. Default: |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="op5">
<Notification>
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Error">
...
</NotificationMonitor>
<NotificationObjects>
<!--
Sends job chain errors that occur in the "test/my_jobchain" job chain to the "JobScheduler Monitoring Error" service.
-->
<JobChain name="test/my_jobchain" />
</NotificationObjects>
</Notification>
<Notification>
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Performance">
...
</NotificationMonitor>
<NotificationObjects>
<!--
Sends performance check errors that occur in the "test/my_jobchain" job chain to the "JobScheduler Monitoring Performance" service.
The performance check error is not sent to the "JobScheduler Monitoring Performance" service if "test/my_jobchain" has a job chain error (default: notify_on_error = false).
-->
<TimerRef ref="my_timer" />
</NotificationObjects>
</Notification>
<Timer name="my_timer">
<TimerJobChain name="test/my_jobchain" />
</Timer>
</SystemMonitorNotification> |
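The effect of notify_on_error described in the comment above can be written as a one-line decision. The sketch below is illustrative only; the function name `sendTimerNotification` is hypothetical and not part of JobScheduler.

```javascript
// Hypothetical sketch: decides whether a timer (performance check)
// notification is sent for a job chain.
// jobChainHasError: the job chain already produced an error notification.
// notifyOnError:    value of the notify_on_error attribute (default false).
function sendTimerNotification(jobChainHasError, notifyOnError = false) {
  // Without a job chain error the timer notification is always sent;
  // with an error it is sent only if notify_on_error is true.
  return !jobChainHasError || notifyOnError;
}
```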
SystemMonitorNotification / Notification / NotificationObjects / MasterMessage
Jira issue: JS-1837
MasterMessage
includes problems detected by a JobScheduler Master, e.g. database connection lost.
Requirements:
- ./config/factory.ini configuration file:
  - mail_queue_only=true
  - the mail_queue_dir setting specifies a directory to store the JobScheduler mails
  - mail_on_warning=true
  - mail_on_error=true
MasterMessage
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expression can be used. |
| Optional Integer | Specifies how many times the same notification is sent to a System Monitor. Default: |
SystemMonitorNotification / Notification / NotificationObjects / TaskWarning
Jira issue: JS-1837
TaskWarning
includes job execution warning messages.
Requirements:
- ./config/factory.ini configuration file:
  - mail_queue_only=true
  - the mail_queue_dir setting specifies a directory to store the JobScheduler mails
  - mail_on_warning=true
TaskWarning
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expression can be used. |
| Optional Integer | Specifies how many times the same notification is sent to a System Monitor. Default: |
SystemMonitorNotification / Notification / NotificationObjects / TaskIfLongerThan
Jira issue: JITL-522
TaskIfLongerThan
sends a notification if the execution of a job takes longer than expected.
Requirements:
- The Job configuration contains the warn_if_longer_than setting.
- ./config/factory.ini configuration file:
  - mail_queue_only=true
  - the mail_queue_dir setting specifies a directory to store the JobScheduler mails
  - mail_on_warning=true
TaskIfLongerThan
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expression can be used. |
| Optional Integer | Specifies how many times the same notification is sent to a System Monitor. Default: |
SystemMonitorNotification / Notification / NotificationObjects / TaskIfShorterThan
Jira issue: JITL-522
TaskIfShorterThan
sends a notification if the execution of a job takes less time than expected.
Requirements:
- The Job configuration contains the warn_if_shorter_than setting.
- ./config/factory.ini configuration file:
  - mail_queue_only=true
  - the mail_queue_dir setting specifies a directory to store the JobScheduler mails
  - mail_on_warning=true
TaskIfShorterThan
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expression can be used. |
| Optional Integer | Specifies how many times the same notification is sent to a System Monitor. Default: |
SystemMonitorNotification / Timer
The following elements must be nested inside a Timer
element:
Element | Element description | Description | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
TimerJob | Optional or multiple use allowed inside the Timer element | Restricts notifications for
| ||||||||||
TimerJobChain | Optional or multiple use allowed inside the Timer element | Restricts notifications for job chains | ||||||||||
Minimum | Optional or only once inside the Timer element | Minimum required execution time for job chains or selected job nodes. Allows script code to be executed that returns the minimum execution time in seconds. | ||||||||||
Maximum | Optional or only once inside the Timer element | Maximum allowed execution time for job chains or selected job nodes. Allows script code to be executed that returns the maximum execution time in seconds. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<SystemMonitorNotification system_id="op5">
...
<Timer name="my_timer_1">
<TimerJobChain name="test/my_jobchain_1" />
<TimerJob name="test/my_job_1" />
<Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
<Timer name="my_timer_2">
<TimerJobChain name="test/my_jobchain_2" />
<TimerJobChain name="test/my_jobchain_3" />
<Minimum><Script language="javascript"><![CDATA[500]]></Script></Minimum>
<Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
</SystemMonitorNotification> |
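How the Minimum and Maximum values of a Timer relate to a measured execution time can be sketched as follows. The function `checkTimer` and its return texts are hypothetical, and treating the limits as exclusive bounds is an assumption, not documented behavior.

```javascript
// Hypothetical sketch: compares an elapsed execution time (in seconds) with
// the optional minimum and maximum values of a timer definition.
// Assumption: a value exactly equal to a limit counts as within the limit.
function checkTimer(elapsedSeconds, timer) {
  if (timer.minimum !== undefined && elapsedSeconds < timer.minimum) {
    return "shorter than minimum";
  }
  if (timer.maximum !== undefined && elapsedSeconds > timer.maximum) {
    return "longer than maximum";
  }
  return "ok";
}

// Values taken from the "my_timer_1" and "my_timer_2" examples above.
const myTimer1 = { maximum: 1000 };
const myTimer2 = { minimum: 500, maximum: 1000 };
```

For example, `checkTimer(400, myTimer2)` reports a run that is shorter than the minimum, while the same run passes `myTimer1`, which defines no minimum.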
Timer
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
name | Required | Corresponds to the ref attribute of the SystemMonitorNotification / Notification / NotificationObjects / TimerRef element. The name must be unique across all timer definitions. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Timer name="my_timer">
... |
SystemMonitorNotification / Timer / TimerJob
Jira issue: JITL-401
TimerJob
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expression can be used. |
name | Optional | Job name including possible folder names. Regular expression can be used. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<TimerJob scheduler_id="scheduler_4444" />
...
<TimerJob scheduler_id="scheduler_4444" name="test/my_.*" />
...
<TimerJob name="test/my_job"/>
...
|
SystemMonitorNotification / Timer / TimerJobChain
TimerJobChain
supports the following attributes:
Attribute | Usage | Description | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
scheduler_id | Optional | Notifications are restricted to the JobScheduler instance with the given identification. By default notifications will be sent for all JobScheduler instances that log into the same database. Regular expression can be used. | ||||||||||
name | Optional | Job chain name including possible folder names. Regular expression can be used.
| ||||||||||
step_from | Optional | Restricts checks for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. | ||||||||||
step_to | Optional | Restricts checks for job chains to a sequence of job nodes that are specified with the step_from and step_to attributes. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<TimerJobChain scheduler_id="scheduler_4444" />
...
<TimerJobChain scheduler_id="scheduler_4444" name="test/my_.*" />
...
<TimerJobChain name="test/my_jobchain" step_from="200"/>
...
<TimerJobChain name="test/my_jobchain" step_to="500"/>
...
<TimerJobChain name="test/my_jobchain" step_from="300" step_to="300"/>
... |
SystemMonitorNotification / Timer / Minimum
The following elements must be nested inside a Minimum
element:
Element | Element description | Description |
---|---|---|
Script | Required, only once inside the Minimum element | Script code in one of the supported languages |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Timer name="my_timer">
...
<Minimum><Script language="javascript"><![CDATA[1000]]></Script></Minimum>
</Timer>
... |
SystemMonitorNotification / Timer / Maximum
The following elements must be nested inside a Maximum
element:
Element | Element description | Description |
---|---|---|
Script | Required, only once inside the Maximum element | Script code in one of the supported languages |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Timer name="my_timer">
...
<Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
... |
SystemMonitorNotification / Timer / Minimum|Maximum / Script
Script
supports the following attributes:
Attribute | Usage | Description |
---|---|---|
language | Required | Script language name Supported languages:
|
The Script element can contain:
- a fixed value
- a calculation based on the job/order parameters
Fixed value
A fixed value is the time allowed in seconds for the specific Minimum
or Maximum
definition.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Script language="javascript"><![CDATA[1000]]></Script>
... |
Calculation
The calculation must return the time in seconds for the specific Minimum
or Maximum
definition.
This example calculates the execution time depending on the ${file_size}
parameter that was set by a specific job (see the example job below).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<Script language="javascript"><![CDATA[
function my_calculate(){
var fileSize = new java.lang.Double(${file_size});
var timerExpiryFactor = 0.0025;
var timerExpiryTolerance = timerExpiryFactor*0.1;
var timerExpiry = new java.lang.Double(timerExpiryFactor+timerExpiryTolerance);
timerExpiry = timerExpiry*fileSize;
return timerExpiry;
}
my_calculate();
]]></Script>
... |
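To see what the calculation above yields, the same arithmetic can be run as plain JavaScript. In this sketch the ${file_size} placeholder is replaced by a function argument and the `java.lang.Double` wrappers are dropped; the function name `calculateTimerExpiry` is hypothetical.

```javascript
// Plain JavaScript version of the calculation, for illustration only.
// fileSize stands in for the substituted file_size parameter.
function calculateTimerExpiry(fileSize) {
  const timerExpiryFactor = 0.0025;                      // seconds per unit of file size
  const timerExpiryTolerance = timerExpiryFactor * 0.1;  // 10% tolerance on the factor
  return (timerExpiryFactor + timerExpiryTolerance) * fileSize;
}

// A file_size of 1000 gives roughly 2.75 seconds.
const expirySeconds = calculateTimerExpiry(1000);
```

The result is the factor plus its 10% tolerance (0.00275) multiplied by the file size, so the allowed time grows linearly with the value of the parameter.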
This example job calculates and creates a new order parameter file_size.
To store the parameters in the database (table SCHEDULER_MON_RESULTS):
- set the scheduler_notification_result_parameters parameter (see job documentation jobs/JobSchedulerNotificationStoreResultsJob.xml)
- set the StoreResultsJobJSAdapterClass as monitor
  - JobScheduler releases before 1.11: com.sos.scheduler.notification.jobs.result.StoreResultsJobJSAdapterClass
  - JobScheduler releases starting from 1.11: com.sos.jitl.notification.jobs.result.StoreResultsJobJSAdapterClass
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<?xml version="1.0" encoding="ISO-8859-1"?>
<job title="Sample Job with Store Result Monitor" order="yes" stop_on_error="no" tasks="1">
<params>
<!--
set the scheduler_notification_result_parameters parameter
-->
<param name="scheduler_notification_result_parameters" value="file_size"/>
</params>
<!--
calculate and create the new order parameter if necessary
-->
<script language="java:javascript"><![CDATA[
function spooler_process(){
var order = spooler_task.order;
var params = spooler.create_variable_set();
params.merge(spooler_task.params);
params.merge(order.params);
// parameter scheduler_file_path was set in the previous job chain step
var file = new java.io.File(params.value("scheduler_file_path"));
var fileSize = file.length()/1024;
order.params.set_var("file_size",fileSize.toString());
return true;
}]]>
</script>
<!--
set the StoreResultsJobJSAdapterClass as a monitor
-->
<monitor name="notification_monitor" ordering="1">
<!-- JobScheduler releases before 1.11
<script java_class="com.sos.scheduler.notification.jobs.result.StoreResultsJobJSAdapterClass" language="java"/>
-->
<!-- JobScheduler releases starting from 1.11 -->
<script java_class="com.sos.jitl.notification.jobs.result.StoreResultsJobJSAdapterClass" language="java"/>
</monitor>
<run_time />
</job> |
Message
Usage
The Message can be configured on the following parent nodes as a CDATA element:
- SystemMonitorNotification / Notification / NotificationInterface
- SystemMonitorNotification / Notification / NotificationCommand
- SystemMonitorNotification / Notification / NotificationMail
  - Subject
  - Body
- SystemMonitorNotification / Notification / NotificationJMS / Message
The Message can contain:
- Fixed values
- Variables
Example: <![CDATA[ scheduler id = ${MON_N_SCHEDULER_ID} ]]>
Variables
All variables (except OS environment variables) must be written using the
${<variable name>}
syntax.
Note:
- This syntax applies to JobScheduler version 1.10.6 and higher. The syntax for versions 1.10.4 and 1.10.5 (see below) is still supported.
- Syntax for JobScheduler versions 1.10.4 and 1.10.5: {<variable name>}
- Syntax for earlier JobScheduler versions:
%<variable name>%
Variable values are substituted in the following order:
- Table variables.
- Service variables.
- JOC Cockpit variables.
- OS environment variables.
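The substitution order can be illustrated with a small sketch. It is an assumption-laden illustration rather than the JobScheduler code: the function `substitute` is hypothetical, each variable group is modelled as a plain object, and OS environment variables are left out because they use a different syntax.

```javascript
// Hypothetical sketch of layered substitution of ${<variable name>} placeholders.
// The sources are applied in the given order, so a name defined in an earlier
// source is resolved before later sources are consulted.
function substitute(message, sources) {
  let result = message;
  for (const source of sources) {
    result = result.replace(/\$\{([^}]+)\}/g, (placeholder, name) =>
      Object.prototype.hasOwnProperty.call(source, name)
        ? String(source[name])
        : placeholder // unknown names are left for a later source
    );
  }
  return result;
}

// Sample values (illustrative only).
const tableVariables = { MON_N_SCHEDULER_ID: "scheduler_4444" };
const serviceVariables = { SERVICE_NAME: "JobScheduler Monitoring Errors" };
const text = substitute(
  "scheduler id = ${MON_N_SCHEDULER_ID}, service name = ${SERVICE_NAME}",
  [tableVariables, serviceVariables]
);
```

Placeholders that no source defines stay in the message unchanged in this sketch; how JobScheduler treats them is not covered here.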
Table variables
Table of the history of steps of processed orders / jobs.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
scheduler id = ${MON_N_SCHEDULER_ID}, history id = ${MON_N_ORDER_HISTORY_ID}, job_chain = ${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), error = ${MON_N_ERROR_TEXT} |
Table of the history of notifications sent to a system monitor.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
step from = ${MON_SN_STEP_FROM}, step to = ${MON_SN_STEP_TO}, notification = ${MON_SN_CURRENT_NOTIFICATION} (of ${MON_SN_NOTIFICATIONS}) |
Table of the history of executed checks (Timer).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
timer name = ${MON_C_NAME}, text = ${MON_C_CHECK_TEXT} |
Service variables
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
service name = ${SERVICE_NAME} |
JOC Cockpit variables
Jira issue: JS-1388
Note:
- The JOC Cockpit variables are substituted only when the NotificationMail interface is used.
Requirement:
- The config/private/private.conf configuration file is active and contains a configured joc.url entry.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a>
<a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a>
<a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a>
|
OS environment variables
All existing OS environment variables can be used in a message with the syntax %<variable name>% (Windows) or $<variable name> (Unix).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
%TEMP%/test.exe |
Notification environment variables
The default SystemNotifierProcessBuilderPlugin
plugin used by the SystemMonitorNotification / Notification / NotificationCommand
element sets the following variables as environment variables:
Table variables
Service variables
These variables can be used when the NotificationCommand does not call the notification client directly but via a shell script that implements the logic for sending the notification messages.
Table variables
All table variables (see Table variables
explanation) are set as environment variables with the prefix:
SCHEDULER_MON_TABLE_
e.g.:
SCHEDULER_MON_TABLE_MON_N_ID
SCHEDULER_MON_TABLE_MON_N_SCHEDULER_ID
...
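The naming rule for these environment variables amounts to a simple prefixing step, sketched below. The function `toEnvironment` is hypothetical and the sample values are invented for illustration.

```javascript
// Hypothetical sketch: maps table variables to environment variable names
// by adding the SCHEDULER_MON_TABLE_ prefix described above.
function toEnvironment(tableVariables) {
  const env = {};
  for (const [name, value] of Object.entries(tableVariables)) {
    env["SCHEDULER_MON_TABLE_" + name] = String(value);
  }
  return env;
}

// Sample values (illustrative only).
const env = toEnvironment({ MON_N_ID: 1, MON_N_SCHEDULER_ID: "scheduler_4444" });
```

A shell script called by the NotificationCommand would then read, for example, `SCHEDULER_MON_TABLE_MON_N_SCHEDULER_ID` from its environment.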
Service variables
Name | Description |
---|---|
| Current service name. One of the two element attributes:
|
| Current service status. One of the two element attributes or the default:
|
|
|
| Content of the SystemMonitorNotification / Notification / NotificationCommand after substitution |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
1) configured command in the SystemMonitorNotification_<MonitorSystem>.xml file
<NotificationCommand><![CDATA[/tmp/command.sh]]></NotificationCommand>
2) content of the /tmp/command.sh file
#! /bin/sh
# Note: "> /tmp/command_output.txt" is used to simulate the starting of the notification client
#
echo "$SCHEDULER_MON_SERVICE_NAME:$SCHEDULER_MON_SERVICE_STATUS:$SCHEDULER_MON_SERVICE_MESSAGE_PREFIX history id = $SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID" > /tmp/command_output.txt
|
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
1) configured command in the SystemMonitorNotification_<MonitorSystem>.xml file
<NotificationCommand><![CDATA[C:/Temp/command.cmd]]></NotificationCommand>
2) content of the C:/Temp/command.cmd file
rem Note: "> C:/Temp/command_output.txt" is used to simulate the starting of the notification client
rem
echo %SCHEDULER_MON_SERVICE_NAME%:%SCHEDULER_MON_SERVICE_STATUS%:%SCHEDULER_MON_SERVICE_MESSAGE_PREFIX% history id = %SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID% > C:/Temp/command_output.txt
|
Examples
Message on error
Code Block | ||
---|---|---|
| ||
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT} |
Message on success
Code Block | ||
---|---|---|
| ||
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), steps(${MON_SN_STEP_FROM} to ${MON_SN_STEP_TO}), order time elapsed = ${MON_N_ORDER_TIME_ELAPSED}s |
Message on performance check (Timer)
Code Block | ||
---|---|---|
| ||
name = ${MON_C_NAME}, scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), steps(${MON_C_STEP_FROM} to ${MON_C_STEP_TO}), check = ${MON_C_CHECK_TEXT} |
Examples System Monitoring
NotificationInterface ( Nagios / OP5 )
The following is an excerpt from an XML file used to notify a specific System Monitor (op5 Monitor) via the NotificationInterface:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
monitor_host The hostname or ip address of System Monitor host
monitor_port The TCP port that the System Monitor would listen to
monitor_encryption Encryption algorithm
service_host The host that executes the passive check. The name must match the corresponding setting in the System Monitor
{MON_N_SCHEDULER_ID} See explanation "Table variables"
...
-->
<NotificationInterface monitor_host="monitor_host"
monitor_port="5667"
monitor_encryption="XOR"
service_host="service_host"><![CDATA[
scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step =${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}
]]></NotificationInterface>
... |
NotificationCommand ( Nagios / OP5 )
The following is an excerpt from an XML file used to notify a specific System Monitor (op5 Monitor) via the NotificationCommand:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
service_host The host that executes the passive check. The name must match the corresponding setting in the System Monitor.
monitor_host The hostname or ip address of System Monitor host.
{SERVICE_NAME} See explanation "Service variables"
{SERVICE_STATUS} See explanation "Service variables"
{SERVICE_MESSAGE_PREFIX} See explanation "Service variables"
{MON_N_SCHEDULER_ID} See explanation "Table variables"
...
NotificationCommand after substitution (error case):
<![CDATA[echo service_host:JobScheduler Monitoring Errors:2:ERROR scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
NotificationCommand after substitution (recovery case):
<![CDATA[echo service_host:JobScheduler Monitoring Errors:0:RECOVERED scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
NotificationCommand after substitution (success case):
<![CDATA[echo service_host:JobScheduler Monitoring Success:0:SUCCESS scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error= | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
-->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors"
service_name_on_success="JobScheduler Monitoring Success"
service_status_on_error="2"
service_status_on_success="0">
<NotificationCommand><![CDATA[echo service_host:${SERVICE_NAME}:${SERVICE_STATUS}:${SERVICE_MESSAGE_PREFIX} scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT} | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
</NotificationCommand>
</NotificationMonitor>
... |
NotificationCommand ( Nagios / Opsview )
The following is an excerpt from an XML file used to notify a specific System Monitor (Opsview Monitor) via the NotificationCommand on Unix:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
service_host The host that executes the passive check. The name must match the corresponding setting in the System Monitor, e.g. localhost
monitor_host The hostname or ip address of System Monitor host.
{SERVICE_NAME} See explanation "Service variables"
{SERVICE_STATUS} See explanation "Service variables"
{SERVICE_MESSAGE_PREFIX} See explanation "Service variables"
{MON_N_SCHEDULER_ID} See explanation "Table variables"
...
NotificationCommand after substitution (error case):
<![CDATA[echo -e "localhost\tJobScheduler Monitoring Errors\t2\tERROR scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>
NotificationCommand after substitution (recovery case):
<![CDATA[echo -e "localhost\tJobScheduler Monitoring Errors\t0\tRECOVERED scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>
NotificationCommand after substitution (success case):
<![CDATA[echo -e "localhost\tJobScheduler Monitoring Success\t0\tSUCCESS scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>
-->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors"
service_name_on_success="JobScheduler Monitoring Success"
service_status_on_error="2"
service_status_on_success="0">
<NotificationCommand><![CDATA[echo -e "service_host\t${SERVICE_NAME}\t${SERVICE_STATUS}\t${SERVICE_MESSAGE_PREFIX} scheduler id=${MON_N_SCHEDULER_ID}, history id=${MON_N_ORDER_HISTORY_ID}, job_chain=${MON_N_JOB_CHAIN_NAME}(${MON_N_ORDER_ID}), step=${MON_N_ORDER_STEP_STATE}, error=${MON_N_ERROR_TEXT}\n" | /usr/local/nagios/bin/send_nsca -H monitor_host -c /usr/local/nagios/etc/send_nsca.cfg]]>
</NotificationCommand>
</NotificationMonitor>
... |
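The text that the commands above pipe into send_nsca is a single line with four fields: host, service name, status code and message. The sketch below shows how such a line is assembled; the function `buildPassiveCheck` is hypothetical and only mirrors the field order used in the examples.

```javascript
// Hypothetical sketch: assembles one passive check line as piped to send_nsca.
// The default delimiter is a tab, as in the Unix example; the Windows example
// uses ":" together with the "-d :" option.
function buildPassiveCheck(serviceHost, serviceName, serviceStatus, message, delimiter = "\t") {
  return [serviceHost, serviceName, serviceStatus, message].join(delimiter) + "\n";
}

// Error case with sample values taken from the comments above.
const line = buildPassiveCheck(
  "localhost",
  "JobScheduler Monitoring Errors",
  2,
  "ERROR scheduler id=scheduler_4444, history id=123"
);
```

Keeping the delimiter as a parameter makes the same helper usable for both variants, since the message text itself may contain the ":" character when tabs are used.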
NotificationCommand ( Zabbix )
The following is an excerpt from an XML file used to notify a specific System Monitor (Zabbix Monitor) via the NotificationCommand:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<!--
zabbix_sender Zabbix sender installed on the JobScheduler host
localhost Hostname of the Zabbix server
zabbix_server JobScheduler Agent name (host name) as registered in Zabbix
samples.job1 Item key in Zabbix (replace "/" with "." in JOB_NAME)
${MON_N_ERROR_TEXT} See explanation "Table variables"
-->
<NotificationCommand>
<![CDATA[zabbix_sender -z localhost -s zabbix_server -k samples.job1 -o ${MON_N_ERROR_TEXT}]]>
</NotificationCommand>
... |
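The item key rule from the comment above (replace "/" with "." in the job name) can be expressed as a one-line helper. The function name `toZabbixItemKey` is hypothetical; the sketch does not treat leading slashes or other characters specially.

```javascript
// Hypothetical sketch: derives a Zabbix item key from a job name by
// replacing every "/" with ".".
function toZabbixItemKey(jobName) {
  return jobName.replace(/\//g, ".");
}

const itemKey = toZabbixItemKey("samples/job1"); // "samples.job1"
```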
Examples Mail
NotificationMail content_type="text/html"
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationMail content_type="text/html" charset="ISO-8859-1" encoding="7bit" priority="Normal">
<From><![CDATA[jobscheduler@sos-berlin.com]]></From>
<To><![CDATA[spam@sos-berlin.com]]></To>
<Subject><![CDATA[JobScheduler notification: ${SERVICE_MESSAGE_PREFIX}, job chain executed with errors: ${MON_N_JOB_CHAIN_NAME}]]></Subject>
<Body><![CDATA[<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#bbb;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#594F4F;background-color:#E0FFEB;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#493F3F;background-color:#9DE0AD;}
</style>
<table class="tg">
<tr>
<th colspan="4">Error</th>
</tr>
<tr>
<td>Code:</td><td>${MON_N_ERROR_CODE}</td>
<td>Message</td><td>${MON_N_ERROR_TEXT}</td>
</tr>
<tr>
<th colspan="4">JobScheduler</th>
</tr>
<tr>
<td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
<td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
</tr>
<tr>
<th colspan="4">Order</th>
</tr>
<tr>
<td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
<td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
</tr>
<tr>
<td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
<td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
</tr>
<tr>
<td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
<td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
</tr>
<tr>
<th colspan="4">Order History</th>
</tr>
<tr>
<td>Time elapsed</td><td>${MON_N_ORDER_TIME_ELAPSED}</td><td> </td><td> </td>
</tr>
<tr>
<td>Start Time</td><td>${MON_N_ORDER_START_TIME}</td>
<td>End Time</td><td>${MON_N_ORDER_END_TIME}</td>
</tr>
<tr>
<th colspan="4">Order Step History</th>
</tr>
<tr>
<td>State</td><td>${MON_N_ORDER_STEP_STATE}</td>
<td>Time elapsed</td><td>${MON_N_ORDER_STEP_TIME_ELAPSED}</td>
</tr>
<tr>
<td>Start Time</td><td>${MON_N_ORDER_STEP_START_TIME}</td>
<td>End Time</td><td>${MON_N_ORDER_STEP_END_TIME}</td>
</tr>
</table>]]></Body>
</NotificationMail>
... |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationMail content_type="text/html" charset="ISO-8859-1" encoding="7bit" priority="Normal">
<From><![CDATA[jobscheduler@sos-berlin.com]]></From>
<To><![CDATA[spam@sos-berlin.com]]></To>
<Subject><![CDATA[JobScheduler notification: job chain successfully completed: ${MON_N_JOB_CHAIN_NAME}]]></Subject>
<Body><![CDATA[<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#aaa;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#333;background-color:#fff;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#fff;background-color:#f38630;}
</style>
<table class="tg">
<tr>
<th colspan="4">JobScheduler</th>
</tr>
<tr>
<td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
<td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
</tr>
<tr>
<th colspan="4">Order</th>
</tr>
<tr>
<td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
<td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
</tr>
<tr>
<td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
<td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
</tr>
<tr>
<th colspan="4">Order History</th>
</tr>
<tr>
<td>Time elapsed</td><td>${MON_N_ORDER_TIME_ELAPSED}</td><td> </td><td> </td>
</tr>
<tr>
<td>Start Time</td><td>${MON_N_ORDER_START_TIME}</td>
<td>End Time</td><td>${MON_N_ORDER_END_TIME}</td>
</tr>
</table>]]></Body>
</NotificationMail>
... |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationMail content_type="text/html" charset="ISO-8859-1" encoding="7bit" priority="Normal">
<From><![CDATA[jobscheduler@sos-berlin.com]]></From>
<To><![CDATA[spam@sos-berlin.com]]></To>
<Subject><![CDATA[JobScheduler notification: ${SERVICE_MESSAGE_PREFIX}, job executed with errors: ${MON_N_JOB_NAME}]]></Subject>
<Body><![CDATA[<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#bbb;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#594F4F;background-color:#E0FFEB;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#bbb;color:#493F3F;background-color:#9DE0AD;}
</style>
<table class="tg">
<tr>
<th colspan="4">Error</th>
</tr>
<tr>
<td>Code:</td><td>${MON_N_ERROR_CODE}</td>
<td>Message</td><td>${MON_N_ERROR_TEXT}</td>
</tr>
<tr>
<th colspan="4">JobScheduler</th>
</tr>
<tr>
<td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
<td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
</tr>
<tr>
<th colspan="4">Order</th>
</tr>
<tr>
<td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
<td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
</tr>
<tr>
<td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
<td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
</tr>
<tr>
<td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
<td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
</tr>
<tr>
<th colspan="4">Task History</th>
</tr>
<tr>
<td>Task ID</td><td>${MON_N_TASK_ID}</td>
<td>Time elapsed</td><td>${MON_N_TASK_TIME_ELAPSED}</td>
</tr>
<tr>
<td>Start Time</td><td>${MON_N_TASK_START_TIME}</td>
<td>End Time</td><td>${MON_N_TASK_END_TIME}</td>
</tr>
</table>]]></Body>
</NotificationMail>
... |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<NotificationMail content_type="text/html" charset="ISO-8859-1" encoding="7bit" priority="Normal">
<From><![CDATA[jobscheduler@sos-berlin.com]]></From>
<To><![CDATA[spam@sos-berlin.com]]></To>
<Subject><![CDATA[JobScheduler notification: job successfully completed: ${MON_N_JOB_NAME}]]></Subject>
<Body><![CDATA[<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;border-color:#aaa;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#333;background-color:#fff;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:#aaa;color:#fff;background-color:#f38630;}
</style>
<table class="tg">
<tr>
<th colspan="4">JobScheduler</th>
</tr>
<tr>
<td>JobScheduler ID</td><td>${MON_N_SCHEDULER_ID}</td>
<td>Agent URL</td><td>${MON_N_AGENT_URL}</td>
</tr>
<tr>
<th colspan="4">Order</th>
</tr>
<tr>
<td>Order ID</td><td><a href="${JOC_HREF_ORDER}">${MON_N_ORDER_ID}</a></td>
<td>Order Title</td><td>${MON_N_ORDER_TITLE}</td>
</tr>
<tr>
<td>Job Chain Name</td><td><a href="${JOC_HREF_JOB_CHAIN}">${MON_N_JOB_CHAIN_NAME}</a></td>
<td>Job Chain Title</td><td>${MON_N_JOB_CHAIN_TITLE}</td>
</tr>
<tr>
<td>Job Name</td><td><a href="${JOC_HREF_JOB}">${MON_N_JOB_NAME}</a></td>
<td>Job Title</td><td>${MON_N_JOB_TITLE}</td>
</tr>
<tr>
<th colspan="4">Task History</th>
</tr>
<tr>
<td>Task ID</td><td>${MON_N_TASK_ID}</td>
<td>Time elapsed</td><td>${MON_N_TASK_TIME_ELAPSED}</td>
</tr>
<tr>
<td>Start Time</td><td>${MON_N_TASK_START_TIME}</td>
<td>End Time</td><td>${MON_N_TASK_END_TIME}</td>
</tr>
</table>]]></Body>
</NotificationMail>
... |
JobScheduler - Store parameters to database
The Monitoring Interface provides functionality to store the job/order parameters of specific jobs in the database (table SCHEDULER_MON_RESULTS).
See the explanation: Calculation
JobScheduler - Job Chains
The following job chains are provided and should be configured accordingly:
sos / notification / CheckHistory (JobScheduler releases before 1.11)
See <scheduler_install>/jobs/JobSchedulerNotificationCheckHistoryJob.xml
- This is the main job: it analyzes the JobScheduler history tables and writes the results into the notification tables.
  - The job reads all history entries for the job chains configured in the SystemMonitorNotification XML files.
  - The job executes the performance checks for the defined Timers.
- Order Check
  - Configure a repeat interval for the order run time, e.g. every two minutes.
sos / notification / CheckHistory (JobScheduler releases starting from 1.11)
- The job chain has been removed.
- Set the parameter sos.use_notification to true in config/scheduler.xml.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
...
<spooler>
<config ...>
<params>
...
<param name="sos.use_notification" value="true"/>
...
</spooler> |
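Whether the parameter is actually set can be checked with a few lines of script. The following Python sketch is only an illustration: the function name notification_enabled and the sample document are assumptions, and the sample replaces the elided parts of the fragment above with a made-up minimal structure.

```python
import xml.etree.ElementTree as ET

# Made-up minimal stand-in for config/scheduler.xml (the real file contains more).
SAMPLE = """<spooler>
  <config>
    <params>
      <param name="sos.use_notification" value="true"/>
    </params>
  </config>
</spooler>"""


def notification_enabled(scheduler_xml):
    """Return True if a param named sos.use_notification has the value "true"."""
    root = ET.fromstring(scheduler_xml)
    for param in root.iter("param"):
        if param.get("name") == "sos.use_notification":
            return param.get("value") == "true"
    # Parameter not present at all
    return False


if __name__ == "__main__":
    print(notification_enabled(SAMPLE))
```

The check only reads the XML text passed to it; reading the file from the JobScheduler installation is left to the caller.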
sos / notification / SystemNotifier
See <scheduler_install>/jobs/JobSchedulerNotificationSystemNotifierJob.xml
- Sends notifications to a specific System Monitor.
- Order MonitorSystem
  - JobScheduler releases before 1.11
    - Configure a repeat interval for the order run time that is not less than the interval chosen for triggering the job chain sos/notification/CheckHistory.
sos / notification / CleanupNotifications
See <scheduler_install>/jobs/JobSchedulerNotificationCleanupNotificationsJob.xml
- Removes notifications that have expired.
- Order Cleanup
  - Configure a start time for the order run time, e.g. 24:00
sos / notification / ResetNotifications
See <scheduler_install>/jobs/JobSchedulerNotificationResetNotificationsJob.xml
- Some System Monitors provide an "acknowledge" operation that signals that a problem is known.
- If an "acknowledge" operation has been performed for a specific service in the System Monitor, then the job chain ResetNotifications stops JobScheduler from sending notifications for that service for errors that have already occurred.
- Do not configure the order run time for this job chain, as the job chain is triggered by the System Monitor's "acknowledge" operation via the add_order XML command.
Examples
Example ResetNotifications <add_order> XML command
The following example shows the XML command sent from a monitoring system to the JobScheduler to call the sos/notification/ResetNotifications
job chain and set the relevant service name as acknowledged.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<add_order job_chain ="sos/notification/ResetNotifications"
id ="op5 JobScheduler Monitoring Error acknowledgement"
title ="op5 JobScheduler Monitoring Error acknowledgement">
<params>
<param name="service_name" value="JobScheduler Monitoring Error" />
<param name="system_id" value="op5"/>
<param name="operation" value="acknowledge" />
</params>
</add_order> |
Key to the above code:
Element | Attribute | Value | Description |
---|---|---|---|
add_order | | | XML command to add the new order to the specified job chain on the JobScheduler. |
add_order | job_chain | sos/notification/ResetNotifications | Job chain path. Must correspond to the path of the ResetNotifications job chain installed on the JobScheduler. |
add_order | id | | Order identifier. |
add_order | title | | Order title. |
param | | | The following three parameters must be set: |
param | name="service_name" | JobScheduler Monitoring Error | Name of the service for which all errors that have already occurred are set as acknowledged in the JobScheduler Monitoring Interface. |
param | name="system_id" | op5 | System identification. Corresponds with |
param | name="operation" | acknowledge | Fixed value. Operation name to execute the acknowledgement in the JobScheduler Monitoring Interface. |
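The command can also be assembled programmatically instead of being written by hand. The following is a minimal Python sketch under stated assumptions: the function names build_add_order and send_command are hypothetical, host and port are placeholders, and the HTTP POST with content type text/xml simply mirrors what the Perl script further below does. It is not an official client.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_order(service_name, system_id,
                    job_chain="sos/notification/ResetNotifications"):
    """Return the <add_order> command as an XML string."""
    # Same id/title pattern as in the example command above.
    label = f"{system_id} {service_name} acknowledgement"
    order = ET.Element("add_order", job_chain=job_chain, id=label, title=label)
    params = ET.SubElement(order, "params")
    for name, value in (("service_name", service_name),
                        ("system_id", system_id),
                        ("operation", "acknowledge")):
        ET.SubElement(params, "param", name=name, value=value)
    return ET.tostring(order, encoding="unicode")


def send_command(host, port, command, timeout=30):
    """POST the command to the JobScheduler (host and port are placeholders)."""
    request = urllib.request.Request(
        f"http://{host}:{port}",
        data=command.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds and prints the command; nothing is sent.
    print(build_add_order("JobScheduler Monitoring Error", "op5"))
```

Building the command with an XML library rather than string concatenation means that service names containing characters such as & or quotes are escaped correctly.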
Example ResetNotifications <add_order> XML command via Perl script for op5 monitor system
This example shows the integration of a Perl script into op5 monitor system that automatically sends the above XML command to the JobScheduler sos/notification/ResetNotifications
job chain.
The "Acknowledgment" on the op5 Monitor side works as follows:
- Contact "acknowledgment" + Event Handler:
  - This first of all requires a contact that receives the Notifications in the same way as the other contacts. However, an event notification for this contact is not delivered by mail but processed by an Event Handler, i.e. an XML command is executed instead of a mail being sent. (Please see the next point, Notification Command.)
- The "svc_notify_ack_handle" Notification Command:
  - This command is always executed for the services that are specified for the contact. It is executed when the service status changes (for example, by a change from OK to Critical or by the Acknowledgment of an Error).
  - The command executes a check_acknowledge.pl script.
- The check_acknowledge.pl script (see the example below): this script is executed by the command and first of all checks whether the command is a response to an Acknowledgment:
  - If the command is not a response to an Acknowledgment, then nothing happens.
  - If the command is a response to an Acknowledgment, then the script contacts the JobScheduler and sends an XML command that instructs the JobScheduler to start a specific job chain (the sos/notification/ResetNotifications chain).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use Getopt::Long;
use vars qw($opt_H $opt_f $opt_s $opt_p $opt_t $opt_h);
use vars qw(%ERRORS &support);
my $host;
my $type;
my $service;
my $port;
my $timeout = 30;
# Exit codes returned to the monitoring system
%ERRORS = (
'OK' => 0,
'CRITICAL' => 2,
'ERROR' => 2,
'UNKNOWN' => 9,
'WARNING' => 1,
);
sub print_help ();
sub print_usage ();
Getopt::Long::Configure('bundling');
GetOptions
("h" => \$opt_h, "help" => \$opt_h,
"H=s" => \$opt_H, "hostname=s" => \$opt_H,
"f=s" => \$opt_f,
"s=s" => \$opt_s, "service=s" => \$opt_s,
"t=i" => \$opt_t, "timeout=i" => \$opt_t,
"p=i" => \$opt_p, "port=i" => \$opt_p);
if($opt_h) {print_help(); exit 0;}
if($opt_H ) {
if ( $opt_H =~ /([-.A-Za-z0-9]+)/ ) { $host = $opt_H; }
($host) || print("Invalid host: $opt_H\n");
}
else{ print("Host name/address not specified\n");}
if($opt_p ) {
if ($opt_p =~ /^([0-9]+)$/) { $port = $1; }
($port < 0 || $port > 65535) && print("Invalid Port: $opt_p\n");
}
else{ print("Port not specified\n");}
if ($opt_t) { $timeout = $opt_t; }
if( !$host || !$port ) { print_usage(); exit 1;}
#<add_order job_chain ="/sos/notification/ResetNotifications"
# id ="op5 JobScheduler Monitoring Error acknowledgement"
# title ="op5 JobScheduler Monitoring Error acknowledgement">
# <params>
# <param name="service_name" value="JobScheduler Monitoring Error" />
# <param name="system_id" value="op5"/>
# <param name="operation" value="acknowledge" />
# </params>
#</add_order>
my $message = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><add_order job_chain=\"/sos/notification/ResetNotifications\" id=\"op5 ".$opt_s." Acknowledgement\" title=\"op5 ".$opt_s." Acknowledgement\"><params><param name=\"system_id\" value=\"MonitorSystem\"/><param name=\"service_name\" value=\"".$opt_s."\"/><param name=\"operation\" value=\"acknowledge\"/></params></add_order>";
if(defined $opt_f && $opt_f=~m/ACKNOWLEDGEMENT/){
send_request($message);
}
else{ print("Please set notification type to ACKNOWLEDGEMENT\n");}
sub send_request {
my $message = shift;
my $userAgent = LWP::UserAgent->new(agent => 'perl post');
$userAgent->timeout($timeout);
my $response = $userAgent->request(POST 'http://'.$host.':'.$port,Content_Type => 'text/xml',Content => $message);
if ($response->is_success) {
_report('OK', "OK: Service name: ".$opt_s."\nNotification type: ".$opt_f."\nRequest: ". $message."\n\nAnswer:\n".$response->as_string."\n");
}
else {
_report('ERROR',"ERROR: Service name: ".$opt_s."\nNotification type: ".$opt_f."\nRequest: ". $message."\n\nAnswer:\n".$response->error_as_HTML."\n");
}
}
sub get_attribute_value {
my ($attr_name, $elem_xml) = @_;
$elem_xml =~ s/.*$attr_name\s*=\s*\"(.*?)\".*/$1/s;
return $elem_xml;
}
sub get_state_elem {
my $xml = shift;
$xml =~ s/.*<spooler.*?>\s*<answer.*?>\s*(<state.*?>).*/$1/s;
return $xml;
}
sub print_help () {
print $0. "\n";
print "Copyright (c) 2015 SOS GmbH, info\@sos-berlin.com
This script tries to connect to given Job Scheduler
";
print_usage();
print "
-H, --hostname=HOST
Name or IP address of host to check
-p, --port=INTEGER
Port at host to check
-t, --timeout=INTEGER
Timeout for HTTP connection
-f =STRING
Notification type, e.g. ACKNOWLEDGEMENT
-s, --service=STRING
Service name, e.g. JobScheduler Errors
-h, --help
This help
";
}
sub print_usage () {
print "Usage: $0 -H <host> -p <port> -f ACKNOWLEDGEMENT -s <service name> [-t <timeout>]\n";
}
sub _report {
print $_[1];
if (defined($ERRORS{$_[0]})) { exit $ERRORS{$_[0]}; }
else { exit 0; }
}
|
JobScheduler - Job Chains customization
The default name of the monitor system used in the configuration files and stored in the JobScheduler database is "MonitorSystem".
The default configuration can be changed to allow better customization of the monitoring systems used.
Example customization for the op5 system monitor:
- <scheduler_install>/config/notification/SystemMonitorNotification_MonitorSystem.xml
  - rename this file to SystemMonitorNotification_op5.xml
  - set the system_id attribute to op5, e.g. <SystemMonitorNotification system_id="op5">
- <scheduler_install>/config/live/sos/notification/SystemNotifier,MonitorSystem.order.xml
  - rename this file to SystemNotifier,op5.order.xml
  - set the system_configuration_file parameter to SystemMonitorNotification_op5.xml, e.g. <param name="system_configuration_file" value="config/notification/SystemMonitorNotification_op5.xml"/>
- <scheduler_install>/config/live/sos/notification/ResetNotifications,AcknowledgeMonitorSystem.order.xml
  - rename this file to ResetNotifications,Acknowledgeop5.order.xml
  - set the system_id parameter to op5, e.g. <param name="system_id" value="op5"/>
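The renaming follows one pattern: the default name MonitorSystem in each file name is replaced by the new system identifier. The Python sketch below only computes the target names for the three files named in this section; the function name customized_names is hypothetical, paths are relative to <scheduler_install>, and the sketch neither renames files nor edits the system_id or system_configuration_file values inside them.

```python
from pathlib import PurePosixPath

DEFAULT_SYSTEM_ID = "MonitorSystem"

# Default files, relative to <scheduler_install>.
DEFAULT_FILES = (
    "config/notification/SystemMonitorNotification_MonitorSystem.xml",
    "config/live/sos/notification/SystemNotifier,MonitorSystem.order.xml",
    "config/live/sos/notification/ResetNotifications,AcknowledgeMonitorSystem.order.xml",
)


def customized_names(system_id):
    """Map each default file path to its renamed path for system_id."""
    renamed = {}
    for path in DEFAULT_FILES:
        p = PurePosixPath(path)
        # Only the file name changes; the directory stays the same.
        new_name = p.name.replace(DEFAULT_SYSTEM_ID, system_id)
        renamed[path] = str(p.with_name(new_name))
    return renamed


if __name__ == "__main__":
    for old, new in customized_names("op5").items():
        print(old, "->", new)
```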
JobScheduler - Cluster
For cluster operation, modify the job_chain element definition in all notification job chain files:
- add the distributed="yes" attribute, e.g. <job_chain distributed="yes" ...
- remove the orders_recoverable="no" attribute if it exists
The following job chain files in the notification directory <scheduler_install>/config/live/sos/notification/ must be modified:
- CheckHistory.job_chain.xml
- CleanupNotifications.job_chain.xml
- ResetNotifications.job_chain.xml
- SystemNotifier.job_chain.xml
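The two attribute changes can be applied by script if several installations have to be adjusted. The following Python sketch works on the text of a single job chain file; the function name make_distributed is hypothetical and the sample element is made up for illustration. It is a sketch only, so the result should be reviewed before it replaces a live file.

```python
import xml.etree.ElementTree as ET


def make_distributed(job_chain_xml):
    """Return the job chain XML with the cluster attributes adjusted."""
    root = ET.fromstring(job_chain_xml)
    if root.tag != "job_chain":
        raise ValueError("root element is not <job_chain>")
    # Add (or overwrite) distributed="yes".
    root.set("distributed", "yes")
    # Remove orders_recoverable="no" if it exists.
    if root.get("orders_recoverable") == "no":
        del root.attrib["orders_recoverable"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample, not the content of a delivered job chain file.
    sample = '<job_chain orders_recoverable="no" title="sample"/>'
    print(make_distributed(sample))
```

Note that ElementTree discards XML comments and the XML declaration when parsing, so these would have to be restored by hand if they matter.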
Use Cases
Workflow Execution takes too long
Initial Situation
A Job Chain is triggered but does not end: it hangs in a step and takes longer than expected.
Problem
The execution time was too long.
Handling
A timer has been set for this Job Chain and the System Monitor is notified when the timer expires. The expiration times for the Job Chains are configured to leave enough time for processing. This is usually used for cases where the Job Chain could hang in a specific step.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Timer
- Configure SystemMonitorNotification / Notification / NotificationObjects / TimerRef
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
- XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor. In addition, specify the timer for this Job Chain and the function that calculates the expiration time for the timer.
- XML SystemMonitorNotification.xml: As in the example above, specify the name of the Service (in the System Monitor) as a service_name_on_error, since you want to be informed when the Job Chain ends with an error. Essential for this particular case: specify how many times the System Monitor should be notified about the expiration of a timer.
System Monitor
- Services in the System Monitor have to be configured and named the same way as the service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
SFTP connection refused
Initial Situation
Consider a Job Chain that uses SFTP for transferring files. A setback is configured for this step of the Job Chain, so that if the connection to the SFTP server fails, the step is retried after a specified time.
Problem
The SFTP server is no longer available.
Handling
The System Monitor is notified of the error on the service related to the Job Chain. However, repeated notifications for the Job Chain are not wanted when the error is caused by an external factor, here the connection to the SFTP server.
Configuration
...
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
- XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
- XML SystemMonitorNotification.xml: As in the example above, specify the name of the Service (in the System Monitor) as a service_name_on_error. Specify step_from and step_to in order to reduce the number of notifications for this specific step.
System Monitor
- Services in the System Monitor have to be configured and named the same way as the service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
Thresholds
Initial Situation
Consider the situation where a workflow has to be executed successfully a specific number of times before a specific point in time. This means that a specific value has to be monitored in order to determine whether this quota was reached.
Handling
A new History service is configured, so that the workflow executions (Job Chains in the JobScheduler vocabulary) send the information that they have been successfully executed to the System Monitor.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_success (SystemMonitorNotification / Notification / NotificationMonitor). service_name_on_success is used here because you want to be informed when the Job Chain ends successfully, and not only when it ends with an error.
- XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
System Monitor
- Services in the System Monitor have to be configured and named the same way as the service_name_on_success (SystemMonitorNotification / Notification / NotificationMonitor) above.
Acknowledgment
Initial Situation
An alert for a Service has been sent to the System Monitor, which has sent a Mail to the Service Desk (Support Team) notifying them about the alert.
Handling
The problem is known to the Service Desk and they "acknowledge" the problem. The acknowledgment will cause the JobScheduler to be notified not to send any more notifications for this Service to the System Monitor until the Service has been recovered.
Configuration
System Monitor
- The JobScheduler is notified about the acknowledgment in the System Monitor by the execution of a script. This is a notification like any other, except that instead of a mail being sent, a script is executed that contacts the JobScheduler and adds an order to the job chain ResetNotifications. See sos / notification / ResetNotifications
Recoverable Errors
Initial Situation
You have a setback configured in one of the steps of the Job Chain, so that if the step execution fails, this step is retried after a specified time.
Problem
The step has ended with an error, but recovered after the setback.
Handling
If the error message has been sent to the System Monitor, then on recovery the JobScheduler automatically sends a recovery message to the same service with the same error message and the prefix RECOVERED.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
System Monitor
- Services in the System Monitor have to be configured and named the same way as the service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.