...
<scheduler_install>/config/notification/SystemMonitorNotification_MonitorSystem.xml
- rename this file to SystemMonitorNotification_OP5.xml
- set the system_id attribute to OP5, e.g. <SystemMonitorNotification system_id="OP5">
<scheduler_install>/config/live/sos/notification/SystemNotifier,MonitorSystem.order.xml
- rename this file to SystemNotifier,OP5.order.xml
- set the system_configuration_file parameter to SystemMonitorNotification_OP5.xml, e.g. <param name="system_configuration_file" value="config/notification/SystemMonitorNotification_OP5.xml"/>
<scheduler_install>/config/live/sos/notification/ResetNotifications,AcknowledgeMonitorSystem.order.xml
- rename this file to ResetNotifications,AcknowledgeOP5.order.xml
- set the system_id parameter to OP5, e.g. <param name="system_id" value="OP5"/>
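The rename-and-set steps above can also be scripted. The following is a minimal sketch, not part of the JobScheduler distribution: the helper names are hypothetical, and the demo runs on simplified stand-in files in a temporary directory (the real files are larger and live in the directories listed above). Note that ElementTree discards XML comments when parsing, so manual editing may be preferable for heavily commented files.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def _save_as(tree: ET.ElementTree, src: Path, new_name: str) -> Path:
    """Write the tree next to src under new_name, then remove src."""
    dst = src.with_name(new_name)
    tree.write(dst, encoding="utf-8", xml_declaration=True)
    src.unlink()
    return dst


def rename_with_root_attribute(src: Path, new_name: str, attribute: str, value: str) -> Path:
    """Rename src and set one attribute on its root element."""
    tree = ET.parse(src)
    tree.getroot().set(attribute, value)
    return _save_as(tree, src, new_name)


def rename_with_param(src: Path, new_name: str, param_name: str, value: str) -> Path:
    """Rename src and set the value of every <param name=param_name>."""
    tree = ET.parse(src)
    params = [p for p in tree.getroot().iter("param") if p.get("name") == param_name]
    if not params:
        raise ValueError(f"no <param name={param_name!r}> found in {src}")
    for param in params:
        param.set("value", value)
    return _save_as(tree, src, new_name)


def demo() -> dict:
    """Apply the three steps to stand-in files (assumed, simplified content)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        cfg = base / "SystemMonitorNotification_MonitorSystem.xml"
        cfg.write_text('<SystemMonitorNotification system_id="MonitorSystem"/>', encoding="utf-8")
        notifier = base / "SystemNotifier,MonitorSystem.order.xml"
        notifier.write_text(
            '<order><params><param name="system_configuration_file" '
            'value="config/notification/SystemMonitorNotification_MonitorSystem.xml"/>'
            "</params></order>",
            encoding="utf-8",
        )
        ack = base / "ResetNotifications,AcknowledgeMonitorSystem.order.xml"
        ack.write_text(
            '<order><params><param name="system_id" value="MonitorSystem"/></params></order>',
            encoding="utf-8",
        )

        new_cfg = rename_with_root_attribute(
            cfg, "SystemMonitorNotification_OP5.xml", "system_id", "OP5")
        new_notifier = rename_with_param(
            notifier, "SystemNotifier,OP5.order.xml", "system_configuration_file",
            "config/notification/SystemMonitorNotification_OP5.xml")
        new_ack = rename_with_param(
            ack, "ResetNotifications,AcknowledgeOP5.order.xml", "system_id", "OP5")

        # Read the results back before the temporary directory is removed.
        return {
            "files": sorted(p.name for p in base.iterdir()),
            "system_id": ET.parse(new_cfg).getroot().get("system_id"),
            "config_file": ET.parse(new_notifier).getroot().find(".//param").get("value"),
            "ack_system_id": ET.parse(new_ack).getroot().find(".//param").get("value"),
        }
```

Calling `demo()` returns the resulting file names and the values that were set, which makes it easy to check the outcome before pointing the helpers at a real installation.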
Status: work in progress
Use Cases
...
Job Chain triggered by directory monitoring ends with an error
Initial Situation: A Job Chain is triggered by directory monitoring, i.e. the Job Chain starts when a certain file arrives in a monitored folder.
Problem: The Job Chain has ended with an error.
Handling: The System Monitor is notified of the error message via the service specified for the Job Chain. If the Job Chain is then restarted by the arrival of a new file and ends without an error, this does not mean that the original error has been recovered, since the second run processed a different file. Instead, the error message in the System Monitor should remain unchanged until the original file has been re-added to the monitored directory and the Job Chain has ended without an error.
Configuration:
- XML CheckConfigurationHistory.xml: Indicates the ID of the JobScheduler and the name of the Job Chain you want to monitor.
- XML SystemMonitorNotification.xml: Specifies the name of the Service (in the System Monitor) and that it is a service_name_on_error, since you want to be notified when the Job Chain ends with an error.
- System Monitor: Services in the System Monitor have to be configured and named the same way as in the SystemMonitorNotification.xml file above.
Workflow Execution takes too long
Initial Situation: A Job Chain is triggered but does not finish: it hangs in a step and takes longer than expected.
Problem: The execution time is too long.
Handling: A timer has been set for this Job Chain and the System Monitor is notified when it expires. The expiration times for the Job Chains are configured to leave enough time for processing. This is typically used for cases where a Job Chain could hang in a specific step.
Configuration:
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Timer
- Configure SystemMonitorNotification / Notification / NotificationObjects / Timer
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
- XML CheckConfigurationHistory.xml: As in the example above, indicates the ID of the JobScheduler and the name of the Job Chain you want to monitor. In addition, the timer for this specific Job Chain and the function for calculating the expiration time of the timer should be specified.
- XML SystemMonitorNotification.xml: As in the example above, specifies the name of the Service (in the System Monitor) and that it is a service_name_on_error, since you want to be notified if the Job Chain ends with an error. For this particular case it is essential to specify how many times the timer should notify your System Monitor about its expiration.
System Monitor
- As in the example above, Services in the System Monitor have to be configured and named the same way as the service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
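The timer handling described in this use case can be pictured with a small model. This is only an illustrative sketch and not JobScheduler's implementation: the class `TimerWatch`, its fields and the `max_notifications` limit are hypothetical names standing in for the configured expiration time and the configured number of timer notifications.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TimerWatch:
    """Hypothetical model of a timer attached to one Job Chain run."""

    started_at: datetime
    max_duration: timedelta   # expiration time, chosen with enough headroom
    max_notifications: int    # how often an expired timer may notify
    sent: int = 0             # notifications sent so far

    def check(self, now: datetime, finished: bool) -> bool:
        """Return True if the System Monitor should be notified now."""
        expired = not finished and (now - self.started_at) > self.max_duration
        if expired and self.sent < self.max_notifications:
            self.sent += 1
            return True
        return False
```

A run that finishes in time never triggers a notification, while a hanging run triggers at most `max_notifications` of them, which keeps a single stuck step from flooding the System Monitor.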
SFTP connection refused
Initial Situation: Consider a Job Chain that uses SFTP for transferring files. A setback is configured for this step of the Job Chain, so that if the connection to the SFTP server fails, the step is retried after a specified time.
Problem: The SFTP server is no longer available.
Handling: The System Monitor is notified with the error message via the service related to the Job Chain. However, you do not want repeated notifications for a Job Chain when an external factor, here the connection to the SFTP server, is causing the error.
Configuration:
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
- XML CheckConfigurationHistory.xml: As in the example above, indicates the ID of the JobScheduler and the name of the Job Chain you want to monitor.
- XML SystemMonitorNotification.xml: As in the example above, specifies the name of the Service (in the System Monitor) and that it is a service_name_on_error, since you want to be notified if the Job Chain ends with an error. In this case it is very important to specify how many times this Job Chain should notify your System Monitor about the error connecting to the SFTP server. You can use step_from and step_to to reduce the number of notifications for this specific step.
System Monitor
- As in the example above, Services in the System Monitor have to be configured and named the same way as the service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.
Thresholds
Initial Situation: Consider the situation where a workflow has to be executed successfully a specific number of times before a specific point in time. This means that a specific value has to be monitored in order to determine whether this quota has been reached.
Handling: A new History service is configured, so that the workflow executions (Job Chains in the JobScheduler vocabulary) send the information that they have been successfully executed to the System Monitor.
Configuration:
- XML CheckConfigurationHistory.xml: As in the example above, indicates the ID of the JobScheduler and the name of the Job Chain you want to monitor.
- XML SystemMonitorNotification.xml: Specifies the name of the Service (in the System Monitor). Note that here it is a service_name_on_success, since you want to be notified when the Job Chain ends successfully and not only when it ends with an error.
- System Monitor: As in the example above, Services in the System Monitor have to be configured and named the same way as in the SystemMonitorNotification.xml file above.
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_success (SystemMonitorNotification / Notification / NotificationMonitor)
System Monitor
- Services in the System Monitor have to be configured and named the same way as the service_name_on_success (SystemMonitorNotification / Notification / NotificationMonitor) above.
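The threshold check itself would normally be evaluated in the System Monitor from the success notifications it receives. As a rough illustration of the logic only, assuming a hypothetical function `quota_reached` and a plain list of completion times (neither is part of JobScheduler or of any particular System Monitor):

```python
from datetime import datetime
from typing import Iterable


def quota_reached(success_times: Iterable[datetime], deadline: datetime, required: int) -> bool:
    """Return True if at least `required` successful runs ended by `deadline`."""
    on_time = sum(1 for finished_at in success_times if finished_at <= deadline)
    return on_time >= required
```

Successful runs reported after the deadline are ignored, so a late burst of executions does not hide a missed quota.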
Acknowledgment
Initial Situation: An alert for a Service has been sent to the System Monitor, which has sent an email to the Service Desk (Support Team) notifying them about the alert.
Handling: The problem is known to the Service Desk and they "acknowledge" the problem. The acknowledgment causes the JobScheduler to be notified not to send any more notifications for this Service to the System Monitor until the Service has been recovered.
Configuration
System Monitor
- The JobScheduler is notified about the acknowledgment in the System Monitor by the execution of a script. See sos / notification / ResetNotifications
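The effect of an acknowledgment can be sketched as a simple gate per Service. This is a hypothetical model for illustration, not the code behind sos / notification / ResetNotifications; the class `ServiceNotifier` and its method names are assumptions.

```python
class ServiceNotifier:
    """Hypothetical model of the notification state of one Service."""

    def __init__(self):
        self.acknowledged = False
        self.sent = []  # messages actually forwarded to the System Monitor

    def notify_error(self, message: str) -> bool:
        """Forward an error unless the Service has been acknowledged."""
        if self.acknowledged:
            return False
        self.sent.append(message)
        return True

    def acknowledge(self) -> None:
        """Called when the Service Desk acknowledges the problem."""
        self.acknowledged = True

    def recover(self) -> None:
        """Called once the Service has recovered; notifications resume."""
        self.acknowledged = False
```

Between `acknowledge()` and `recover()` every further error for the Service is dropped, which mirrors the handling described above.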
Recoverable Errors
Initial Situation: A setback is configured for a step of the Job Chain, so that if the step execution fails, the step is retried after a specified time.
Problem: The step has ended with an error but recovered after the setback.
Handling: If the error message ... in case of error recovery, JobScheduler will automatically send the recovery message on the same service, using the same error message with the prefix RECOVERY.
Configuration
SystemMonitorNotification_<MonitorSystem>.xml
- Configure SystemMonitorNotification / Notification / NotificationObjects / JobChain for the relevant Job Chain.
- Configure service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor)
- The recovery message will be sent on this service
System Monitor
- Services in the System Monitor have to be configured and named the same way as the service_name_on_error (SystemMonitorNotification / Notification / NotificationMonitor) above.