...

  • JobScheduler: The architecture separates two responsibilities:
    • Detecting errors: One Job Chain analyses the JobScheduler logs and checks whether the monitored JobScheduler objects produced errors or warnings.
    • Sending alerts: Another Job Chain sends the alerts to the corresponding System Monitor. Not all alerts are incidents; some are events, that is, occurrences such as the alert that a specific Job Chain was executed and the result it ended with.
  • JobScheduler: This architecture allows the Log History of more than one JobScheduler to be analysed.
  • System Monitor: JobScheduler is able to connect to more than one System Monitor at the same time.

...

Use Cases

Recoverable Errors

Initial Situation: A Job Chain is triggered by directory monitoring. That is, when a certain file arrives in a monitored folder, the Job Chain starts.

Problem: The Job Chain ended with an error.

Handling: The System Monitor is notified with the message error for the Service related to the Job Chain. If a later execution of the Job Chain, triggered by a new file, ends without errors, this does not mean that the error has been recovered, since the file that was processed is a different one. That is, the error message in the System Monitor stays until the same file is placed in the monitored directory again and the Job Chain ends without errors.
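
The recovery rule described above can be illustrated with a minimal Python sketch. This is not JobScheduler code: the class `RecoverableErrorTracker`, its methods and the returned strings are hypothetical names chosen for illustration. The only assumption taken from the text is that an error is tied to the file that caused it and is cleared only by a successful run for that same file.

```python
from dataclasses import dataclass, field


@dataclass
class RecoverableErrorTracker:
    """Hypothetical helper: tracks open errors per (job chain, file) pair."""
    open_errors: set = field(default_factory=set)

    def record_run(self, job_chain: str, file_name: str, success: bool) -> str:
        """Register one Job Chain execution and return the resulting state."""
        key = (job_chain, file_name)
        if not success:
            self.open_errors.add(key)
            return "error"
        if key in self.open_errors:
            # Only a successful run for the SAME file recovers the error.
            self.open_errors.remove(key)
            return "recovered"
        return "success"

    def service_in_error(self, job_chain: str) -> bool:
        """True while at least one file of this Job Chain is unrecovered."""
        return any(chain == job_chain for chain, _ in self.open_errors)
```

With this sketch, a failed run for `a.csv` followed by a successful run for `b.csv` leaves the service in the error state; it is cleared only once `a.csv` is processed successfully.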

...

Workflow Execution takes too long

Initial Situation: A Job Chain is triggered but cannot end because it hangs in a step, so it takes longer than expected.

Problem: The execution time was too long.

Handling: A timer is set for this Job Chain and the System Monitor is notified when it expires. The expiration times for Job Chains are configured with enough time for processing, which means that this is usually used for cases where the Job Chain hangs in a specific step.
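
As a rough illustration of the timer check, the following Python sketch decides whether a run has exceeded its allowed duration. The function `timer_expired` and its parameters are hypothetical and do not correspond to a JobScheduler API; how the expiration time is actually calculated is defined in the configuration, not here.

```python
from datetime import datetime, timedelta
from typing import Optional


def timer_expired(started_at: datetime,
                  finished_at: Optional[datetime],
                  max_duration: timedelta,
                  now: datetime) -> bool:
    """Hypothetical check: did the run exceed its allowed duration?

    A run that has not finished yet (finished_at is None) is compared
    against the current time, which covers a Job Chain hanging in a step.
    """
    deadline = started_at + max_duration
    if finished_at is not None:
        return finished_at > deadline
    return now > deadline
```

Passing `now` explicitly instead of reading the clock inside the function keeps the check easy to test.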

...

  • XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor. In addition, specify the timer for this specific Job Chain and the function used to calculate the expiration time for the timer.
  • XML SystemMonitorNotification.xml: As in the example above, specify the name of the Service (in the System Monitor) and specify that it is a service_name_on_error, since you want to keep track of when the Job Chain ends with an error. In addition, and essential for this particular case, specify how many times the timer should notify your System Monitor about its expiration.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured and named the same way as in the SystemMonitorNotification.xml file above.

SFTP connection refused

Initial Situation: There is a Job Chain that uses SFTP for transferring files. You have a setback configured in this step of the Job Chain, so that if the connection to the SFTP server fails, this step is retried after some time.

Problem: The SFTP server is not available anymore.

Handling: The System Monitor is notified with the message error for the Service related to the Job Chain. However, you do not want a flood of notifications for a Job Chain when the error is caused by an external factor, in this case the connection to the SFTP server.

...

  • XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
  • XML SystemMonitorNotification.xml: As in the example above, specify the name of the Service (in the System Monitor) and specify that it is a service_name_on_error, since you want to keep track of when the Job Chain ends with an error. In addition, and very important in this case, specify how many times this Job Chain should notify your System Monitor about the error connecting to the SFTP server. You can use step_from and step_to to reduce the number of notifications for this specific step.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured and named the same way as in the SystemMonitorNotification.xml file above.
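
The idea of limiting notifications for a range of steps can be sketched in Python as follows. This is only an illustration of the concept, not the JobScheduler implementation: the class `StepNotificationLimiter` is hypothetical, and representing steps as integer positions is an assumption made to keep the example short.

```python
from collections import defaultdict


class StepNotificationLimiter:
    """Hypothetical limiter: caps notifications for steps in a given range."""

    def __init__(self, step_from: int, step_to: int, max_notifications: int):
        # Assumption: steps are identified by their position in the Job Chain.
        self.step_from = step_from
        self.step_to = step_to
        self.max_notifications = max_notifications
        self.sent = defaultdict(int)  # notifications sent per Job Chain

    def should_notify(self, job_chain: str, step: int) -> bool:
        """Return True if an error at this step should be forwarded."""
        if not (self.step_from <= step <= self.step_to):
            return True  # steps outside the range are not limited
        if self.sent[job_chain] >= self.max_notifications:
            return False  # limit reached, suppress further notifications
        self.sent[job_chain] += 1
        return True
```

For example, with the range covering only the SFTP step and a limit of one, repeated setbacks of that step produce a single notification, while errors in other steps are still reported.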

Thresholds

Initial Situation: For example, a specific number of Workflow Executions have to be completed successfully by a specific time. That is, a specific value has to be monitored in order to determine whether this quota was reached.

Handling: A new service for History is configured, so that the workflow executions (Job Chains in the JobScheduler vocabulary) send the information that they were executed and finished to the System Monitor.
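
The threshold check itself would typically be evaluated from the success information received. A minimal Python sketch, assuming each execution is reported as a (finish time, success flag) pair; the function `quota_reached` and this data layout are hypothetical:

```python
from datetime import datetime
from typing import Iterable, Tuple


def quota_reached(runs: Iterable[Tuple[datetime, bool]],
                  required: int,
                  deadline: datetime) -> bool:
    """Hypothetical check: were enough executions successful by the deadline?"""
    successes = sum(
        1 for finished_at, ok in runs
        if ok and finished_at <= deadline
    )
    return successes >= required
```

Executions that fail, or that finish after the deadline, do not count towards the quota.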

...

  • XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
  • XML SystemMonitorNotification.xml: Specify the name of the Service (in the System Monitor), but now specify that it is a service_name_on_success, since you want to keep track of when the Job Chain ends successfully, and not only when it ends with an error.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured and named the same way as in the SystemMonitorNotification.xml file above.

Acknowledgement

Initial Situation: An alert for a Service has been sent to the System Monitor and an e-mail has been sent to the Service Desk (Support Team) to notify them about it.

Handling: The problem is well known to the Service Desk and they "acknowledge" the problem. Through the acknowledgement, JobScheduler is notified as well and will not send any more notifications for this Service to the System Monitor until the Service has recovered.
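
The suppression behaviour can be pictured with a small Python sketch. It is not the JobScheduler mechanism: the class `AcknowledgementFilter` and its method names are hypothetical and only model the rule that an acknowledged Service stays silent until it recovers.

```python
class AcknowledgementFilter:
    """Hypothetical filter: mutes notifications for acknowledged Services."""

    def __init__(self) -> None:
        self.acknowledged = set()

    def acknowledge(self, service: str) -> None:
        """The Service Desk acknowledged the problem for this Service."""
        self.acknowledged.add(service)

    def recover(self, service: str) -> None:
        """The Service recovered, so later problems are reported again."""
        self.acknowledged.discard(service)

    def should_send(self, service: str) -> bool:
        """Return True if a notification for this Service may be sent."""
        return service not in self.acknowledged
```

After `acknowledge("orders")`, `should_send("orders")` returns False until `recover("orders")` is called; other Services are unaffected.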

...