...

  • This feature is intended to prevent simultaneous duplicate execution of jobs. It is not intended to prevent consecutive duplicate execution of jobs.
    • If a task completes within the period covered by the retry attempts to re-establish the connection, the Master will request that the task be re-executed, which results in consecutive duplicate execution. This scenario applies only to jobs that run for less than 5s.
    • We recommend designing your job scripts to be aware of, and to handle, duplicate execution.
  • This feature covers a short-term Network Connection Loss, not an ongoing network outage.
    • A connection loss is recovered by repeated attempts to re-connect.
    • An ongoing network outage would require the Agent to work autonomously, which is not in scope of this feature.
  • This feature is not intended to support a Master Service Failure scenario or Database Connection Loss scenario.
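
The recommendation above, that job scripts handle duplicate execution, can be illustrated with a minimal sketch. It assumes that each job run can be given a stable identifier (for example a business date or an order ID) and that a directory for completion markers is available. The names `run_once`, `run_id` and `marker_dir` are hypothetical and are not part of JobScheduler.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def run_once(run_id: str, marker_dir: Path, work: Callable[[], None]) -> bool:
    """Run work() only if no completion marker exists for run_id.

    Returns True if work() was executed and False if it was skipped.
    The names used here are illustrative assumptions, not a JobScheduler API.
    """
    marker_dir.mkdir(parents=True, exist_ok=True)
    marker = marker_dir / f"{run_id}.done"
    if marker.exists():
        # An earlier execution already completed this run: skip the work.
        return False
    work()
    # Create the marker via a temporary file and an atomic rename, so that
    # a crash cannot leave a partially written marker behind.
    fd, tmp_name = tempfile.mkstemp(dir=marker_dir)
    os.close(fd)
    os.replace(tmp_name, marker)
    return True
```

The marker is written only after `work()` returns, so a failed run is repeated rather than skipped. This guard addresses consecutive duplicate execution only. It does not protect against two executions running at the same time.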

...

Change Management References

Jira issues: JS-1523, JS-1524

...

  • Currently supported measures include manually checking the Agent task logs after a failure.
    • The execution history of jobs that completed on an Agent during the Master Service Failure period is not reported back to the Master.
    • The Agent will kill running tasks once the period covered by the Network Connection Loss scenario has expired. It is therefore recommended to check the Agent task logs for successful or unsuccessful execution of jobs.
  • Automated recovery of the Master/Agent execution status after a Master Service Failure is subject to future improvements.

...

Change Management References

Jira issues: JS-1526, JS-1522, JS-1521

...

  • The capability to re-connect to a database does not imply that JobScheduler will cope with data loss. JobScheduler relies on the job history and job-related status information being consistent and available in the database.
  • When using replicated databases, keep in mind that the delay caused by replication can result in data loss.
    • Depending on the DBMS this delay might be short. However, it can result in duplicate execution of jobs if, in case of fail-over, the information about a previous job run is not available in the replicated database.
    • To our knowledge, replicated databases are frequently used to achieve a database availability of up to approx. 99.9%.
  • When using clustered databases, JobScheduler does not rely on vendor-specific connection continuity mechanisms. It complies with JDBC standards and will always re-connect after a connection loss or a failed transaction.
    • Unsupported vendor-specific mechanisms include, for example, SQL Server® multi-subnet clustering or MySQL® with Galera® JDBC fail-over, which expect the client to switch the connection to a different address.
    • In case of fail-over the clustered database is expected to be available with the same connection attributes, e.g. hostname and port. This can include mechanisms such as DNS switching that make a different database server the primary server in case of fail-over.
    • To our knowledge, clustered databases are frequently used to achieve a database availability of up to approx. 99.999%.
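
The re-connect behaviour described above can be sketched as a generic pattern: after a connection loss or a failed transaction, the client opens a new connection with the same connection attributes and repeats the transaction. This is an illustration only and not JobScheduler's implementation, which is based on JDBC. The function `run_with_reconnect`, its parameters and the choice of `ConnectionError` as the retryable error are assumptions made for this example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_reconnect(
    connect: Callable[[], object],
    transaction: Callable[[object], T],
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError,),
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run transaction(conn), re-connecting and retrying on retryable errors.

    connect() is expected to use the same connection attributes (hostname,
    port) on every call; switching to a different address is left to the
    database side, e.g. by DNS switching. All names are hypothetical.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            conn = connect()          # new connection, same attributes
            return transaction(conn)  # the whole transaction is repeated
        except retryable:
            if attempt == attempts - 1:
                raise                 # out of attempts: report the error
            sleep(delay_s)            # wait before the next attempt
    raise AssertionError("unreachable")
```

Because the whole transaction is repeated, this pattern is only safe if the database returns consistent data after fail-over, which is why the text above stresses that re-connecting does not compensate for data loss.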

...

Change Management References

Jira issues: JS-1283, JS-951, JS-1032, JS-1082, JS-1157

...