...
- Reconciliation Scenario
- Applies after a Network Connection Loss between Master and Agent.
- Includes 5 attempts to establish the normal relationship between Master and Agent after a connection loss. A delay of less than 1s is assumed between retry attempts.
- Agent Behavior
- By default an Agent will kill any running tasks if the connection to the Master is lost, i.e. the above scenario is not supported (JS-1523). The reasons for this include:
- If a Master were unavailable for a longer period, the Agent could not report back the execution history and log information for tasks. As a result the Master would have no information on whether a job execution was successful.
- The primary goal is to prevent duplicate simultaneous execution of jobs. Without further information from a Master the respective Agent instance cannot know whether it will later be contacted for re-execution of the same job (which would allow a currently running task to continue on the Agent) or whether the Master will choose a different Agent (see Redundancy, Agent Bundle).
- With a Network Connection Loss setting configured with the Agent's process class the Agent will show the following behavior (JS-1524):
- For the number of times specified for tolerated unsuccessful connection attempts the Agent will assume the Network Connection Loss scenario.
- The Agent will continue any running tasks up to the specified number of retry attempts to establish the connection with the Master.
- Reconciliation will take place if the connection between Master and Agent can be established within the number of retries and if the Master has not been restarted.
- Otherwise the Agent will assume the Master Service Failure scenario and will kill any running tasks.
- This behavior applies to tasks that are executed by an Agent for a specific Master to which a connection has been lost. Tasks for other JobScheduler Master instances will be continued.
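The Agent behavior described above can be summarized as a small decision function. This is a minimal sketch, not the Agent's actual implementation: the names `AgentAction` and `agent_action` and all parameters are hypothetical, and a tolerated attempt count of 0 is assumed to model the default (no Network Connection Loss setting).

```python
from enum import Enum


class AgentAction(Enum):
    CONTINUE_TASKS = "continue running tasks"
    RECONCILE = "reconcile with Master"
    KILL_TASKS = "kill running tasks"


def agent_action(failed_attempts: int,
                 tolerated_attempts: int,
                 connected: bool,
                 master_restarted: bool) -> AgentAction:
    """Decide what an Agent does with the tasks of one Master after a connection loss.

    failed_attempts    -- unsuccessful connection attempts counted so far
    tolerated_attempts -- hypothetical name for the configured limit;
                          0 is assumed to represent the default behavior
    connected          -- True if the connection has just been re-established
    master_restarted   -- True if the Master was restarted in the meantime
    """
    within_retries = failed_attempts < tolerated_attempts
    if connected and within_retries and not master_restarted:
        # Network Connection Loss scenario: running tasks survive.
        return AgentAction.RECONCILE
    if not connected and within_retries:
        # Still inside the tolerated window: keep tasks running.
        return AgentAction.CONTINUE_TASKS
    # Master Service Failure scenario is assumed: tasks are killed.
    return AgentAction.KILL_TASKS
```

The decision applies per Master: tasks executed for other Master instances are not affected by the result.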
- Master/Agent Reconciliation
- After connection loss the Master will regularly attempt to re-establish the HTTP connection to the Agent. This communication allows the Agent to report the execution status of running jobs back to the Master.
- After a successful re-connect within the Network Connection Loss scenario the Master will repeat its request for execution of the respective jobs. Each new request includes an identifier for the previous execution request that allows the Agent to identify repeated requests:
- for a job that has been completed within the time required to re-establish the connection the Agent will report the execution result back to the Master and will not re-execute the job.
- for a job that is still running the Agent will report the appropriate information back to the Master which will note the running tasks and update JOC accordingly.
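How the identifier of a previous execution request prevents re-execution can be sketched as follows. The classes `Task` and `Agent`, the method `handle_request` and the returned strings are hypothetical and only illustrate the idea; the actual request format between Master and Agent is not shown in this article.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Task:
    job: str
    exit_code: Optional[int] = None  # None while the task is still running


@dataclass
class Agent:
    # Tasks known to the Agent, keyed by the id of the execution request.
    tasks: Dict[str, Task] = field(default_factory=dict)

    def handle_request(self, request_id: str, job: str,
                       previous_request_id: Optional[str] = None) -> str:
        """Start a job, or report on it if the request is a repetition."""
        known = None
        if previous_request_id is not None:
            known = self.tasks.get(previous_request_id)
        if known is None:
            # First request for this job: start a new task.
            self.tasks[request_id] = Task(job)
            return "started"
        # Repeated request: track the same task under the new id, never re-execute.
        self.tasks[request_id] = known
        if known.exit_code is not None:
            return f"completed with exit code {known.exit_code}"
        return "running"
```

A repeated request therefore returns either the stored result or the running state, and the job is started only once.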
- Feature Availability
Available starting from release 1.10.2.
...
- The JobScheduler Master can be configured to start in paused mode (JS-1522) after a Master Service Failure.
- The paused mode (JS-1511) prevents all jobs from being started. This applies to:
- jobs that have previously been requested for execution with Agents,
- jobs that have been enqueued and
- jobs that are scheduled for execution using start time events, file events or external events.
- All job starts that are delayed due to paused mode will be executed after the JobScheduler Master is continued.
- This also applies to jobs that are enqueued while paused mode is active.
- The operation to continue JobScheduler is available with JOC.
- Paused mode allows users to manually check the job history and optionally remove enqueued tasks if Agent Reconciliation has not taken place.
- The Agent stores log files of jobs during execution. If an execution result cannot be reported to the Master then the log file will be retained, otherwise it will be removed (JS-1521).
- Paused mode can be configured to be applied automatically when a JobScheduler Master is restarted after a failure (JS-1522).
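The effect of paused mode on job starts can be illustrated with a minimal model. The class `PausedScheduler` and its methods are hypothetical names for this sketch and do not correspond to the JobScheduler or JOC API.

```python
from collections import deque
from typing import Deque, List


class PausedScheduler:
    """Toy model: job starts are delayed while paused and run after continue."""

    def __init__(self, paused: bool = False) -> None:
        self.paused = paused
        self.delayed: Deque[str] = deque()  # job starts held back by paused mode
        self.started: List[str] = []        # job starts that were executed

    def request_start(self, job: str) -> None:
        # Jobs enqueued while paused mode is active are delayed as well.
        if self.paused:
            self.delayed.append(job)
        else:
            self.started.append(job)

    def remove_delayed(self, job: str) -> None:
        # Manual clean-up, e.g. after checking the job history.
        self.delayed.remove(job)

    def continue_scheduler(self) -> None:
        # Leaving paused mode executes all delayed starts in order.
        self.paused = False
        while self.delayed:
            self.started.append(self.delayed.popleft())
```

In this model an operator can inspect `delayed`, remove entries that were already executed by an Agent, and then continue the scheduler.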
- Feature Availability
Available starting from release 1.10.2.
...
- If a transaction failure occurs the JobScheduler Master will try to roll back the transaction and will disconnect from the database.
- If the connection to a database gets lost or if a transaction failure occurs then the JobScheduler Master will try to re-connect every 60s.
- A single Master instance can be configured to repeat an unlimited number of connection attempts.
- A Master Active Cluster member requires the database connection to become available within less than 120s. Otherwise the cluster member terminates in order to prevent duplicate execution of jobs in the cluster.
- In case of Database Connection Loss the JobScheduler Master will switch to paused mode (JS-1511), i.e. any execution of new tasks will be postponed.
- If the connection to the database can be re-established within the same JobScheduler session then all postponed tasks will be executed immediately.
- If the connection to the database is established after a JobScheduler restart then
- previously enqueued tasks will be executed immediately.
- start times for scheduled tasks will be re-calculated, i.e. tasks that have been scheduled for the period in which the JobScheduler was not active will not be executed.
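The rules for Database Connection Loss can be condensed into two small functions. This is a sketch under assumptions: the names `master_state`, `tasks_to_run` and `PendingTask` are hypothetical, the 60s and 120s values are taken from the text above, and a scheduled task is simply modeled as a task that carries a start time.

```python
from dataclasses import dataclass
from typing import List, Optional

RETRY_INTERVAL_S = 60    # the Master tries to re-connect every 60s
CLUSTER_TIMEOUT_S = 120  # limit for a Master Active Cluster member


def master_state(outage_s: float, cluster_member: bool) -> str:
    """State of a Master whose database has been unavailable for outage_s seconds."""
    if cluster_member and outage_s >= CLUSTER_TIMEOUT_S:
        # Terminate to prevent duplicate execution of jobs in the cluster.
        return "terminated"
    # Otherwise the Master stays in paused mode and keeps re-connecting.
    return "paused"


@dataclass
class PendingTask:
    name: str
    scheduled_at: Optional[float] = None  # None for enqueued tasks


def tasks_to_run(pending: List[PendingTask], restarted: bool) -> List[str]:
    """Tasks executed immediately once the database connection is back."""
    if not restarted:
        # Same JobScheduler session: all postponed tasks are executed.
        return [t.name for t in pending]
    # After a restart only enqueued tasks run; start times of scheduled
    # tasks are re-calculated, so missed starts are not executed.
    return [t.name for t in pending if t.scheduled_at is None]
```

For example, a cluster member with a 119s outage remains paused, while at 120s it terminates; a single instance stays paused regardless of the outage duration when configured for unlimited connection attempts.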
...