Scope

Resilience is about the availability and robustness of an architecture for a number of outage scenarios.
Master / Agent Availability includes a number of architecture decisions:
- Master Clusters provide redundancy of Master instances in a network.
- Agent Bundles can be used to compensate for the outage of a server that runs an Agent.
Master / Agent Resilience includes a number of implicit and explicit measures:
- Master / Agent Reconciliation allows continued execution of tasks in case of short-term Network Connection Loss.
- Master Service Recovery includes supported measures after a Master Service Failure.
- Database Service Recovery includes the capability to recover in case of Database Connection Loss and possible data loss.

Master / Agent Availability

Master Cluster

Feature

JobScheduler Master supports Cluster Operation with redundancy of the involved cluster members.
- see Passive Cluster
- see Active Cluster

Agent Bundles

- see Master / Agent Cluster
Clustering is frequently used for high availability and, in some cases, for improved performance.

Agent Cluster

Feature

The JobScheduler allows multiple Agents to be specified for a single Process Class.
- JobScheduler Master contacts Agents
  - by fixed priority scheduling (JS-1554):
    - JobScheduler Master selects the first available Agent for execution of jobs.
    - If an Agent is not available then the next available Agent is selected from the Agent Cluster.
  - by round-robin scheduling (JS-1188):
    - JobScheduler Master switches the Agent for each job execution.
    - If an Agent is not available then the next available Agent is selected from the Agent Cluster.
  - The first Agent that is configured to execute jobs for the process class will be contacted.
  - If the first Agent is not available then the next Agent listed in the process class configuration will be contacted. This procedure will be repeated until an Agent is found that can execute the job.
- Use cases for this scenario include
  - all Agents running on different server nodes: the switch to the next available Agent implements a fail-over to the next server node.
  - a number of Agents running on the same server node: the switch to the next available Agent implements redundancy of Agents within a single server node.
Feature Availability
- Starting from release 1.9
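
The two selection modes described above can be illustrated with a minimal sketch. It is not the JobScheduler implementation: the class `AgentCluster`, the mode names and the `is_available` callback are hypothetical names chosen for this example, and the availability check is assumed to be supplied by the caller.

```python
from typing import Callable, Optional, Sequence


class AgentCluster:
    """Hypothetical model of a Process Class that lists several Agents."""

    def __init__(self, agents: Sequence[str], mode: str = "fixed-priority") -> None:
        if mode not in ("fixed-priority", "round-robin"):
            raise ValueError(f"unknown mode: {mode}")
        self.agents = list(agents)
        self.mode = mode
        self._next_start = 0  # Agent index to try first (used by round-robin only)

    def select(self, is_available: Callable[[str], bool]) -> Optional[str]:
        """Return the Agent for the next job, or None if no Agent is available."""
        n = len(self.agents)
        # Fixed priority always starts with the first configured Agent.
        start = self._next_start if self.mode == "round-robin" else 0
        for offset in range(n):
            index = (start + offset) % n
            agent = self.agents[index]
            if is_available(agent):
                if self.mode == "round-robin":
                    # Switch to the following Agent for the next job execution.
                    self._next_start = (index + 1) % n
                return agent
            # Agent not available: fall through to the next configured Agent.
        return None


if __name__ == "__main__":
    unavailable = {"agent-a"}  # assumed outage, for illustration only

    def is_available(agent: str) -> bool:
        return agent not in unavailable

    for mode in ("fixed-priority", "round-robin"):
        cluster = AgentCluster(["agent-a", "agent-b", "agent-c"], mode)
        print(mode, [cluster.select(is_available) for _ in range(3)])
```

In this sketch both modes skip an unavailable Agent in the same way; they differ only in where the search starts for each job.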

Delimitation

This feature is not intended for load sharing as the JobScheduler will always use the first available Agent.
This feature is not intended for scalability as it does not allow the execution of jobs in parallel on a number of Agents (clustering).

Implementation

- JS-1188

- - jobs that have been enqueued and
  - jobs that are scheduled for execution using start time events.
- All job starts that are delayed due to paused mode will be executed after the JobScheduler Master is continued.
  - This also applies to jobs that are enqueued while paused mode is active.
  - The operation to continue JobScheduler is available with JOC.
- Paused mode allows users to manually check the job history and optionally remove enqueued tasks if Agent Reconciliation has not taken place.
  - The Agent stores log files of jobs during execution. If an execution result cannot be reported to the Master then the log file will be retained, otherwise it will be removed (JS-1521).
  - Paused mode can be configured to be applied automatically in case of restart of a JobScheduler Master after failure (JS-1522).
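
The paused-mode behavior described above can be modelled with a short sketch: job starts requested while the Master is paused are held, a user may remove held starts after checking the job history, and the remaining starts are executed once operation is continued. The class `PausedModeQueue` and its method names are hypothetical and are not part of any JobScheduler or JOC interface.

```python
from collections import deque
from typing import Deque, List


class PausedModeQueue:
    """Hypothetical model of job starts being held while a Master is paused."""

    def __init__(self, paused: bool = False) -> None:
        self.paused = paused
        self._held: Deque[str] = deque()  # job starts delayed by paused mode
        self.started: List[str] = []      # job starts that were executed

    def request_start(self, job: str) -> None:
        """Start the job immediately, or hold it if paused mode is active."""
        if self.paused:
            self._held.append(job)
        else:
            self.started.append(job)

    def remove(self, job: str) -> bool:
        """Manually drop a held job start, e.g. after checking the job history."""
        try:
            self._held.remove(job)
            return True
        except ValueError:
            return False

    def continue_operation(self) -> None:
        """Leave paused mode and execute all held job starts in order."""
        self.paused = False
        while self._held:
            self.started.append(self._held.popleft())


if __name__ == "__main__":
    queue = PausedModeQueue(paused=True)
    queue.request_start("job_1")
    queue.request_start("job_2")
    queue.remove("job_1")       # user decides this start is not required
    queue.continue_operation()  # held starts are executed now
    print(queue.started)
```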
Delimitation
- The currently supported measures include manual checking after failure.
- Automated recovery of the Master/Agent execution status after a Master Service Failure is subject to future improvements.
Feature Availability
- Starting from release 1.10.2

Implementation

References

Change Management References

- Jira query: labels in (agent-cluster)
- Jira issues: JS-1526, JS-1522, JS-1521
