Scope
- Availability and Resilience concern the robustness of an architecture against a number of outage scenarios.
- Master / Agent Availability includes a number of architecture decisions:
- Master Clusters provide redundancy of Master instances in a network.
- Agent Bundles can be used to compensate for the outage of a server that runs an Agent.
- Master / Agent Resilience includes a number of implicit and explicit measures:
- Master / Agent Reconciliation allows continued execution of tasks in case of short-term Network Connection Loss.
- Master Service Recovery includes supported measures after a Master Service Failure.
- Database Service Recovery includes the capability to recover in case of Database Connection Loss.
Master / Agent Availability
Master Cluster
Feature
- see Passive Cluster
- see Active Cluster
Agent Bundles
Feature
- JobScheduler allows multiple Agents to be specified for a single Process Class.
- JobScheduler contacts Agents in round-robin mode (JS-1188):
- the first Agent that is configured to execute jobs for the process class will be contacted.
- if the first Agent is not available then the next Agent listed in the process class configuration will be contacted.
- this procedure will be repeated until an Agent is found that can execute the job.
- Use cases for this scenario include:
- all Agents running on different server nodes: the switch to the next available Agent implements a fail-over to the next server node.
- a number of Agents running on the same server node: the switch to the next available Agent implements redundancy of Agents within a single server node.
- Delimitation
- This feature is not intended for load sharing as the JobScheduler will always use the first available Agent.
- This feature is not intended for scalability as it does not allow the execution of jobs in parallel on a number of Agents (clustering).
- Feature Availability
- Available starting from release 1.9.
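The Agent selection described under Feature can be illustrated with a minimal Python sketch. It is not JobScheduler code: the function `select_agent`, the `is_available` check and the Agent URLs are hypothetical names chosen for illustration. The sketch makes a single pass over the configured Agents and assumes that the caller repeats the pass if no Agent is available.

```python
from typing import Callable, Optional, Sequence


def select_agent(
    agents: Sequence[str],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first available Agent in configuration order, or None.

    Agents are tried in the order in which they are listed for the
    process class. The first Agent that responds is used, so an Agent
    further down the list is contacted only if all Agents before it
    are unavailable.
    """
    for agent in agents:
        if is_available(agent):
            return agent
    return None  # no Agent reachable in this pass; the caller may retry


# Hypothetical Agent URLs in the order of the process class configuration.
agents = [
    "http://agent-a:4445",
    "http://agent-b:4445",
    "http://agent-c:4445",
]

# Assumption for the example: the first Agent is down.
down = {"http://agent-a:4445"}

chosen = select_agent(agents, lambda url: url not in down)
print(chosen)
```

Because the first available Agent is always chosen, a healthy first Agent receives every job. This is why the feature provides fail-over but neither load sharing nor parallel execution across Agents.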
Implementation