Introduction
The Agent Cluster brings horizontal scalability and fail-over capabilities for Agents without a single point of failure.
Use of an Agent Cluster is subject to the JS7 - License.
We find separate tiers in the architecture of Agent Clusters, see JS7 - Cluster Architecture:
- Controller (Cluster) → Director Agent (Cluster)
- Director Agent (Cluster) → Subagent Cluster
The Director Agent as a separate tier provides better autonomy for Agent Clusters as it can be operated closer to its Subagents in the network and reduces dependency on immediate availability of a Controller.
We find separate layers for operation and use of Agent Clusters:
- Operational Layer: Subagents and Director Agent Instances
- Subagents and Director Agent instances are similarly installed.
- Director Agent instances orchestrate Subagents. They include a Subagent that can be used if users wish to execute jobs from a Director Agent.
- Functional Layer: Subagent Cluster and Director Agent Cluster
- Jobs are assigned Subagent Clusters to specify that the jobs can be executed by any Subagent that is a member of the Subagent Cluster. The Subagent Cluster determines whether a different Subagent is chosen only in case of fail-over (fixed-priority scheduling, active-passive cluster) or for each next execution of a job (round-robin, active-active cluster).
- The Director Agent Cluster is independent from Subagent Clusters. The purpose of clustering is to provide high availability for the role of orchestrating Subagents.
Consider the wording in this article:
- Fail-over is an automated operation that occurs when a Subagent or Director Agent instance terminates abnormally, for example if it is forcibly terminated or crashes.
- Switch-over is a manual operation performed by users disabling/enabling Subagents or switching the active role of Director Agent instances.
For fail-over and switch-over scenarios:
- see JS7 - How to fail-over and switch-over between Director Agent instances
- see JS7 - How to fail-over between Subagents in an Agent Cluster
For instructions about how to set up Agent Clusters see the JS7 - Management of Agent Clusters article.
Development Status:
- Subagent Cluster: see JS-1954. Feature availability starting from release 2.3.0.
- Director Agent Cluster: see JS-1955. Feature availability starting from release 2.6.0.
- Director Agent Cluster: see JS-2075, JS-2076, JS-2074. Feature availability starting from release 2.6.1.
Architecture
For the way the Agent Cluster fits into the JS7 architecture, see JS7 - Cluster Architecture.
Operational Layer
The architecture applies to an Agent Cluster including the clustering of Director Agents and the clustering of Subagents for high-availability and scalability purposes.
- The components involved in an Agent Cluster are considered by the Controller to be a single Agent.
- A Controller can manage any number of Agents, each of which is either an Agent Cluster or a Standalone Agent.
Director Agent
The Director Agent ships with an Integrated Subagent which runs inside the Director Agent.
- The Subagent inside the Director can be used to execute jobs.
- The Director's Subagent can be disabled in order to have the Director only perform orchestration and not execute jobs with its Subagent (recommended).
The Director Agent is contacted by the Controller for deployment of scheduling objects, for commands that change an order's state such as suspend, resume and cancel, and for reporting back the results of executing JS7 - Workflow Instructions if the Director Agent is not available.
The Director Agent can be used for clustered JS7 - File Watching by assigning a File Order Source the Agent Cluster. The active Director Agent instance will pick up file watching including situations when the active role switches as in case of fail-over and switch-over.
Director Journal
The Director Agent holds a journal for storing:
- scheduling objects such as workflows and jobs that are deployed via a Controller,
- events about JS7 - Order State Transitions and log output reported from Subagents executing jobs.
The journal is essential for the restarting capabilities of the Director Agent.
Director Agent Cluster
The Director Agent can be operated as a single instance or as an active-passive Director Agent Cluster of two Director Agent instances.
In a Director Agent Cluster one Director Agent instance holds the active role and the second instance holds the standby role.
If the Director Agent is operated as a cluster then the active instance synchronizes its journal with the standby instance. If the journals of both active and standby Director Agent instances are in sync then in case of fail-over (automatically) or switch-over (by user intervention) the Director Agent instances will switch the active role.
During fail-over and switch-over of Director Agent instances any jobs running on related Subagents are continued.
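The role switch described above can be sketched as a small model. This is an illustrative sketch only, not the JS7 implementation: the class `DirectorInstance`, the field `journal_position` and the function `switch_active_role` are hypothetical names, and comparing a single journal position is an assumed simplification of "journals in sync".

```python
from dataclasses import dataclass


@dataclass
class DirectorInstance:
    """Hypothetical model of one Director Agent instance."""
    name: str
    role: str              # "active" or "standby"
    journal_position: int  # assumed marker of how far the journal has been written


def journals_in_sync(active: DirectorInstance, standby: DirectorInstance) -> bool:
    # Assumption: equal positions mean the standby has replicated the journal.
    return active.journal_position == standby.journal_position


def switch_active_role(active: DirectorInstance, standby: DirectorInstance) -> bool:
    """Swap roles for fail-over or switch-over; refuse if journals differ."""
    if not journals_in_sync(active, standby):
        return False
    active.role, standby.role = "standby", "active"
    return True


primary = DirectorInstance("primary", "active", journal_position=120)
secondary = DirectorInstance("secondary", "standby", journal_position=120)
switched = switch_active_role(primary, secondary)
```

In this model the same function covers both cases: what differs between fail-over and switch-over is who triggers the call (automation or a user), not the role swap itself.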
Subagent
Subagents come in two shapes: they can be operated as individual instances and they ship inside a Director Agent.
- Subagents are lightweight and do not operate a journal.
- Subagents are deployed to any number of servers in a network.
- Subagents are closely monitored by the Director Agent using bi-directional heartbeats.
Job Execution
Subagents execute jobs and report back execution results.
- Subagents execute jobs on behalf of the Director Agent.
- They report back log output and the execution results of jobs to the Director Agent.
- If a Subagent fails then the Director Agent can hand over the job execution request to the next Subagent.
- A Subagent terminates running jobs if the connection from the Director Agent is lost or if it is instructed by the Director Agent to suspend/force or to cancel/force an order.
- For details see JS7 - FAQ - How does JobScheduler terminate Jobs
- This behavior is intended to prevent double job execution by more than one Subagent.
- If the connection from the Director Agent can be re-established within a given timeout then the Subagent will continue to execute jobs.
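The connection-loss rule above can be expressed as a single decision. The following is a minimal sketch under stated assumptions: the function name `jobs_after_connection_loss`, its parameters and the timeout value in the usage lines are hypothetical and do not reflect actual JS7 settings.

```python
from typing import Optional


def jobs_after_connection_loss(
    lost_at: float,
    reconnected_at: Optional[float],
    timeout: float,
) -> str:
    """Return what a Subagent does with running jobs after losing the Director.

    lost_at / reconnected_at are timestamps in seconds; reconnected_at is None
    if the connection never came back. All names are illustrative.
    """
    if reconnected_at is not None and reconnected_at - lost_at <= timeout:
        # Connection re-established in time: jobs keep running.
        return "continue"
    # Otherwise jobs are terminated to avoid double execution elsewhere.
    return "terminate"


# Hypothetical timeout of 30 seconds.
quick_recovery = jobs_after_connection_loss(lost_at=100.0, reconnected_at=110.0, timeout=30.0)
no_recovery = jobs_after_connection_loss(lost_at=100.0, reconnected_at=None, timeout=30.0)
```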
Subagent Cluster
Subagents can be grouped into clusters.
- A Subagent Cluster is specified by a Selection and Scheduling Mode:
- The Selection can include a single Subagent, a number of Subagents or all Subagents.
- The Scheduling Mode is one of fixed-priority, round-robin or metrics-based.
- Any number of Subagent Clusters can be configured reusing the same Subagents.
Functional Layer
Subagent Cluster
Jobs in workflows are assigned a Subagent Cluster that includes a Selection and Scheduling Mode of Subagents:
- Subagent Clusters present a logical view of the way a given number of Subagents co-operate for job execution.
- Any number of Subagent Clusters can be configured using the same Subagents.
- The Selection makes use of one or more Subagents.
- Subagents are used for job execution according to their ordering in the Selection.
- A Subagent can be a member in one or more Subagent Clusters.
- The configuration of Subagent Clusters is performed using the JOC Cockpit and is forwarded to the Controller and to the active Director Agent.
- The Scheduling Mode is one of:
- fixed-priority: execute jobs with the first Subagent and switch to the next Subagent only if the first Subagent becomes unavailable (active-passive clustering).
- For details see JS7 - Agent Cluster - Active-Passive Subagent Cluster.
- round-robin: execute each next job on the next Subagent (active-active clustering).
- For details see JS7 - Agent Cluster - Active-Active Subagent Cluster.
- metrics-based: execute each next job on the Subagent that best matches metrics such as number of parallel tasks, CPU and memory consumption.
- For details see JS7 - Agent Cluster - Metrics-based Subagent Cluster.
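The three Scheduling Modes can be contrasted with a short sketch. It is an illustration, not the JS7 algorithm: all class and function names are hypothetical, skipping unavailable Subagents in round-robin is an assumption, and the metrics-based mode is reduced to a single stand-in metric (the number of running tasks).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subagent:
    subagent_id: str
    available: bool = True
    running_tasks: int = 0  # stand-in for metrics such as parallel tasks, CPU, memory


def pick_fixed_priority(selection: List[Subagent]) -> Optional[Subagent]:
    """Active-passive: always the first available Subagent in Selection order."""
    for subagent in selection:
        if subagent.available:
            return subagent
    return None


class RoundRobin:
    """Active-active: each next job goes to the next available Subagent."""

    def __init__(self) -> None:
        self._next = 0

    def pick(self, selection: List[Subagent]) -> Optional[Subagent]:
        n = len(selection)
        for offset in range(n):
            candidate = selection[(self._next + offset) % n]
            if candidate.available:
                self._next = (self._next + offset + 1) % n
                return candidate
        return None


def pick_metrics_based(selection: List[Subagent]) -> Optional[Subagent]:
    """Pick the available Subagent with the fewest running tasks."""
    candidates = [s for s in selection if s.available]
    return min(candidates, key=lambda s: s.running_tasks, default=None)


selection = [
    Subagent("subagent-1"),
    Subagent("subagent-2", running_tasks=3),
    Subagent("subagent-3", running_tasks=1),
]
first_choice = pick_fixed_priority(selection)
```

With fixed-priority, `subagent-2` is only used once `subagent-1` becomes unavailable, whereas a `RoundRobin` instance cycles through all three Subagents from the first job on.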
Network Connections
Network connections use the HTTP protocol and can be secured using TLS/SSL certificates.
Connections are unidirectional and are established:
- by the active Controller instance to the Director Agent instances,
- by the active Director Agent instance to the Subagents.
Workflow Execution
Workflows are deployed from the JOC Cockpit to a Controller and are forwarded to the Director Agent. Similarly, orders are submitted to a Controller and are forwarded to the relevant Director Agent.
Orders are scheduled for a given date and time at which the Director Agent will start the workflow.
- The Director Agent will execute a number of JS7 - Workflow Instructions, for example Retry, Try/Catch, Fork, which are in the scope of the Director Agent.
- The Director Agent will request execution of jobs, as specified by the JS7 - Job Instruction, from a Subagent. When choosing the Subagent, the Director Agent considers the Selection of Subagents and the Scheduling Mode of the Subagent Cluster that is assigned to the job. For example, with the fixed-priority Scheduling Mode the first available Subagent in the Selection is chosen.
- The Director Agent will hand back the order to the Controller if it meets an instruction that is out of the scope of a single Agent, for example in case of a JS7 - Lock Instruction.
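The division of work listed above can be summarized as a routing decision per instruction. This is a simplified sketch, not JS7 code: the function `route_instruction` and the two sets are hypothetical, and they contain only the instructions named as examples in this article.

```python
# Instructions named in this article as being in the Director Agent's scope.
DIRECTOR_SCOPE = {"Retry", "Try/Catch", "Fork"}
# Instructions named as exceeding the scope of a single Agent.
CONTROLLER_SCOPE = {"Lock"}


def route_instruction(instruction: str) -> str:
    """Return which component handles the given workflow instruction."""
    if instruction == "Job":
        # Jobs are executed by a Subagent on request of the Director Agent.
        return "subagent"
    if instruction in CONTROLLER_SCOPE:
        # The order is handed back to the Controller.
        return "controller"
    if instruction in DIRECTOR_SCOPE:
        return "director"
    raise ValueError(f"instruction not covered by this sketch: {instruction}")


routes = {name: route_instruction(name) for name in ["Fork", "Job", "Lock"]}
```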
Resources
- JS7 - Management of Agent Clusters
- JS7 - System Architecture
- JS7 - Controller Cluster
- JS7 - JOC Cockpit Cluster
- Fail-over and Switch-over Scenarios