...

Use of a JS7 - Agent Cluster is subject to the JS7 - License.

The architecture of Agent Clusters has two separate tiers for clustering of Agents, see JS7 - System Architecture:

  • Controller (Cluster) → Director Agent (Cluster)
  • Director Agent (Cluster) → Subagent Cluster

We find separate layers for operation and use of Agent Clusters:

  • Operational Layer: Subagents and Director Agent Instances
    • Subagents and Director Agent instances are similarly installed.
    • Director Agent instances orchestrate Subagents. They include a Subagent that can be used if users wish to execute jobs from a Director Agent.
  • Functional Layer: Subagent Cluster and Director Agent Cluster
    • Jobs are assigned Subagent Clusters to specify that the jobs can be executed by any Subagent that is a member of the Subagent Cluster. The Subagent Cluster determines whether a different Subagent is chosen only in case of fail-over (fixed-priority scheduling, active-passive cluster) or for each next execution of a job (round-robin, active-active cluster).
    • The Director Agent Cluster is independent from Subagent Clusters. The purpose of clustering is to provide high availability for the role of orchestrating Subagents.
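
The two scheduling modes of a Subagent Cluster described above can be illustrated with a minimal sketch. This is not JS7 code: the `Subagent` class, its `priority` and `available` fields, and both selection helpers are hypothetical names introduced for illustration, and the sketch assumes that a higher priority value means a preferred Subagent.

```python
from dataclasses import dataclass


@dataclass
class Subagent:
    name: str
    priority: int           # assumption: higher value = preferred Subagent
    available: bool = True  # False once the Subagent is considered lost


def pick_fixed_priority(subagents):
    """Active-passive: always pick the available Subagent with the highest
    priority, so a different Subagent is chosen only after a fail-over."""
    candidates = [s for s in subagents if s.available]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.priority)


class RoundRobin:
    """Active-active: pick the next available Subagent for each execution."""

    def __init__(self, subagents):
        self._subagents = list(subagents)
        self._next = 0  # index of the Subagent to try first on the next pick

    def pick(self):
        n = len(self._subagents)
        for offset in range(n):
            idx = (self._next + offset) % n
            candidate = self._subagents[idx]
            if candidate.available:
                self._next = (idx + 1) % n
                return candidate
        return None  # no Subagent is available
```

With two Subagents `a` (priority 2) and `b` (priority 1), `pick_fixed_priority` keeps returning `a` until `a` becomes unavailable, whereas `RoundRobin.pick` alternates between `a` and `b` for successive job executions.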

...

To check the Agent Cluster status, users can navigate to the Resources->Agents view.

High Availability Setup

For a high availability setup with two server nodes, the following distribution of active and standby JS7 products should be applied:

Server 1                         | Server 2
Active Controller Instance       | Standby Controller Instance
Standby Director Agent Instance  | Active Director Agent Instance

Operations on Director Agent Cluster

...

Fail-over occurs when an Active Director Agent instance is terminated abnormally. Fail-over means that any tasks currently being executed by the Director Agent instance are considered to have failed and that the related orders are set to the failed state. An inactive Director Agent instance is no longer a member of the Director Agent Cluster:

  • The previous Standby Director Agent instance will take the active role.
  • Subagent Clusters will continue to execute jobs. They are not affected by a Director Agent's fail-over operation.
  • If the Agent Cluster is assigned to a File Order Source for JS7 - File Watching then the active Director Agent instance will pick up file watching. This happens regardless of whether the Subagent included with a Director Agent instance is enabled or disabled.

Fail-over can be caused by the following actions:

...

  • the Active Director Agent instance is stopped normally from the command line:
    • agent_<port>.sh | .cmd stop
  • the operating system is shut down and systemd / init.d or a Windows Service is in place to stop the Director Agent instance normally.
  • no Active Controller instance is running as it holds the Cluster Watch role required for fail-over.

Fail-over happens within a short period of time, typically within 2-3 seconds.

...

  • The active and standby Director Agent instances will switch roles.
  • As a prerequisite for switch-over
    • an active Controller instance has to be up and running,
    • the Director Agent Cluster has to be coupled,
    • the Subagent in a Director Agent instance must not have running jobs.
  • After switch-over the Standby Director Agent will become active and the previously active Director Agent instance will be restarted.
  • If the Agent Cluster is assigned to a File Order Source for JS7 - File Watching then the active Director Agent instance will pick up file watching. This happens regardless of whether the Subagent included with a Director Agent instance is enabled or disabled.
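
The switch-over prerequisites listed above can be expressed as a simple pre-flight check. The following is only a sketch under stated assumptions: `ClusterState`, its fields and `switch_over_blockers` are hypothetical names and do not correspond to a JS7 API; how the three values are obtained is left open.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    # Hypothetical snapshot of the facts the prerequisites refer to.
    active_controller_running: bool
    director_cluster_coupled: bool
    director_subagent_running_jobs: int


def switch_over_blockers(state):
    """Return the list of unmet prerequisites; an empty list means
    switch-over may be requested."""
    blockers = []
    if not state.active_controller_running:
        blockers.append("no active Controller instance is up and running")
    if not state.director_cluster_coupled:
        blockers.append("the Director Agent Cluster is not coupled")
    if state.director_subagent_running_jobs > 0:
        blockers.append("the Subagent in the Director Agent instance has running jobs")
    return blockers
```

Returning the list of unmet prerequisites rather than a single boolean lets an operator see every blocking condition at once.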

Confirm loss of a Director Agent instance

...

  • Assume that fail-over between Director Agent instances occurred. Assume that after fail-over both the Controller (Standalone Controller or Controller Cluster) and the remaining Director Agent instance are shut down at the same point in time. In this situation, after restart of Controller and Director Agent, the Controller cannot act as a witness to the previous Director Agent fail-over due to its own restart. As a result, the Controller holding the role of the Cluster Watch cannot determine which of the newly started Director Agent instances should receive the active role, as both Director Agent instances will claim the active role after restart.
  • In this situation the user is asked to decide which Director Agent instance should be considered lost. This includes verifying that the now standby Director Agent instance is shut down at the point in time when the user takes this decision. Users can start the now standby Director Agent instance later on to re-establish the Director Agent Cluster.
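
The reasoning behind this confirmation can be modelled as a small decision function. This is a deliberately simplified sketch, not the Cluster Watch implementation: the function name, its parameters and the instance names used in the usage note are assumptions made for illustration only.

```python
def resolve_active_director(claims_active, witnessed_active=None, confirmed_lost=None):
    """Decide which Director Agent instance receives the active role.

    claims_active    -- set of instance names claiming the active role
    witnessed_active -- instance the Cluster Watch saw taking the active role,
                        or None if the Cluster Watch was restarted itself
    confirmed_lost   -- instance the user confirmed as lost, or None
    Returns the instance name, or None if the decision requires the user.
    """
    if len(claims_active) == 1:
        # Only one claimant: no conflict to resolve.
        return next(iter(claims_active))
    if witnessed_active in claims_active:
        # The Cluster Watch witnessed the fail-over and can decide.
        return witnessed_active
    remaining = claims_active - {confirmed_lost}
    if confirmed_lost is not None and len(remaining) == 1:
        # The user confirmed the loss of the other instance.
        return remaining.pop()
    return None  # undecided: user confirmation is required
```

For example, with both `"primary"` and `"secondary"` claiming the active role and no witness, the function returns `None` until the user confirms one instance as lost, after which the remaining instance is returned.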

...