Introduction

In a JS7 - Agent Cluster, jobs in workflows are assigned a Subagent Cluster that specifies a Selection of Subagents and a Scheduling Mode:

  • Subagent Clusters present a logical view of the way a given number of Subagents co-operate for job execution. 
  • The Selection makes use of one or more Subagents.
  • The Scheduling Mode is one of:
    • active-passive: execute jobs with the first Subagent and switch to the next Subagent only if the first Subagent becomes unavailable (fixed-priority clustering).
    • active-active: execute each next job on the next Subagent (round-robin clustering).
    • metrics-based: execute each next job on the Subagent that best matches metrics such as number of parallel tasks, CPU and memory consumption.
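The three Scheduling Modes can be contrasted with a minimal Python sketch. This is an illustration only and not the JS7 implementation: the function names and the plain lists and dictionaries are hypothetical, and the round-robin helper ignores Subagent availability for brevity.

```python
from itertools import count

def pick_active_passive(subagents, available):
    """Fixed priority: return the first Subagent in configured order that is available."""
    for name in subagents:
        if name in available:
            return name
    return None  # no Subagent is available

def make_round_robin(subagents):
    """Round-robin: each call of the returned function yields the next Subagent in turn."""
    counter = count()
    def pick():
        return subagents[next(counter) % len(subagents)]
    return pick

def pick_metrics_based(priorities):
    """Metrics-based: return the Subagent with the highest priority value."""
    return max(priorities, key=priorities.get)
```

In this sketch the active-passive mode keeps using the first Subagent until it drops out of the available set, the active-active mode cycles through all Subagents, and the metrics-based mode relies on a priority value per Subagent as described in the sections below.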

FEATURE AVAILABILITY STARTING FROM RELEASE 2.7.2

Metrics-based Subagent Cluster

A Subagent Cluster can include a single Subagent, a number of Subagents or all Subagents. Each next task is executed on the Subagent that best matches metrics defined per Subagent in a Cluster.

Director Agents in a Subagent Cluster can take an active part in job execution. Users are free to add Director Agents to a given Subagent Cluster. A minimum metrics-based Subagent Cluster includes 2 Director Agents for job execution.

Metrics

Metrics are used to identify the Subagent that best matches for execution of the next job. They are evaluated from indicator values such as the number of processes, CPU load and memory consumption.

Indicators

  • Director Agent
    • The Director Agent receives indicator values from Subagents on an ongoing basis, approx. every second.
    • Additional indicators are provided by the Director Agent:
      • number of jobs running in a Subagent Cluster
      • number of jobs running per Subagent in a Subagent Cluster
    • The Director Agent will check indicators provided by Subagents to decide which Subagent should execute the next job. If a Subagent does not provide indicators then it is not considered for execution of the next job.
  • Subagent Cluster
    • The Subagent Cluster configuration allows specification of an expression for the priority attribute of each Subagent:
      • An expression language is available that offers algebraic functions to be used for the priority attribute.
    • Indicators
      • Indicators are made available as variables that can be used in a priority expression.

| Indicator Variable | Metric |
|---|---|
| $js7SubagentProcessCount | Number of processes running with the Subagent. |
| $js7ClusterSubagentProcessCount | Number of processes for the given Subagent Cluster running with the Subagent. |


The following indicators are available, as explained in the Java OperatingSystemMXBean documentation: https://docs.oracle.com/en/java/javase/17/docs/api/jdk.management/com/sun/management/OperatingSystemMXBean.html

$js7CpuLoad

Returns the "recent cpu usage" for the operating environment. This value is a double in the [0.0,1.0] interval. A value of 0.0 means that all CPUs were idle during the recent period of time observed, while a value of 1.0 means that all CPUs were actively running 100% of the time during the recent period being observed. All values between 0.0 and 1.0 are possible depending on the activities going on. If the recent cpu usage is not available, the method returns a negative value.

A negative value is reported as missing. CPU load is not available on macOS and is reported as missing there.

$js7CommittedVirtualMemorySize

Returns the amount of virtual memory that is guaranteed to be available to the running process in bytes, or -1 if this operation is not supported.

A negative value is reported as missing.
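The rule that negative readings are reported as missing can be sketched as follows. This is a minimal illustration, assuming that Python's None stands in for the missing value; the function name to_indicator is hypothetical and not part of JS7.

```python
from typing import Optional

def to_indicator(raw: float) -> Optional[float]:
    """Map a raw reading to an indicator value.

    A negative reading means "not available" and is reported as missing;
    None is used here as a stand-in for the missing value.
    """
    return None if raw < 0 else raw
```

With this mapping a CPU load of 0.75 is passed through unchanged, while a reading of -1 becomes missing.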

$js7FreeMemorySize

Returns the amount of free memory in bytes.

$js7TotalMemorySize

Returns the total amount of memory in bytes.

Expressions

  • Expressions are evaluated per Subagent and job and must return numeric values. An expression can return a missing value to indicate that the related Subagent should not be considered.
  • The Subagent with the highest priority value is used for the next task. Negative numbers are valid priority values.
  • If the expression evaluates to a missing value for all Subagents in the Subagent Cluster, then job execution is deferred and the expression is evaluated repeatedly at 1s intervals. Orders waiting for a Subagent to become available for job execution are displayed with a waiting state.
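The evaluation rules above can be sketched in Python. This is an illustration under stated assumptions and not the JS7 implementation: None stands in for a missing value, the names select_subagent and wait_for_subagent are hypothetical, indicator values are passed as plain dictionaries, and the handling of equal priorities (the first Subagent wins) is an assumption that the text does not specify.

```python
import time

def select_subagent(snapshot, priority):
    """Return the Subagent with the highest priority value, or None.

    snapshot maps a Subagent name to its indicator values,
    priority maps indicator values to a number or to None (missing).
    """
    best_name, best_value = None, None
    for name, indicators in snapshot.items():
        value = priority(indicators)
        if value is None:
            continue  # missing: this Subagent is not considered
        if best_value is None or value > best_value:
            best_name, best_value = name, value
    return best_name

def wait_for_subagent(poll, priority, interval=1.0, max_attempts=None, sleep=time.sleep):
    """Defer while all Subagents evaluate to missing, re-evaluating every interval seconds."""
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        chosen = select_subagent(poll(), priority)
        if chosen is not None:
            return chosen
        attempt += 1
        sleep(interval)
    return None  # gave up after max_attempts evaluations
```

The poll argument is a hypothetical callable that returns the current indicator values per Subagent. The max_attempts and sleep parameters only exist to make the sketch easy to try out.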

Examples:

| Expression | Meaning |
|---|---|
| -$js7SubagentProcessCount | Example for consideration of the Subagent with the least number of tasks running for any Subagent Cluster that the Subagent is a member of. |
| -$js7ClusterSubagentProcessCount | Example for consideration of the Subagent with the least number of tasks running for the given Subagent Cluster. |
| if $js7SubagentProcessCount == 0 then 1 else missing | Example for conditional use of a Subagent if it does not run any tasks. |
| if $js7ClusterSubagentProcessCount == 0 then 1 else missing | Example for conditional use of a Subagent if it does not run any tasks in the given Subagent Cluster. |
| if $js7SubagentProcessCount < 10 then -$js7SubagentProcessCount else missing | Example for conditional use of a Subagent if it runs fewer than 10 tasks. |
| if $js7ClusterSubagentProcessCount < 10 then -$js7ClusterSubagentProcessCount else missing | Example for conditional use of a Subagent if it runs fewer than 10 tasks for the given Subagent Cluster. |
| -$js7CpuLoad * 2 + $js7FreeMemorySize / 1000000000 - $js7SubagentProcessCount * 3 | Example for least consumption of CPU weighted by 2, free memory and the least overall number of tasks running in the Subagent weighted by 3. |
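To make the arithmetic of the example expressions concrete, the following Python sketch mirrors some of them as plain functions and applies them to two hypothetical Subagents. The indicator values are invented for illustration, None stands in for missing, and the dictionary keys reuse the indicator variable names without the leading $.

```python
MISSING = None  # stand-in for the missing value of the expression language

def least_processes(v):
    # -$js7SubagentProcessCount
    return -v["js7SubagentProcessCount"]

def only_if_idle(v):
    # if $js7SubagentProcessCount == 0 then 1 else missing
    return 1 if v["js7SubagentProcessCount"] == 0 else MISSING

def fewer_than_ten(v):
    # if $js7SubagentProcessCount < 10 then -$js7SubagentProcessCount else missing
    n = v["js7SubagentProcessCount"]
    return -n if n < 10 else MISSING

def weighted(v):
    # -$js7CpuLoad * 2 + $js7FreeMemorySize / 1000000000 - $js7SubagentProcessCount * 3
    return (-v["js7CpuLoad"] * 2
            + v["js7FreeMemorySize"] / 1_000_000_000
            - v["js7SubagentProcessCount"] * 3)

# Hypothetical indicator values for two Subagents
subagent_a = {"js7CpuLoad": 0.25, "js7FreeMemorySize": 8_000_000_000, "js7SubagentProcessCount": 2}
subagent_b = {"js7CpuLoad": 0.5, "js7FreeMemorySize": 4_000_000_000, "js7SubagentProcessCount": 0}

print(weighted(subagent_a))  # -0.5 + 8.0 - 6 = 1.5
print(weighted(subagent_b))  # -1.0 + 4.0 - 0 = 3.0
```

With these values the second Subagent receives the higher priority (3.0 compared to 1.5) and would be used for the next task, although it reports a higher CPU load and less free memory: the weight of 3 per running task outweighs the other terms.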

Error Handling

The following applies to unavailable Subagents:

  • If a Subagent is unreachable (shut down or crashed) then it is not considered for job execution. No error is raised; the job is assigned to the next Subagent that matches the metrics.
  • On normal termination, jobs running in a Subagent complete normally and no further jobs are accepted for execution.
  • If a Subagent crashes then running jobs will fail and orders for running jobs will be set to the failed state. Such jobs are restarted when the Subagent is restarted. Alternatively, jobs can be restarted on another Subagent when the crashed Subagent is reset. For details see JS7 - FAQ - How does JobScheduler terminate Jobs.

A Subagent Cluster is considered functional as long as one Subagent remains for job execution.

Scalability

Subagents can be used for vertical and for horizontal scaling:

  • A single Subagent can execute more than 15 000 tasks in parallel. There is no hard limit for the max. number of parallel tasks. A soft limit for the number of tasks can be specified per Subagent Cluster.
  • Users can shut down and restart Subagents at will. When a Subagent is started, it is automatically considered for evaluation of metrics. For example, in a containerized environment 10 Subagents are configured for a Subagent Cluster, with 2 Subagent containers running during normal hours and 8 additional Subagent containers being started at peak times.

Resources

Examples

