Target Architecture

...

Cockpit / Master / Agent

Master/Agent do not require a permanent connection
- each component works asynchronously and independently
- a connection should be established at least once per day
Master
- control a number of Agents
- act as the central access point to Agents, e.g. for the GUI
- can be terminated and restarted during ongoing operations
- control the daily plan (calendar): what to run, when, and where (but not how)
- operate as a single instance
  - multiple Master instances are operated independently of each other
  - multiple Agent instances can be shared across a number of Master instances
Agent
- execute tasks from the daily plan independently
- operate in a high-availability cluster
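
A minimal sketch of the split of responsibilities above, assuming a daily plan can be modelled as a list of entries that name what to run, when, and where, but never how. The names `PlanEntry`, `job_chain`, `start_at`, `agent_cluster` and `entries_for_agent` are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PlanEntry:
    """One line of the daily plan as the Master might hand it to an Agent (assumed shape)."""
    job_chain: str      # what to run
    start_at: datetime  # when to run it
    agent_cluster: str  # where to run it
    # Deliberately no command or script field: the "how" stays with the Agent.


def entries_for_agent(plan, agent_cluster):
    """Return the plan entries for one Agent cluster, ordered by start time."""
    return sorted(
        (entry for entry in plan if entry.agent_cluster == agent_cluster),
        key=lambda entry: entry.start_at,
    )
```

An Agent that holds its own slice of the plan can keep executing it while the Master is restarted or unreachable.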

- implement a fault-tolerant peer-to-peer network of Agents
- accept daily plan from Master
- available for active-passive and active-active clustering:
  - fixed priority scheduling
  - round-robin scheduling
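
The two scheduling modes can be sketched as follows. This is an illustration only: it assumes that fixed priority corresponds to active-passive operation (the first available Agent in a configured order always wins) and that round-robin corresponds to active-active operation (tasks rotate across all available Agents). The class names and the `select` method are hypothetical.

```python
class FixedPriority:
    """Pick the first available Agent in the configured order (assumed active-passive)."""

    def __init__(self, agents):
        self.agents = list(agents)

    def select(self, available):
        for agent in self.agents:
            if agent in available:
                return agent
        return None  # no Agent of the cluster is reachable


class RoundRobin:
    """Rotate through the Agents, skipping unavailable ones (assumed active-active)."""

    def __init__(self, agents):
        self.agents = list(agents)
        self._next = 0  # index of the Agent to try first on the next call

    def select(self, available):
        count = len(self.agents)
        for offset in range(count):
            index = (self._next + offset) % count
            if self.agents[index] in available:
                self._next = (index + 1) % count
                return self.agents[index]
        return None  # no Agent of the cluster is reachable
```

With fixed priority a secondary Agent only receives work while the primary is unavailable; with round-robin every available Agent receives a share of the load.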

- execute job chains independently of Master availability
- resolve more complex dependencies with Master, e.g. checks of the job history or of external events on other machines
- report job history and log information back to Master
- run distributed recovery files for recovery purposes
- allow access by a number of Master instances
- provide resilience features for reconciliation after loss of the Master connection
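
One way to picture reporting and reconciliation after a connection loss is a local buffer on the Agent that keeps job history events until the Master has accepted them. The sketch below is an assumption about how this could work, not a description of the actual design; `HistoryBuffer`, `record`, `flush` and the `send` callback are hypothetical names, and a failed delivery is assumed to surface as `ConnectionError`.

```python
from collections import deque


class HistoryBuffer:
    """Hold job history events on the Agent until the Master has received them."""

    def __init__(self):
        self._pending = deque()
        self._seq = 0  # increasing sequence number so the Master can detect gaps

    def record(self, job, state):
        self._seq += 1
        self._pending.append({"seq": self._seq, "job": job, "state": state})

    def pending(self):
        return len(self._pending)

    def flush(self, send):
        """Deliver events in order; stop at the first failure and keep the rest."""
        delivered = 0
        while self._pending:
            try:
                send(self._pending[0])
            except ConnectionError:
                break  # Master unreachable: retry on the next connection
            self._pending.popleft()  # drop only after successful delivery
            delivered += 1
        return delivered
```

Because an event is removed only after `send` returns, nothing is lost when the connection drops mid-flush; the Master may see a repeated event after a retry and would need to ignore sequence numbers it already holds.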

...