Introduction

The Controller Status Operation Script offered for Unix Shell can be applied to perform frequently used status operations on Controllers and Agents.

Use of a Controller Cluster or Agent Cluster is subject to the JS7 - License.

Controller Status Operation Script

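Operation              Object
terminate / restart    Standalone Controller, Controller Cluster
cancel / restart       Standalone Controller, Controller Cluster
check                  Standalone Controller, Controller Cluster
switch-over            Controller Cluster
appoint-nodes          Controller Cluster
confirm-loss           Controller Cluster
enable / disable       Standalone Agent
reset                  Standalone Agent
switch-over            Agent Cluster
confirm-loss           Agent Cluster
reset                  Agent Cluster
enable / disable       Subagent
reset                  Subagent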


The script is offered for download and can be applied for frequently used status operations:

  • The script is available for Linux and macOS® using the bash shell.
  • The script terminates with exit code 0 to signal successful execution, with exit code 1 for command line argument errors and with exit code 4 for non-recoverable errors. Exit code 3 signals that no matching objects have been found.
  • The script is intended as a baseline example for customization by JS7 users and by SOS within the scope of professional services. Examples make use of JS7 Release 2.7.2, bash 4.2, curl 7.29.0 and jq 1.6.0.

Prerequisites

The script requires the curl utility and the jq utility to be available from the operating system.

jq ships with the MIT license, see https://opensource.org/licenses/MIT.
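
Availability of both utilities can be verified before the script is called. The following is a minimal sketch, not part of operate-controller.sh; the function name missing_tools is an assumption made for illustration.

```shell
# Print the names of the given utilities that are not found in PATH,
# separated by spaces. Returns 1 if at least one utility is missing.
# missing_tools is a hypothetical helper, not part of operate-controller.sh.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="${missing:+$missing }$tool"
    done
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}

# Report missing prerequisites of the script on stderr.
if ! missing=$(missing_tools curl jq); then
    echo "missing prerequisites: $missing" >&2
fi
```

A wrapper script would typically stop at this point if the check fails.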

Download

Download: operate-controller.sh

Usage

Invoking the script without arguments displays the usage clause:


Usage
Usage: operate-controller.sh [Command] [Options] [Switches]

  Commands:
    terminate           --controller-id [--controller-url] [--switch-over]
    restart             --controller-id [--controller-url] [--switch-over]
    cancel              --controller-id [--controller-url]
    cancel-restart      --controller-id [--controller-url]
    switch-over         --controller-id
    appoint-nodes       --controller-id
    confirm-loss        --controller-id
    check               --controller-id --controller-url
    enable-agent        --controller-id --agent-id
    disable-agent       --controller-id --agent-id
    reset-agent         --controller-id --agent-id [--force]
    switch-over-agent   --controller-id --agent-id
    confirm-loss-agent  --controller-id --agent-id
    enable-subagent     --controller-id --subagent-id
    disable-subagent    --controller-id --subagent-id
    reset-subagent      --controller-id --subagent-id [--force]

  Options:
    --url=<url>                        | required: JOC Cockpit URL
    --user=<account>                   | required: JOC Cockpit user account
    --password=<password>              | optional: JOC Cockpit password
    --ca-cert=<path>                   | optional: path to CA Certificate used for JOC Cockpit login
    --client-cert=<path>               | optional: path to Client Certificate used for login
    --client-key=<path>                | optional: path to Client Key used for login
    --timeout=<seconds>                | optional: timeout for request, default: 60
    --controller-id=<id>               | required: Controller ID
    --controller-url=<url>             | optional: Controller URL for connection test
    --agent-id=<id[,id]>               | optional: Agent IDs
    --subagent-id=<id[,id]>            | optional: Subagent ID
    --audit-message=<string>           | optional: audit log message
    --audit-time-spent=<number>        | optional: audit log time spent in minutes
    --audit-link=<url>                 | optional: audit log link
    --log-dir=<directory>              | optional: path to directory holding the script's log files

  Switches:
    -h | --help                        | displays usage
    -v | --verbose                     | displays verbose output, repeat to increase verbosity
    -p | --password                    | asks for password
    -o | --switch-over                 | switches over the active role to the standby instance
    -f | --force                       | forces reset on Agent
    --show-logs                        | shows log output if --log-dir is used
    --make-dirs                        | creates directories if they do not exist

see https://kb.sos-berlin.com/x/9YZvCQ

Commands

  • terminate
    • Terminates a Controller instance. If a Controller Cluster is used, no fail-over will occur because normal termination is not considered a failure.
    • Users can apply the --switch-over switch to shift the active role in a Controller Cluster on termination of the active cluster member.
  • restart
    • Restarts a Controller instance. If a Controller Cluster is used, no fail-over will occur because normal termination is not considered a failure.
    • Users can apply the --switch-over switch to shift the active role in a Controller Cluster on termination of the active cluster member. After restart the Controller instance will take the standby role in a Controller Cluster.
  • cancel
    • Cancels a Controller instance. The Controller will immediately disconnect from Agents, will not create a journal snapshot and will terminate. The command causes fail-over in a Controller Cluster.
  • cancel-restart
    • The command combines the operations to cancel and to restart a Controller instance. The command causes fail-over in a Controller Cluster.
    • After restart the Controller instance will take the standby role in a Controller Cluster.
  • switch-over
    • Shifts the active role in a Controller Cluster.
  • appoint-nodes
    • The command can be used if a Controller Cluster is not coupled on initial operation.
    • The command is automatically sent by JOC Cockpit to Controller instances after restart.
  • confirm-loss
    • The command can be used if the active JOC Cockpit Cluster Watch was not available at the point in time when the active Controller Cluster member failed.
    • Users can confirm that the failed Controller Cluster member is in fact not running, which allows the remaining Controller Cluster member to take the active role.
  • check
    • Tests the connection between JOC Cockpit and a Controller instance. The operation is available before a Controller is registered.
    • The --controller-url option must specify the URL that JOC Cockpit uses to connect to the Controller.
  • enable-agent
    • Agents can be enabled after having been disabled. Enabled Agents are considered for job execution.
  • disable-agent
    • When Agents are disabled, they are not considered for job execution. Running jobs can continue until completion.
  • reset-agent
    • When an Agent is reset, the Agent will terminate and will restart. Job processes running in the Agent will be forcibly terminated and orders will be set to the failed state. When a forced reset is performed, the operation forces an Agent to be reinitialized, to drop its journal and to be dedicated to the current Controller. Users are advised to double-check that an Agent is not dedicated to a different Controller before using the --force switch.
  • switch-over-agent
    • Users can switch over the active role between Director Agents in an Agent Cluster.
  • confirm-loss-agent
    • The command can be used if the active Controller Cluster Watch was not available at the point in time when the active Director Agent in an Agent Cluster failed. Users can confirm that the failed Director Agent is in fact not running, which allows the remaining Director Agent to take the active role.
  • enable-subagent
    • Subagents can be enabled after having been disabled. Enabled Subagents are considered for job execution.
  • disable-subagent
    • When Subagents are disabled, they are not considered for job execution. Running jobs can continue until completion.
  • reset-subagent
    • When a Subagent is reset, the Subagent will terminate and will restart. Job processes running in the Subagent will be forcibly terminated and orders will be set to the failed state. When a forced reset is performed, the operation forces a Subagent to be reinitialized and to be dedicated to the current Agent Cluster. Users are advised to double-check that a Subagent is not dedicated to a different Agent Cluster before using the --force switch.
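
According to the usage clause, the commands differ in the option they require in addition to --controller-id. A wrapper script could check this before calling operate-controller.sh. The sketch below is an illustration only; the function name required_option is hypothetical and the mapping is taken from the usage clause, not from the script's source.

```shell
# Print the option that a command requires in addition to --controller-id.
# Prints nothing for commands that need no further option and returns 1
# for unknown commands. required_option is a hypothetical helper.
required_option() {
    case "$1" in
        check)
            echo "--controller-url" ;;
        enable-agent|disable-agent|reset-agent|switch-over-agent|confirm-loss-agent)
            echo "--agent-id" ;;
        enable-subagent|disable-subagent|reset-subagent)
            echo "--subagent-id" ;;
        terminate|restart|cancel|cancel-restart|switch-over|appoint-nodes|confirm-loss)
            ;;
        *)
            echo "unknown command: $1" >&2
            return 1 ;;
    esac
}

# Example: which option does reset-subagent need?
required_option reset-subagent
```

For Controller Cluster members the terminate, restart, cancel and cancel-restart commands additionally expect --controller-url, which this sketch does not cover.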

Options

  • --url
    • Specifies the URL of JOC Cockpit, for example --url=http://localhost:4446.
  • --user
    • Specifies the user account for login to JOC Cockpit. If JS7 - Identity Services are available for Client authentication certificates that are specified with the --client-cert and --client-key options then their common name (CN) attribute has to match the user account.
    • If a user account is specified then a password can be specified using the --password option or interactive keyboard input can be prompted using the -p switch.
  • --password
    • Specifies the password used for the account specified with the --user option for login to JOC Cockpit.
    • Consider use of the -p switch offering a secure option for interactive keyboard input.
  • --ca-cert
    • Specifies the path to a file in PEM format that holds the Root CA Certificate and optionally Intermediate CA Certificates to verify HTTPS connections to JOC Cockpit.
  • --client-cert
    • Specifies the path to a file in PEM format that holds the Client Certificate if HTTPS mutual authentication is used.
  • --client-key
    • Specifies the path to a file in PEM format that holds the Client Private Key if HTTPS mutual authentication is used.
  • --timeout
    • Specifies the maximum duration for requests to the JS7 REST Web Service. Default: 60 seconds.
  • --controller-id
    • Specifies the identification of the Controller.
  • --controller-url
    • When used with the check command, specifies the protocol, host and optionally port of the Controller instance to which the connection is tested.
    • When using the terminate, restart, cancel and cancel-restart commands for a Controller Cluster, the Controller URL must be specified.
  • --agent-id
    • The Agent ID specifies a unique identifier for a Standalone Agent or Agent Cluster. Agents are identified from their Agent ID.
    • When used with the enable-agent and disable-agent commands more than one Agent ID can be specified separated by comma.
  • --subagent-id
    • The Subagent ID specifies a unique identifier for a Subagent in an Agent Cluster. Subagents are identified from their Subagent ID.
    • When used with the enable-subagent, disable-subagent and reset-subagent commands, the option specifies the related Subagent.
    • When used with the enable-subagent and disable-subagent commands more than one Subagent ID can be specified separated by comma.
  • --audit-message
    • Specifies a message that is made available to the Audit Log.
    • Specification of Audit Log messages can be enforced on a per user basis and for a JS7 environment.
  • --audit-time-spent
    • Specifies the time spent to perform an operation which is added to the Audit Log.
    • The option can be specified if the --audit-message option is used.
  • --audit-link
    • Specifies a link (URL) which is added to the Audit Log.
    • The option can be specified if the --audit-message option is used.
  • --log-dir
    • If a log directory is specified then the script will log information about processing steps to a log file in this directory.
    • File names are created according to the pattern: operate-controller.<yyyy>-<MM>-<dd>T<hh>-<mm>-<ss>.log
    • For example: operate-controller.2022-03-19T20-50-45.log
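
The options above can be collected in bash arrays and combined per call. The following sketch assumes HTTPS mutual authentication and Audit Log options; the host names, file paths, account name and ticket link are hypothetical placeholders. The command line is printed with one argument per line instead of being executed.

```shell
# Connection options; host, account and certificate paths are hypothetical.
request_options=(
    --url=https://joc.example.com:4443
    --user=ops
    --controller-id=controller
    --ca-cert=/etc/js7/certs/root-ca.pem
    --client-cert=/etc/js7/certs/ops.pem
    --client-key=/etc/js7/private/ops.key
)

# Audit Log options; the message contains spaces and stays one argument.
audit_options=(
    --audit-message="Disable Agent for maintenance"
    --audit-time-spent=5
    --audit-link=https://tickets.example.com/OPS-1
)

# Print the resulting command line, one argument per line.
printf '%s\n' ./operate-controller.sh disable-agent \
    "${request_options[@]}" "${audit_options[@]}" --agent-id=StandaloneAgent
```

Quoting the array expansion as "${request_options[@]}" keeps values with spaces, such as the audit message, intact as single arguments.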

Switches

  • -h | --help
    • Displays usage.
  • -v | --verbose
    • Displays verbose log output that includes requests and responses with the JS7 REST Web Service.
    • When used twice as with -v -v then curl verbose output will be displayed.
  • -p | --password
    • Asks the user for interactive keyboard input of the password used for the account specified with the --user option.
    • The switch is used for secure interactive input as an alternative to use of the option --password=<password>.
  • -o | --switch-over
    • Specifies for terminate and restart commands to switch the active role in a Controller Cluster.
  • -f | --force
    • When used with the reset-agent command for a Standalone Agent or Agent Cluster, and when used with the reset-subagent command for a Subagent, the switch specifies that the Agent will terminate, will drop its journal and will restart. When resetting an Agent, job processes running in the Agent will be forcibly terminated and orders will be set to the failed state.
    • The operation forces an Agent to be reinitialized and to be dedicated to the current Controller, or to the current Agent Cluster in the case of Subagents. Users are advised to double-check that an Agent is not dedicated to a different Controller or Agent Cluster before using the switch.
  • --show-logs
    • Displays the log output created by the script if the --log-dir option is used.
  • --make-dirs
    • If directories are missing that are indicated with the --log-dir option then they will be created.
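
Because the --force switch reinitializes an Agent, a wrapper script might ask for explicit confirmation before passing it on. This is a minimal sketch of such a guard; the function name confirm_force and the wording of the prompt are assumptions and not part of operate-controller.sh.

```shell
# Ask for confirmation before a forced reset of the given Agent.
# Reads one line from stdin and succeeds only if the answer is "yes".
# confirm_force is a hypothetical helper.
confirm_force() {
    local answer=""
    printf 'Forced reset of %s drops its journal. Type "yes" to continue: ' "$1" >&2
    read -r answer || true
    [ "$answer" = "yes" ]
}

# Possible use, shown as a comment because it waits for keyboard input:
# if confirm_force StandaloneAgent; then
#     ./operate-controller.sh reset-agent "${request_options[@]}" --agent-id=StandaloneAgent --force
# fi
```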

Exit Codes

  • 0: operation successful
  • 1: argument errors
  • 3: no objects found
  • 4: JS7 REST Web Service is not reachable or reports errors
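
Calling scripts can branch on these exit codes. The sketch below maps each documented code to its meaning and reports it after running a command; the function names describe_exit_code and run_and_report are hypothetical.

```shell
# Translate an exit code of operate-controller.sh into its documented meaning.
describe_exit_code() {
    case "$1" in
        0) echo "operation successful" ;;
        1) echo "argument errors" ;;
        3) echo "no objects found" ;;
        4) echo "JS7 REST Web Service is not reachable or reports errors" ;;
        *) echo "unexpected exit code" ;;
    esac
}

# Run the given command, report its exit code on stderr and pass it on.
run_and_report() {
    local rc=0
    "$@" || rc=$?
    echo "exit code $rc: $(describe_exit_code "$rc")" >&2
    return "$rc"
}

# Possible use:
# run_and_report ./operate-controller.sh switch-over "${request_options[@]}"
```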

Examples

The following examples illustrate typical use cases for status operations on Controllers and Agents.

Terminating, Restarting, Cancelling Controllers

Termination and restart of a Controller instance are offered by a number of commands.

Example for Terminating, Restarting, Cancelling Standalone Controller
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# terminate Standalone Controller
./operate-controller.sh terminate ${request_options[@]}          

# restart Standalone Controller
./operate-controller.sh restart ${request_options[@]}

# cancel Standalone Controller
./operate-controller.sh cancel ${request_options[@]}

# cancel and restart Standalone Controller
./operate-controller.sh cancel-restart ${request_options[@]}


When terminating or restarting a member of a Controller Cluster, the --controller-url option must be used to specify which Controller instance should be terminated or restarted.

Example for Terminating, Restarting, Cancelling Controller Cluster
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# terminate Controller Cluster instance
./operate-controller.sh terminate ${request_options[@]} --controller-url=http://localhost:4444

# restart Controller Cluster instance
./operate-controller.sh restart ${request_options[@]} --controller-url=http://localhost:4444

# cancel Controller Cluster instance
./operate-controller.sh cancel ${request_options[@]} --controller-url=http://localhost:4444

# cancel and restart Controller Cluster instance
./operate-controller.sh cancel-restart ${request_options[@]} --controller-url=http://localhost:4444

Switching-over, Appointing Nodes and Confirming Node Loss for Controller Cluster

Users can switch over the active role in a Controller Cluster.

The appoint-nodes command is available if a Controller Cluster is not coupled on initial operation.

The confirm-loss command can be used if the active JOC Cockpit Cluster Watch was not available at the point in time when the active Controller Cluster member failed.

Example for Switching-over, Appointing Nodes and Confirming Node Loss for Controller Cluster
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# switch-over active role in Controller Cluster
./operate-controller.sh switch-over ${request_options[@]}

# terminate Controller instance and switch-over Controller Cluster
./operate-controller.sh terminate ${request_options[@]} --controller-url=http://localhost:4444 --switch-over

# restart Controller instance and switch-over Controller Cluster
./operate-controller.sh restart ${request_options[@]} --controller-url=http://localhost:4444 --switch-over


# appoint nodes for Controller Cluster
./operate-controller.sh appoint-nodes ${request_options[@]}

# confirm node loss for Controller Cluster
./operate-controller.sh confirm-loss ${request_options[@]} 

Enabling, Disabling, Resetting Agents

When Agents are disabled, they are not considered for job execution.

When an Agent is reset, the Agent will terminate and will restart. Users are advised to double-check that an Agent is not dedicated to a different Controller before using the --force switch.

Example for Enabling, Disabling, Resetting Standalone Agent
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# enable Standalone Agent
./operate-controller.sh enable-agent ${request_options[@]} --agent-id=StandaloneAgent

# disable Standalone Agent
./operate-controller.sh disable-agent ${request_options[@]} --agent-id=StandaloneAgent

# reset Standalone Agent
./operate-controller.sh reset-agent ${request_options[@]} --agent-id=StandaloneAgent

# reset/force Standalone Agent
./operate-controller.sh reset-agent ${request_options[@]} --agent-id=StandaloneAgent --force
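
The enable-agent and disable-agent commands accept more than one Agent ID separated by comma. If the IDs are held in a bash array, the option value can be assembled as sketched here; the helper join_ids and the Agent IDs are hypothetical.

```shell
# Join all arguments with commas, e.g. "a b c" becomes "a,b,c".
# join_ids is a hypothetical helper.
join_ids() {
    local IFS=,
    echo "$*"
}

# Hypothetical Agent IDs.
agents=(Agent_01 Agent_02 Agent_03)
agent_option="--agent-id=$(join_ids "${agents[@]}")"

# Show the option that would be passed to enable-agent or disable-agent.
echo "$agent_option"
```

The resulting option can then be appended to a call such as ./operate-controller.sh disable-agent "${request_options[@]}" "$agent_option".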


For an Agent Cluster the reset-agent command is available. Enabling and disabling are performed at Subagent level.

When an Agent Cluster is reset, the behavior is similar to that for Standalone Agents. Users should be aware that all Subagents in an Agent Cluster will be reset.

Example for Resetting Agent Cluster
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# reset Agent Cluster
./operate-controller.sh reset-agent ${request_options[@]} --agent-id=AgentCluster

# reset/force Agent Cluster
./operate-controller.sh reset-agent ${request_options[@]} --agent-id=AgentCluster --force

Switching-over and Confirming Node Loss for Agent Cluster

Users can switch-over the active role in an Agent Cluster.

The confirm-loss-agent command can be used in a situation when the active Controller Cluster Watch was not available at the point in time of failure of the active Director Agent in an Agent Cluster.

Example for Switching-over and Confirming Node Loss for Agent Cluster
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# switch-over active role in Agent Cluster
./operate-controller.sh switch-over-agent ${request_options[@]} --agent-id=AgentCluster

# confirm node loss for Agent Cluster
./operate-controller.sh confirm-loss-agent ${request_options[@]} --agent-id=AgentCluster

Enabling, Disabling and Resetting Subagents

When Subagents in an Agent Cluster are disabled, they are not considered for job execution.

When a Subagent is reset, the Subagent will terminate and will restart. Users are advised to double-check that a Subagent is not dedicated to a different Agent Cluster before using the --force switch.

Example for Enabling, Disabling and Resetting Subagent
# common options for connection to JS7 REST API
request_options=(--url=http://localhost:4446 --user=root --password=root --controller-id=controller)

# enable Subagent in Agent Cluster
./operate-controller.sh enable-subagent ${request_options[@]} --subagent-id=Subagent_01

# disable Subagent in Agent Cluster
./operate-controller.sh disable-subagent ${request_options[@]} --subagent-id=Subagent_01

# reset Subagent in Agent Cluster
./operate-controller.sh reset-subagent ${request_options[@]} --subagent-id=Subagent_01

# reset/force Subagent in Agent Cluster
./operate-controller.sh reset-subagent ${request_options[@]} --subagent-id=Subagent_01 --force
