- Users can limit health checks to clustered JS7 products. Shutting down a Standalone Agent's host always results in unavailability of that Agent. Limiting health checks to clustered Agents using the `--agent-cluster` switch is therefore recommended.
- Users can improve performance
  - by checking (and later patching) more than one host at the same time, using for example: `--whatif-shutdown=joc-2-0-primary,joc-2-0-secondary`.
  - by executing health checks for hosts in parallel.
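The parallel option could be sketched roughly as follows. This is a minimal illustration, not part of the JS7 distribution: the function name `check_hosts_parallel` and the `HEALTH_CHECK_CMD` override variable are hypothetical, and `request_options` is assumed to be an array of connection options defined by the caller, as in the example script on this page.

```bash
#!/bin/bash

# Assumption: HEALTH_CHECK_CMD is a hypothetical hook that allows the CLI path to be overridden.
HEALTH_CHECK_CMD=${HEALTH_CHECK_CMD:-./bin/operate-joc.sh}

# Runs one health check per host in the background, then reports
# each host's exit code in the order in which the hosts were given.
# Returns 0 if all checks returned 0, otherwise 1.
check_hosts_parallel() {
    local -a pids=()
    local host rc i=0 failed=0

    # start all health checks concurrently
    for host in "$@"; do
        "$HEALTH_CHECK_CMD" health-check "${request_options[@]}" --whatif-shutdown="$host" &
        pids+=("$!")
    done

    # collect the exit code of each background check
    for host in "$@"; do
        wait "${pids[$i]}"
        rc=$?
        echo "$host: rc=$rc"
        [ "$rc" -eq 0 ] || failed=1
        i=$((i+1))
    done

    return "$failed"
}

# Example call (not executed here):
#   check_hosts_parallel diragent-2-0-primary diragent-2-0-secondary
```

Parallel execution only shortens the time needed for checking. Whether the hosts can then be patched at the same time still depends on each individual result.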
```bash
#!/bin/bash

# set common options for connection to the JS7 REST Web Service
request_options=(--url=http://joc-2-0-primary.sos:7446 --user=root --password=root --ca-cert=./root-ca.crt --controller-id=controller --agent-cluster)

# hosts to be patched
hosts=(joc-2-0-primary joc-2-0-secondary controller-2-0-primary controller-2-0-secondary diragent-2-0-primary diragent-2-0-secondary)

# max. number of tries in case of non-fatal problems
tries=3

# delay in seconds between retries after non-fatal problems
delay=15

for host in "${hosts[@]}"; do
    echo "--------------------------------------------------------"
    echo "CHECKING IMPACT OF HOST SHUTDOWN: $host"
    echo "--------------------------------------------------------"

    try=1
    while [ "$try" -le "$tries" ]; do
        echo ""
        echo "TRY $try/$tries: ./bin/operate-joc.sh health-check ${request_options[*]} --whatif-shutdown=$host"
        echo ""

        ./bin/operate-joc.sh health-check "${request_options[@]}" --whatif-shutdown="$host"
        rc=$?
        echo -n ""
        case "$rc" in
            0)  break
                ;;
            3)  sleep "$delay"
                ;;
            *)  exit "$rc"
                ;;
        esac
        try=$((try+1))
    done

    if [ "$rc" -eq 0 ]
    then
        echo "PATCH CAN BE APPLIED TO HOST: $host"
        # add your code for patching
    else
        echo "PATCH CANNOT BE APPLIED TO HOST: $host, Exit Code: $rc"
        # add your code for error handling
    fi

    echo ""
done
```
Explanations:
- Line 7: specifies the list of hostnames used by clustered JS7 products
- Line 10: specifies the maximum number of tries to perform the health-check. After reboot of a host it can take a few seconds until a cluster is re-established.
- Line 13: specifies the delay between tries. The value should be adjusted if it takes the cluster more time to recouple.
- Lines 29-36: evaluate the health check result:
  - exit code 0 signals an operational cluster,
  - exit code 3 signals that the cluster is not (yet) functional,
  - other exit codes signal component status errors, for example an unavailable Agent.
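The exit code handling described above can also be wrapped in a reusable function. The following is a sketch under assumptions: the name `retry_health_check` and its argument order are hypothetical and not part of the JS7 CLI. It differs from the script above in that it returns the exit code to the caller instead of exiting, and it does not sleep after the last try.

```bash
#!/bin/bash

# Usage: retry_health_check TRIES DELAY COMMAND [ARGS...]
# Assumption: TRIES is at least 1; exit code 3 is treated as "not (yet) functional".
retry_health_check() {
    local tries=$1 delay=$2 try rc=3
    shift 2

    for ((try = 1; try <= tries; try++)); do
        "$@"
        rc=$?
        case "$rc" in
            0)  # cluster is operational
                return 0
                ;;
            3)  # cluster not (yet) functional: wait before the next try, if any
                if [ "$try" -lt "$tries" ]; then
                    sleep "$delay"
                fi
                ;;
            *)  # component status error: retrying does not help
                return "$rc"
                ;;
        esac
    done

    # all tries used up, report the last exit code
    return "$rc"
}

# Example call (not executed here), using the variables of the script above:
#   retry_health_check "$tries" "$delay" ./bin/operate-joc.sh health-check "${request_options[@]}" --whatif-shutdown="$host"
```

Returning the exit code leaves it to the calling script to decide whether to stop the patch run or to continue with the next host.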