Purpose

The Cluster Service operates a passive cluster of JOC Cockpit instances and performs fail-over and switch-over operations between cluster members.
All other JS7 - Services depend on the Cluster Service, which starts them in the active JOC Cockpit instance.

Fail-over

Any JOC Cockpit instances that are running and connected to the same JS7 - Database form a passive cluster. The active cluster member runs the Cluster Service. Passive cluster members monitor the active cluster member's ongoing operation by checking its heartbeats. If the active cluster member fails, one of the passive cluster members becomes active and starts its Cluster Service.
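
The heartbeat check performed by a passive cluster member can be sketched as follows. This is a minimal illustration and not the JOC Cockpit implementation: the function name `should_take_over`, the use of timestamps and the strict "greater than" comparison are assumptions. Only the threshold of 60 seconds is taken from the default of the `heart_beat_exceeded_interval` setting.

```python
from datetime import datetime, timedelta

# Default of the heart_beat_exceeded_interval setting, in seconds.
HEART_BEAT_EXCEEDED_INTERVAL = 60


def should_take_over(last_heartbeat: datetime, now: datetime,
                     exceeded_interval: int = HEART_BEAT_EXCEEDED_INTERVAL) -> bool:
    """Hypothetical helper: return True if the active member's heartbeat is overdue.

    Assumption: the heartbeat counts as overdue only when its age is strictly
    greater than the configured interval.
    """
    return (now - last_heartbeat) > timedelta(seconds=exceeded_interval)
```

With the defaults, a passive member would consider taking over once the last heartbeat stored in the database is more than 60 seconds old, which corresponds to two missed heartbeats at a `polling_interval` of 30 seconds.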

Switch-over

The switch-over operation is technically similar to the fail-over operation. However, switch-over is triggered from the GUI or the API and allows the Cluster Service to stop any running background services in an orderly manner.

Configuration

Location

The "Settings" are built-in and cannot be modified from the JOC Cockpit GUI.

Configuration Items

Section	Setting	Default	Required	Purpose
cluster	`heart_beat_exceeded_interval`	60	no	The duration in seconds after which heartbeats from the active cluster member to the database are considered overdue and fail-over to the next passive cluster member should occur.
	`polling_interval`	30	no	The interval in seconds between sending heartbeats to the database.
	`polling_wait_interval_on_error`	2	no	The interval in seconds to wait before polling continues after an error has occurred, e.g. due to transactional concurrency.
	`switch_member_wait_counter_on_success`	10	no	The number of retries to wait for the answer from the last active cluster member after its deactivation/activation.
	`switch_member_wait_interval_on_success`	5	no	The maximum number of seconds to wait for a cluster member to become active. Maximum wait time = `switch_member_wait_counter_on_success` * `switch_member_wait_interval_on_success` + execution time.
	`switch_member_wait_counter_on_error`	10	no	The maximum number of retries to switch the cluster to a different member in case of errors, e.g. due to transactional concurrency.
	`switch_member_wait_interval_on_error`	2	no	The maximum number of seconds to wait for a cluster member to become active after an error. Maximum wait time = `switch_member_wait_counter_on_error` * `switch_member_wait_interval_on_error` + execution time.
	`current_is_cluster_member`	true	no	Allows the cluster to switch to this instance.
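
The maximum wait times given in the table can be worked out from the default values. The following sketch only reproduces the arithmetic. The function name `max_wait_seconds` is hypothetical, and the execution time is left out because it is not a fixed value.

```python
def max_wait_seconds(counter: int, interval: int) -> int:
    """Upper bound of the wait time in seconds, excluding execution time."""
    return counter * interval


# Defaults from the table above.
on_success = max_wait_seconds(counter=10, interval=5)  # 10 * 5 = 50 seconds
on_error = max_wait_seconds(counter=10, interval=2)    # 10 * 2 = 20 seconds

print(f"switch-over, success case: up to {on_success}s plus execution time")
print(f"switch-over, error case:   up to {on_error}s plus execution time")
```

With the defaults, a switch to a different cluster member would therefore wait up to 50 seconds in the success case and up to 20 seconds in the error case, each plus execution time.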

Logging

The Cluster Service logs general messages, warnings and errors in the joc.log file.
More detailed information is additionally logged in the Main Log service-cluster.log file.
In addition to the Main Log, detailed debug information is logged in the Debug Log service-cluster-debug.log file.
For details see the JS7 - Log Files and Locations article.
