...

The Watchdog Script is provided for Unix and Windows. It is not used when operating the Agent from a Container or from a Windows Service.

The Watchdog Script is available from the following locations:

  • Unix
    • <agent-home>/bin/agent_watchdog.sh
  • Windows
    • <agent-home>\bin\agent_watchdog.cmd
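
The locations above can be resolved per platform as in the following minimal sketch. The function name `watchdog_script_path` and its `windows` parameter are hypothetical and not part of the product. Only the `bin` directory and the two script names are taken from the list above.

```python
import os
from pathlib import Path


def watchdog_script_path(agent_home, windows=None):
    """Return the expected Watchdog Script location below <agent-home>.

    Hypothetical helper: if 'windows' is None, the current platform decides.
    """
    if windows is None:
        windows = os.name == "nt"
    script = "agent_watchdog.cmd" if windows else "agent_watchdog.sh"
    return Path(agent_home) / "bin" / script
```

The explicit `windows` argument is only there so that both variants can be checked from a single platform.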

Starting Agent / Restarting Agent

...

  • For the command line operation: agent.sh|.cmd restart
  • For the reset and reset forced operations on Agents, which are available from the JOC Cockpit's Manage Controllers/Agents page.

Terminating Processes after Crash

If the Agent crashes, users might find a number of job processes and related child processes still running. This is undesired behavior, as the outcome of the jobs and their execution results would not be known.

The Agent keeps track of processes and child processes created for jobs. The Watchdog Script will pick up this information and will proceed as follows:

  • if a period is specified with the --sigkill-delay option of the Agent Start Script, see JS7 - Agent Command Line Operation
    • send job processes and child processes the SIGTERM signal,
    • wait for termination of job processes and child processes,
  • send remaining processes and child processes the SIGKILL signal.
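
The sequence above can be illustrated with a minimal sketch. It is not the Watchdog Script's actual implementation: the function names are hypothetical, the list of PIDs is assumed to be known already, and the code is limited to Unix because it relies on `os.kill` with `SIGTERM` and `SIGKILL`.

```python
import os
import signal
import time


def is_alive(pid):
    """Return True if a process with the given PID still exists (Unix)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def send_signal(pid, sig):
    """Send a signal and ignore processes that have already ended."""
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        pass


def terminate_job_processes(pids, sigkill_delay=None, poll_interval=0.1):
    """Hypothetical helper mirroring the steps described above.

    pids: PIDs of job processes and child processes (assumed to be known).
    sigkill_delay: seconds to wait after SIGTERM, or None if no period is set.
    """
    remaining = list(pids)
    if sigkill_delay:
        # A period is specified: ask processes to terminate, then wait.
        for pid in remaining:
            send_signal(pid, signal.SIGTERM)
        deadline = time.monotonic() + sigkill_delay
        while time.monotonic() < deadline:
            remaining = [pid for pid in remaining if is_alive(pid)]
            if not remaining:
                break
            time.sleep(poll_interval)
    # Whatever is left (or everything, if no period was set) is killed.
    for pid in remaining:
        send_signal(pid, signal.SIGKILL)
```

A call such as `terminate_job_processes(pids, sigkill_delay=5)` would give processes five seconds to end after SIGTERM. Without a delay the sketch sends SIGKILL immediately.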

Feature availability: starting from release 2.7.2, see JS-2148.

Logging

The Watchdog Script captures output written to the stdout/stderr channels throughout the lifetime of the Agent.

Log output is written to the <agent-data>/logs/watchdog.log file.

  • The log file reports the command line used to start the Agent.
  • The log file holds information about use of a JS7 - License.
  • The log file is an important source for analysis in case of problems:
    • Any warnings and errors that do not make it into Log4j logging are reported to the log file.
    • The same applies to warnings and errors that occur before the JVM is initialized and before Log4j logging can start, for example if an incompatible Java version is used when starting the Agent.
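
The general technique of recording a command line and capturing a child process's stdout/stderr in one log file can be sketched as follows. This is an illustration under assumptions, not the Watchdog Script itself: the function name `run_and_capture` and the `command line:` prefix are hypothetical, and the real format of watchdog.log is not shown here.

```python
import shlex
import subprocess
from pathlib import Path


def run_and_capture(command, log_file):
    """Run a command and append its stdout/stderr to a log file.

    Hypothetical helper: the first line written per run is the command line.
    """
    log_path = Path(log_file)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        log.write("command line: " + shlex.join(command) + "\n")
        log.flush()  # keep our line ahead of the child's output
        completed = subprocess.run(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,  # merge stderr into the same file
        )
    return completed.returncode
```

Because the redirection happens at the process level, output is captured even when it is written before any logging framework inside the child process has started.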

Watchdog Operation

Users can check the process list to verify that both the watchdog process and the Agent process are running in parallel:

...

  • The Agent cannot be restarted when operated on Unix or Windows.
  • The reset and reset forced operations on Agents available from the JOC Cockpit GUI cannot be performed.

Terminating Processes after Crash

...

  • In case of Agent crash no job processes and related child processes

...

  • will be terminated.

Apart from the above effects, the Agent will continue normal operation if the watchdog process is not available.

The Agent keeps track of processes and child processes created for jobs. The Watchdog Script will pick up this information and will

  • send all processes and child processes created by the Agent the SIGTERM signal,
  • wait for the period specified with the --sigkill-delay option of the Agent Start Script, see JS7 - Agent Command Line Operation,
  • send remaining processes and child processes the SIGKILL signal.



...
