...
The Watchdog Script is available from the following location:
- Unix
<agent-home>/bin/agent_watchdog.sh
- Windows
<agent-home>\bin\agent_watchdog.cmd
Starting Agent / Restarting Agent
...
- For the command line operation: agent.sh|.cmd restart
- For the reset and reset forced operations on Agents that are available from the JOC Cockpit's Manage Controllers/Agents page.
Terminating Processes after Crash
If the Agent crashes, users might find a number of job processes and related child processes still running. It is undesirable for such processes to continue running, as the outcome of the jobs and their execution results would not be known.
The Agent keeps track of processes and child processes created for jobs. The Watchdog Script will pick up this information and will proceed as follows:
- if a period is specified with the --sigkill-delay option of the Agent Start Script, see JS7 - Agent Command Line Operation,
  - send job processes and child processes the SIGTERM signal,
  - wait for termination of job processes and child processes,
- send remaining processes and child processes the SIGKILL signal.
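The sequence above (SIGTERM, wait, SIGKILL) can be sketched in Python as follows. This is an illustration only and not the Watchdog Script's actual implementation: the function name terminate_gracefully and the parameter sigkill_delay are hypothetical, and the sketch works on subprocess.Popen handles because the format in which the Agent records its process information is not described here.

```python
import subprocess
import time


def terminate_gracefully(procs, sigkill_delay):
    """Send SIGTERM, wait up to sigkill_delay seconds, then SIGKILL survivors.

    procs is a list of subprocess.Popen objects (an assumption made for this
    sketch). Returns the processes that had to be killed forcibly.
    """
    # Step 1: ask every process that is still running to terminate.
    # Popen.terminate() sends SIGTERM on Unix.
    for proc in procs:
        if proc.poll() is None:
            proc.terminate()

    # Step 2: wait for termination, sharing one deadline across all processes.
    deadline = time.monotonic() + sigkill_delay
    survivors = []
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            survivors.append(proc)

    # Step 3: forcibly kill whatever is left.
    # Popen.kill() sends SIGKILL on Unix.
    for proc in survivors:
        proc.kill()
        proc.wait()
    return survivors
```

Called with a list of running job processes and a delay in seconds, the function returns an empty list if every process honored SIGTERM within the delay.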
Logging
The Watchdog Script will capture output to the stdout/stderr channels through the lifetime of the Agent.
Log output is stored to the <agent-data>/logs/watchdog.log file.
- The log file reports the command line used to start the Agent.
- The log file holds information about use of a JS7 - License.
- The log file is an important source for analysis in case of problems:
- Any warnings and errors that do not make it into Log4j logging are reported to the log file.
- The same applies to warnings and errors that occur before the JVM is initialized and before Log4j logging can start, for example if an incompatible Java version is used when starting the Agent.
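When analyzing problems, a first pass over the log file can be automated. The following is a minimal sketch that assumes problem lines contain a keyword such as "warn" or "error"; the actual line format of watchdog.log is not specified here, and the function names and keyword list are hypothetical.

```python
from pathlib import Path

# Assumption: problem lines contain one of these keywords (case-insensitive).
# The real format of watchdog.log is not specified in this article.
KEYWORDS = ("warn", "error", "exception")


def find_problem_lines(log_text):
    """Return (line_number, line) pairs whose text contains a keyword."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in KEYWORDS):
            hits.append((number, line))
    return hits


def scan_watchdog_log(agent_data):
    """Scan logs/watchdog.log below the Agent's data directory."""
    log_file = Path(agent_data) / "logs" / "watchdog.log"
    # errors="replace" avoids failures on bytes that are not valid text.
    return find_problem_lines(log_file.read_text(errors="replace"))
```

The value passed to scan_watchdog_log corresponds to the <agent-data> directory; the keyword list should be adjusted to the messages actually found in the log.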
Watchdog Operation
Users can verify from the list of processes used for the Agent that both the watchdog process and the Agent process are running in parallel:
...
- The Agent cannot be restarted when operated for Unix or Windows.
- The reset and reset forced operations on Agents available from the JOC Cockpit GUI cannot be performed.
Terminating Processes after Crash
...
- In case of Agent crash no job processes and related child processes
...
- will be terminated.
Besides the above effects, the Agent will continue normal operation if the Watchdog process is not available.
The Agent keeps track of processes and child processes created for jobs. The Watchdog Script will pick up this information and will
- send all processes and child processes created by the Agent the SIGTERM signal,
- wait for the period specified with the --sigkill-delay option of the Agent Start Script, see JS7 - Agent Command Line Operation,
- send remaining processes and child processes the SIGKILL signal.
Logging
The Watchdog Script will capture output to the stdout/stderr channels through the lifetime of an Agent.
Log output is stored to the <agent-data>/logs/watchdog.log file.
...