...
- Jobs are executed with JS7 Agents which handle termination of jobs.
- Shell Jobs and JVM Jobs are under control of the Agent which terminates running jobs.
- Jobs implementing use of an SSH Client cannot guarantee that a job's child processes are terminated as they are controlled by the remote SSHD server. The JS7 - JITL SSHJob provides the means to reliably terminate child processes.
- Termination of jobs can be triggered by users from the JOC Cockpit and can be performed automatically if jobs exceed a given timeout.
- As a prerequisite for termination by the JOC Cockpit, the Controller has to be connected to the JOC Cockpit and the Agent has to be accessible to the Controller.
...
- The job is configured with a timeout setting: if job execution exceeds the timeout then the job will be terminated by the Agent.
- Jobs can be terminated using GUI operations and by use of the JS7 - REST Web Service API:
- The Cancel/Kill operation terminates a running job and fails the order.
- The Suspend/Kill operation terminates a running job and suspends the order.
- Failed and suspended orders can be resumed.
Terminating Jobs on Unix
In Unix environments, jobs receive the following signals from the Agent:
- When a job is to be terminated then the Agent first sends a SIGTERM signal.
  - This signal can be ignored or it can be handled by a job script. For shell jobs a trap can be defined to, for example, perform cleanup tasks such as disconnecting from a database or removing temporary files.
  - Note that this applies to job scripts that directly include shell code. If instead the job script includes calls to external shell scripts or programs then the Agent's SIGTERM signal is not forwarded to child processes running for external scripts or programs. To prevent this situation external shell scripts or programs can be called like this: exec /tmp/some_script.sh
  - The exec command causes any external scripts or programs to be executed with the process of the current job script (instead of creating a new child process) and guarantees that the SIGTERM signal is received by the process.
- The job configuration includes the Grace Timeout setting:
  - The Grace Timeout duration is applied after a SIGTERM signal (corresponding to the command kill -15) has been sent by the Agent. This allows the job to terminate on its own, for example after some cleanup has been performed.
- Should the job still be running after the specified Grace Timeout duration then the Agent will send a SIGKILL signal (corresponding to the command kill -9) that kills the OS process.
- Note that it is recommended for job scripts that create child processes not to terminate on receipt of a SIGTERM signal before child processes are terminated.
  - Job scripts can use the wait command to wait for completion of child processes as this command prevents termination of the job script on receipt of SIGTERM.
  - Job scripts including any child processes will then be reliably killed by SIGKILL after the specified Grace Timeout.
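The signal handling described above can be tried out with a small sketch. It is not taken from the JS7 documentation: the names job.sh, cleanup and marker are hypothetical, and the outer script merely plays the role of the Agent by sending SIGTERM. The stand-in job removes a temporary file, passes the signal on to its child process and waits for it before exiting.

```shell
#!/usr/bin/env bash
# Sketch only: job.sh, cleanup and the marker file are hypothetical names.
workdir=$(mktemp -d)

cat > "$workdir/job.sh" <<'EOF'
#!/usr/bin/env bash
marker="$1"
cleanup() {
    rm -f "$marker"                     # e.g. remove temporary files
    kill -TERM "$child" 2>/dev/null     # pass the signal on to the child process
    wait                                # do not exit before the child has ended
    exit 143                            # 128 + 15
}
trap cleanup TERM
touch "$marker"
sleep 30 &                              # stands in for a long-running child process
child=$!
wait "$child"                           # returns early when SIGTERM arrives
EOF

bash "$workdir/job.sh" "$workdir/marker" &
job_pid=$!
sleep 1                                 # give the job time to install its trap
kill -TERM "$job_pid"                   # the first signal the Agent would send
job_rc=0
wait "$job_pid" || job_rc=$?
echo "exit code: $job_rc"
```

If the job did not react within the Grace Timeout, the next step would be a SIGKILL, which cannot be trapped.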
The OS commands used by the Agent to send signals include:
Termination signals
Signal | Command
---|---
SIGTERM | /bin/kill <pid>
SIGKILL | /bin/kill -KILL <pid>
- If required for your Agent platform, the commands to send signals can be modified - see the JS7 - Agent Configuration Items article.
Job scripts frequently spawn child processes that have to be terminated in line with their parent process.
By default the OS kills child processes if the parent process is killed. However, this mechanism is not applicable for all situations, depending on the way child processes have been spawned.
Terminating Child Processes starting from Release 2.7.2
- When terminating a job process, the Agent performs the following steps:
  - collect child process PIDs of the job process,
  - send SIGTERM to the job process,
  - wait for one of the following events, whichever arrives first:
    - the Grace Timeout configured with the job expires, see JS7 - Job Instruction,
    - stdout/stderr are released by the job process and child processes,
  - send SIGKILL signal to the job process if the Grace Timeout is exceeded,
  - send SIGTERM signal to child processes for which PIDs have previously been collected; send SIGTERM recursively to child processes of a child process,
  - wait for the delay specified with the --sigkill-delay option of the Agent Start Script, see JS7 - Agent Command Line Operation,
  - send SIGKILL signal to remaining child processes recursively.
- The Agent makes use of Java for process management.
- Users are free to use traps as explained in the chapter below. However, there is no need to add a trap to job scripts as the Agent by default will terminate child processes.
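The order of the steps above can be illustrated with a shell sketch. This is an approximation only: the Agent performs these steps in Java, the helper names collect_children and terminate_job are hypothetical, the availability of pgrep is assumed, and the check for released stdout/stderr is left out.

```shell
#!/usr/bin/env bash
# Illustration only: the Agent performs these steps in Java, not in shell.
# collect_children and terminate_job are hypothetical helper names.

# Print the PIDs of all descendants of process $1 (assumes pgrep is installed).
collect_children() {
    local child
    for child in $(pgrep -P "$1"); do
        echo "$child"
        collect_children "$child"
    done
}

# terminate_job <pid> <grace timeout in seconds> <SIGKILL delay in seconds>
terminate_job() {
    local pid=$1 grace=$2 delay=$3 children child waited=0
    children=$(collect_children "$pid")          # collect child PIDs first
    kill -TERM "$pid" 2>/dev/null || true        # SIGTERM to the job process
    # wait until the job process has gone or the Grace Timeout is exceeded
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$grace" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    kill -KILL "$pid" 2>/dev/null || true        # SIGKILL if still running
    for child in $children; do
        kill -TERM "$child" 2>/dev/null || true  # SIGTERM to collected children
    done
    sleep "$delay"
    for child in $children; do
        kill -KILL "$child" 2>/dev/null || true  # SIGKILL to remaining children
    done
}

# Demo: a stand-in job process that spawns one child and waits for it.
bash -c 'sleep 60 & wait' &
job_pid=$!
sleep 1                                          # let the child start
child_pids=$(collect_children "$job_pid")
terminate_job "$job_pid" 2 1
wait "$job_pid" 2>/dev/null || true              # reap the job process
```

Collecting the PIDs before the first signal is sent matters: once the job process has ended, its children are re-parented and can no longer be found through the parent PID.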
Terminating Child Processes starting from Release 2.1.1
- In order to more reliably kill child processes the Agent uses the kill_task.sh script from its var_<port>/work directory.
  - This script identifies the process tree created by the job script and kills any available child processes.
  - Download: kill_task.sh
- Though the Agent is platform independent it is evident that retrieval of a process tree does not necessarily use the same command (ps) and options for all Unixes.
  - The Agent therefore allows specification of an individual kill script from a command line option if the built-in kill_task.sh script is not applicable to your Unix platform, see JS7 - Agent Operation.
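How a process tree might be retrieved in a portable way can be sketched as follows. This is not the content of kill_task.sh: process_tree is a hypothetical helper that restricts itself to the ps options -A and -o defined by POSIX and builds the tree with awk, under the assumption that both commands are available on the platform.

```shell
#!/usr/bin/env bash
# Sketch only: process_tree is a hypothetical helper, not part of kill_task.sh.

# Print the PIDs of all descendants of process $1.
# Uses only the POSIX ps options -A and -o; other options differ between Unixes.
process_tree() {
    ps -A -o pid= -o ppid= | awk -v root="$1" '
        { parent[$1] = $2 }                 # remember the parent of every process
        END {
            found[root] = 1
            do {                            # repeat until no new descendant is found
                added = 0
                for (pid in parent)
                    if (!(pid in found) && (parent[pid] in found)) {
                        found[pid] = 1
                        added = 1
                    }
            } while (added)
            for (pid in found)
                if (pid != root) print pid
        }'
}

# Demo: a stand-in job process with one child process.
bash -c 'sleep 60 & wait' &
job_pid=$!
sleep 1                                     # let the child start
tree=$(process_tree "$job_pid")
echo "child processes of $job_pid: $tree"
# clean up the demo processes
kill -KILL $tree "$job_pid" 2>/dev/null || true
wait "$job_pid" 2>/dev/null || true
```

An individual kill script for a platform with a different ps implementation would mainly have to adjust the ps call in such a helper.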
Use of Exit Traps
The Short Version
Users can add the following two traps to their Shell Jobs:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
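#!/usr/bin/env bash
# on SIGTERM wait for child processes, then exit with 128+15
trap "wait && exit 143" TERM
# on exit keep the script's return code, wait for child processes, then exit with that code
trap 'rc=$? && wait && exit $rc' EXIT |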
For explanations see the long version.
The Long Version
When a Shell Job script starts a background process and completes (with or without error) without waiting for the child process to terminate, the Agent cannot identify the running child process because its parent process has gone. It is therefore recommended that a trap is added to the shell script. The trap is triggered on termination of the script, independently of whether the script terminates normally or with an error, and prevents the script from terminating immediately while child processes are running. In the event of forced termination, the script continues due to its trap waiting for child processes, and the Agent executes the kill_task.sh script. This script identifies the Shell Job script process and kills the running child processes.
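The effect of the exit trap can be observed with a short sketch. The names job.sh and marker are hypothetical. The stand-in job starts a background process and ends at once with return code 3; because of the trap it does not actually terminate until the background process has completed. The EXIT trap is written in single quotes here so that $? is evaluated when the trap fires and not when it is defined.

```shell
#!/usr/bin/env bash
# Sketch only: job.sh and the marker file are hypothetical names.
workdir=$(mktemp -d)

cat > "$workdir/job.sh" <<'EOF'
#!/usr/bin/env bash
trap 'wait && exit 143' TERM        # 128+15
# single quotes: $? is evaluated when the trap fires, not when it is defined
trap 'rc=$? && wait && exit $rc' EXIT

( sleep 2; touch "$1" ) &           # background child the script does not wait for
exit 3                              # the script itself ends at once, with an error
EOF

job_rc=0
bash "$workdir/job.sh" "$workdir/marker" || job_rc=$?
echo "exit code: $job_rc"
```

The marker file exists when the job returns, which shows that the script stayed alive until its child process had finished, and the original return code is preserved.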
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
#!/usr/bin/env bash
exec /tmp/some_script.sh |
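That exec does not create a new child process can be checked with a small sketch. The file names are hypothetical, and the external script is started through bash only so that the sketch does not depend on execute permissions in the temporary directory. Both scripts print their process ID, which is expected to be the same.

```shell
#!/usr/bin/env bash
# Sketch only: the file names below are hypothetical.
workdir=$(mktemp -d)

cat > "$workdir/some_script.sh" <<'EOF'
#!/usr/bin/env bash
echo "external script runs in process $$"
EOF

cat > "$workdir/job.sh" <<'EOF'
#!/usr/bin/env bash
echo "job script runs in process $$"
# started through bash so that the sketch does not depend on execute permissions
exec bash "$(dirname "$0")/some_script.sh"
EOF

output=$(bash "$workdir/job.sh")
echo "$output"
job_pid=$(echo "$output" | sed -n '1s/.* //p')      # PID printed by the job script
script_pid=$(echo "$output" | sed -n '2s/.* //p')   # PID printed by the external script
```

Since the external script takes over the process of the job script, a SIGTERM sent to the job's PID reaches the external script directly.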
Automation of Exit Traps
JS7 provides an option for applying traps such as those described in the example above. These can be applied to a number of Shell Job scripts via JS7 - Script Includes.
- The trap and the trap function are added to a Script Include like this:
- The Script Include is embedded into any Shell Job scripts from a single line similar to a shebang:
Terminating Jobs on Windows
Windows environments have no concept of termination signals. When a process is terminated, it is killed immediately.
Terminating Child Processes starting from Release 2.7.2
- When terminating a job process, the Agent performs the following steps:
- collect child process PIDs of the job process recursively,
- kill job process and any child processes recursively.
- The Agent makes use of Java for process management.
Terminating Child Processes starting from Release 2.1.1
- The Agent uses the kill_task.cmd script which is available from its var_<port>/work directory.
  - The script uses the taskkill command to kill the job's process and its children.
  - Download: kill_task.cmd
- An individual kill script can be specified with a command line option on Agent startup, see JS7 - Agent Operation.
...