Introduction
- Jobs are executed with JS7 Agents that handle termination of jobs.
- Shell Jobs and JVM Jobs are under control of the Agent that terminates running jobs.
- Jobs that use an SSH Client or the JS7 - JITL SSHJob cannot guarantee that a job's child processes are terminated, as they are controlled by the remote SSHD server.
- Termination of jobs can be caused by users from the JOC Cockpit and can be performed automatically if a job exceeds a given timeout.
- As a prerequisite for termination by JOC Cockpit the Controller has to be connected to JOC Cockpit and the Agent has to be accessible to the Controller.
- See JS-1965
Termination of Jobs
Jobs can be terminated in one of the following ways:
- The job is configured with a timeout setting: if job execution exceeds the timeout then the job will be killed by the Agent.
- Jobs can be killed by use of the GUI operation and by use of the JS7 - REST Web Service API:
- The Cancel/Kill operation kills a running job and fails the order.
- The Suspend/Kill operation kills a running job and suspends the order.
- Failed and suspended orders can be resumed.
Terminating Jobs on Unix
In Unix environments jobs receive the following signals from the Agent:
- When a job should be killed then the Agent first sends a SIGTERM signal.
  - This signal can be ignored or can be handled by a job. For shell scripts a trap can be defined, e.g. to perform cleanup tasks such as disconnecting from a database or removing temporary files.
- The job configuration includes the Grace Timeout setting:
  - The Grace Timeout duration is applied after a SIGTERM signal (corresponding to kill -15) has been sent by the Agent. This allows the job to terminate on its own, for example after some cleanup is performed.
  - Should the job still run after the specified Grace Timeout duration then the Agent sends a SIGKILL signal (corresponding to kill -9) that aborts the OS process.
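The handling of SIGTERM by a Shell Job can be sketched as follows. This is a minimal sketch for the bash shell, not code shipped with JS7: the names job, on_term, tmp_file and work_pid are hypothetical, and the last lines only simulate the signal that the Agent would send.

```bash
#!/bin/bash

# hypothetical job body: performs some work and cleans up on SIGTERM
job() {
    tmp_file=$(mktemp)

    # hypothetical cleanup handler, has to complete within the Grace Timeout
    on_term() {
        echo "SIGTERM received, cleaning up"
        kill "$work_pid" 2>/dev/null   # stop the work still in progress
        rm -f "$tmp_file"              # remove temporary files
        exit 143                       # 128 + 15: conventional exit code for SIGTERM
    }
    trap on_term TERM

    # stand-in for the job's work: running it in the background and waiting
    # allows bash to execute the trap as soon as the signal arrives
    sleep 300 &
    work_pid=$!
    wait "$work_pid"
}

# simulation only: run the job in a subshell and signal it like the Agent would
( job ) &
job_pid=$!
sleep 1
kill -TERM "$job_pid"
wait "$job_pid"
job_rc=$?
echo "job exit code: $job_rc"
```

If the work were started in the foreground instead, bash would defer the trap until the foreground command has completed, which could exceed the Grace Timeout and result in a SIGKILL.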
The OS commands used by the Agent to send signals include:
Termination signals
Signal    Command
SIGTERM   /bin/kill <pid>
SIGKILL   /bin/kill -KILL <pid>
- If required for your Agent platform then the commands can be adjusted, see JS7 - Agent Configuration Items
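The sequence of SIGTERM, Grace Timeout and SIGKILL can be illustrated with the commands listed above. This is a minimal sketch under stated assumptions and not the Agent's implementation: the function terminate_with_grace and its parameters are hypothetical, and the shell's built-in kill is used in place of /bin/kill.

```bash
#!/bin/bash

# terminate_with_grace <pid> <grace_seconds>   (hypothetical helper)
# sends SIGTERM, waits up to <grace_seconds> for the process to end
# and sends SIGKILL if the process is still running after that time
terminate_with_grace() {
    local pid=$1 grace=$2 waited=0

    # SIGTERM is the default signal of kill; nothing to do if the process is gone
    kill "$pid" 2>/dev/null || return 0

    while [ "$waited" -lt "$grace" ]; do
        # kill -0 sends no signal, it only checks that the process exists
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Grace Timeout exceeded: abort the OS process
    kill -KILL "$pid" 2>/dev/null
}
```

A call such as terminate_with_grace "$pid" 15 would give the process 15 seconds to handle SIGTERM before it is aborted.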
Job scripts frequently spawn child processes that have to be killed along with their parent process.
- By default the OS removes child processes if the parent process is killed. However, this mechanism does not apply to all situations, depending on how child processes have been spawned.
- In order to more reliably kill child processes the Agent makes use of the kill_task.sh script from its var_<port>/work directory.
  - This script identifies the process tree created by the job script and kills any available child processes.
  - Download: kill_task.sh
- Though the Agent is platform independent, retrieval of a process tree does not necessarily use the same command (ps) and options on all Unix platforms.
  - The Agent therefore allows an individual kill script to be specified from a command line option should the built-in kill_task.sh script not be applicable to your Unix platform, see JS7 - Agent Operation.
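For readers who consider writing an individual kill script, retrieval of a process tree can be sketched as follows. This is not the logic of the built-in kill_task.sh script: the function list_descendants is hypothetical, and the sketch assumes that the platform's ps command accepts the POSIX options -A and -o with the pid and ppid fields.

```bash
#!/bin/bash

# list_descendants <pid>   (hypothetical helper)
# prints the PIDs of all descendants of <pid>, one per line,
# each child followed by its own descendants
list_descendants() {
    local parent=$1 child

    # "pid=" and "ppid=" suppress the header line of ps
    for child in $(ps -A -o pid= -o ppid= | awk -v p="$parent" '$2 == p { print $1 }'); do
        echo "$child"
        list_descendants "$child"
    done
}
```

An individual kill script could then pass the resulting list of PIDs to the kill command. On platforms where ps uses different options only the ps call would have to be adjusted.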
Use of Exit Traps
If a Shell Job script starts a background process and completes (with or without error) without waiting for the child process to terminate, then the Agent cannot identify the running child process, as its parent process is gone. It is therefore recommended to add a trap to the shell script that is triggered on termination of the script, regardless of whether the script terminates normally or with an error. This prevents the script from terminating immediately while child processes are still running. Instead, in case of forced termination the script continues due to its trap waiting for child processes, and the Agent executes the kill_task.sh script that identifies the process of the Shell Job script and kills any child processes.
Download: jduExitTrap.json
```bash
#!/bin/bash

# define trap for script completion
trap 'JS7TrapOnExit' EXIT

JS7TrapOnExit()
{
    rc=$?
    echo "($(date +%T.%3N)) $(basename $0): JS7TrapOnExit: waiting for completion of child processes ..."
    wait
    exit $rc
}

# create three child processes
sleep 100 &
sleep 110 &
sleep 120 &

# this is what the script normally should do:
# echo "waiting for completion of child processes"
# wait

echo "script completed"
```
Explanation:
- Line 4: defines the trap calling the JS7TrapOnExit() function in case of the EXIT event. EXIT is a summary for a number of signals that terminate a script, however, this is available for the bash shell only. For use with other shells users instead have to state the list of signals such as TERM, INT etc.
- Line 6 - 12: implements the JS7TrapOnExit() function including the wait command to wait for termination of child processes or otherwise to immediately continue.
  - The exit code returned from the trap is reported by the task log and order log.
  - However, job execution will be considered failed independently of the exit code value as the Cancel/Kill or Suspend/Kill operation was performed.
- Line 15 - 17: starts background processes.
- Line 21: a script normally should wait for child processes, however, if this cannot be guaranteed, for example if set -e is used to abort a script in case of error, then use of a trap is an appropriate measure.
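For shells other than bash the trap can be sketched with an explicit list of signals, as mentioned in the explanation for line 4. This is a minimal sketch, assuming a POSIX sh: the signal list HUP, INT, TERM is an assumption and might have to be adjusted, and the single sleep command is a stand-in for the job's child processes.

```sh
#!/bin/sh

JS7TrapOnExit()
{
    rc=$?
    # reset the traps to prevent the function from being called a second time
    trap - EXIT HUP INT TERM
    echo "JS7TrapOnExit: waiting for completion of child processes ..."
    wait
    exit $rc
}

# state the signals explicitly in addition to the EXIT event
trap 'JS7TrapOnExit' EXIT HUP INT TERM

# stand-in for the job's child processes
sleep 1 &

echo "script completed"
```

When the function is called due to a signal, the value of rc holds the exit code of the command executed last before the signal arrived and not a code derived from the signal.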
Automation of Exit Traps
JS7 offers an option to apply traps such as from the above example to a number of Shell Job scripts via JS7 - Script Includes.
- The trap and the trap function are added to a Script Include like this:
- The Script Include is embedded into any Shell Job scripts from a single line similar to a shebang:
Terminating Jobs on Windows
For Windows environments the following applies when terminating jobs:
- The Agent makes use of the kill_task.cmd script that is available from its var_<port>/work directory.
  - The script makes use of the taskkill command to kill the job's process and its children.
  - Download: kill_task.cmd
- An individual kill script can be specified with a command line option on Agent startup, see JS7 - Agent Operation.