...
- Jobs are executed with JS7 Agents that handle termination of jobs.
- Shell Jobs and JVM Jobs are under control of the Agent that terminates running jobs.
- Jobs implementing use of an SSH Client or use of the JS7 - JITL SSHJob cannot guarantee that a job's child processes are terminated as they are controlled by the remote SSHD server.
- Termination of jobs can be triggered by users from the JOC Cockpit and can be performed automatically if a job exceeds a given timeout.
- As a prerequisite for termination by JOC Cockpit the Controller has to be connected to JOC Cockpit and the Agent has to be accessible to the Controller.
- See JS-1965
...
- When a job is to be killed, the Agent first sends a `SIGTERM` signal.
  - This signal can be ignored or can be handled by a job. For shell scripts a `trap` can be defined, for example to perform cleanup tasks such as disconnecting from a database or removing temporary files.
- The job configuration includes the Grace Timeout setting:
  - The Grace Timeout duration is applied after a `SIGTERM` signal (corresponding to `kill -15`) has been sent by the Agent. This allows the job to terminate on its own, for example after some cleanup is performed.
- Should the job still run after the specified Grace Timeout duration, then the Agent sends a `SIGKILL` signal (corresponding to `kill -9`) that aborts the OS process.
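The sequence of `SIGTERM`, Grace Timeout and `SIGKILL` can be sketched in a few lines of shell. This is an illustration only and not the Agent's implementation: the function name `terminate_with_grace`, the variable `last_signal` and the one-second polling interval are assumptions made for this example, and the shell's `kill` builtin is used for portability.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of JS7): send SIGTERM first,
# send SIGKILL if the process still runs after the grace period (in seconds).
last_signal=""
terminate_with_grace() {
    local pid="$1" grace="$2" waited=0
    last_signal="TERM"
    kill -TERM "$pid" 2>/dev/null || return 0    # process is already gone
    while kill -0 "$pid" 2>/dev/null; do         # is the process still running?
        if [ "$waited" -ge "$grace" ]; then
            last_signal="KILL"
            kill -KILL "$pid" 2>/dev/null
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "process $pid ended after SIG$last_signal"
}

# Demo: a job that ignores SIGTERM is killed once the grace period of 1s has elapsed.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
job_pid=$!
sleep 1                                          # give the job time to install its trap
terminate_with_grace "$job_pid" 1
```

A job that handles `SIGTERM` and exits within the grace period never receives `SIGKILL`; in this sketch `last_signal` then remains `TERM`.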
The OS commands used by the Agent to send signals include:

Termination signals

| Signal | Command |
|---|---|
| SIGTERM | `/bin/kill <pid>` |
| SIGKILL | `/bin/kill -KILL <pid>` |
- If required for your Agent platform then the commands can be adjusted, see JS7 - Agent Configuration Items.
Job scripts frequently spawn child processes that have to be killed along with their parent process.
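The following minimal sketch illustrates why child processes need attention: a background child keeps running after the script that started it has completed. The inline `sh -c` script stands in for a job script, and the variable names `child_pid` and `orphan_alive` are assumptions made for this example.

```shell
#!/usr/bin/env bash

# Stand-in for a job script: starts a background child, prints its PID and
# completes without waiting. The child's output is redirected so that it
# does not keep the pipe of the command substitution open.
child_pid=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!')

# The parent script has completed at this point, the child is left running.
if kill -0 "$child_pid" 2>/dev/null; then
    orphan_alive=yes
else
    orphan_alive=no
fi
echo "parent completed, child $child_pid still running: $orphan_alive"

# Clean up the child process that nobody else would kill.
kill "$child_pid" 2>/dev/null
```

Once the parent is gone, nothing links the child to the job any longer, which is the situation the remainder of this article deals with.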
...
If a Shell Job script starts a background process and does not wait for termination of the child process but instead completes (with or without error), then the Agent cannot identify the running child process as its parent process is gone. It is therefore recommended to add a trap to the shell script that is triggered on termination of the script, regardless of whether the script terminates normally or with an error. This prevents the script from terminating immediately while child processes are running. Instead, in case of forced termination the script continues due to its trap waiting for child processes, and the Agent executes the `kill_task.sh` script that identifies the process of the Shell Job script and kills any running child processes.
Download: jduExitTrap.json
```bash
#!/usr/bin/env bash

JS7Trap()
{
    rc=$?
    # wait for completion of child processes or let kill_task.sh clean up child processes
    echo "($(date +%T.%3N)) $(basename $0): JS7Trap for signal $1: waiting for completion of child processes ..."
    wait
    echo "($(date +%T.%3N)) $(basename $0): JS7Trap for signal $1: leaving trap, exit code $rc"
    exit $rc
}

# define trap for script completion
trap 'JS7Trap EXIT' EXIT
trap 'JS7Trap SIGTERM' SIGTERM
trap 'JS7Trap SIGINT' SIGINT

# create three child processes
sleep 100 &
sleep 110 &
sleep 120 &

# this is what the script normally should do:
# echo "waiting for completion of child processes"
# wait

echo "script completed"
```
Explanation:
- Line 3-11: implement the `JS7Trap()` function, including the `wait` command that waits for termination of child processes or otherwise continues immediately.
  - The exit code returned from the trap in case of script termination is reported in the task log and order log.
  - However, job execution will be considered failed regardless of the exit code value, as the Cancel/Kill or Suspend/Kill operation was performed.
- Line 14-16: define traps calling the `JS7Trap()` function on receipt of the following signals:
  - `EXIT` is a summary for a number of signals that terminate a script, however, this is available for the bash shell only. For use with other shells, users instead have to state the list of signals such as `TERM`, `INT` etc.
  - `SIGTERM` is the termination signal sent by the Agent if the Cancel/Kill or Suspend/Kill operation is invoked.
  - `SIGINT` is added in case OS processes external to the JS7 Agent send this signal, which usually corresponds to hitting Ctrl+C in a terminal session.
- Line 19-21: start background processes.
- Line 23-25: a script should normally `wait` for child processes. However, if this cannot be guaranteed, for example if `set -e` is used to abort a script in case of error, then use of a trap is an appropriate measure.
- The following sequence of actions is performed:
  - The above job script does not wait for child processes and therefore terminates, triggering the EXIT pseudo-signal. The trap function is executed and waits for child processes to complete. During this period the task process for the job remains alive.
  - If the Cancel/Kill or Suspend/Kill operation is subsequently invoked, then the Agent sends a `SIGTERM` signal that
    - interrupts the `wait` command in the currently executed `JS7Trap()` function,
    - triggers execution of the `JS7Trap()` function once more and performs the `wait` operation for child processes.
  - Having applied the Grace Timeout, the Agent executes the `kill_task.sh` script that sends a `STOP` signal to the task process, kills any child processes and finally sends a `SIGKILL` signal to abort the task process.
- The crucial point is that the job script does not terminate with child processes running but remains active due to the trap, which allows the Agent to kill any child processes from the process tree. If the task process for the job script terminates with child processes running, then the Agent cannot identify the process tree and cannot kill child processes.
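The effect of the exit trap can be checked with a shortened variant of the script. This is a sketch under assumptions and not the downloadable example: the child runs for one second only, and the `marker` file is a hypothetical aid used here to show that the child has finished by the time the script process ends.

```shell
#!/usr/bin/env bash

marker=$(mktemp)    # hypothetical marker file written by the child process

# Job script without an explicit wait: the EXIT trap keeps the script
# process alive until its child process has completed.
bash -c '
    trap "wait" EXIT
    ( sleep 1; echo "child finished" > "$1" ) &
    echo "script completed"
' _ "$marker"

# The script process returns only after its child has written the marker.
result=$(cat "$marker")
echo "marker content after script termination: $result"
rm -f "$marker"
```

Without the `trap` line the inline script would return immediately and the marker file would still be empty when it is read.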
Automation of Exit Traps
JS7 offers an option to apply traps such as from the above example to a number of Shell Job scripts via JS7 - Script Includes.
...