Introduction
- Traps are used in shell jobs when a job should not be aborted immediately but should terminate only after performing a cleanup operation such as:
- removing temporary files created by the job,
- disconnecting from a database.
- Traps are available for Unix Shell Jobs only. They are not available for JVM Jobs or Windows Shell Jobs.
- FEATURE AVAILABILITY STARTING FROM RELEASE 2.1.1
Example
Download (.json upload): jduCleanupTrap.json
The implementation of a cleanup trap in a JS7 job script can look like this:
Example for a cleanup trap
#!/bin/bash

# define trap to forward receipt of the SIGTERM signal to a child process
trap 'kill -TERM $CHILD_PID' SIGTERM

# create shell script for background execution
TRAP_SCRIPT=$HOME/test-trap.sh
cat << 'EOF' > $TRAP_SCRIPT
#!/bin/bash

exitOnSigterm() {
    exec &> /dev/tty
    local signal="$1"
    echo "($(date +%T.%3N)) $(basename $0): trap received signal \"$signal\", cleaning up..."

    if [[ "$CHILD_PID" != "" && -d /proc/$CHILD_PID ]]; then
        procInfo="$(ps -ef | /bin/grep -e PPID -e $CHILD_PID | /bin/grep -v /bin/grep)"
        echo -e "($(date +%T.%3N)) $(basename $0): trap found child process $CHILD_PID:\n$procInfo\n"
        cleanupTemporaryFiles "trap" "$CHILD_PID"
        # traps are not required to terminate child processes: the Agent terminates child processes
        # /bin/kill -TERM "$CHILD_PID"
    fi
}

cleanupTemporaryFiles() {
    exec &> /dev/tty
    tempFile="/tmp/temporary_file.$2"

    if [ -f "$tempFile" ]; then
        rm -f $tempFile
        echo "($(date +%T.%3N)) $(basename $0): cleanup performed by $1 for temporary file: $tempFile"
    fi
}

# add trap to call function on receipt of SIGTERM signal
trap 'exitOnSigterm "SIGTERM"' SIGTERM

# run sleep command in background and create a temporary file for later removal
sleep 120 &
CHILD_PID="$!"
touch /tmp/temporary_file.$CHILD_PID

# wait for completion of shell script or for execution of trap
wait "$CHILD_PID"

# cleanup is performed by trap or by shell script
cleanupTemporaryFiles "shell script" "$CHILD_PID"

exit
EOF

# run shell script in background
chmod +x $TRAP_SCRIPT
$TRAP_SCRIPT &

# wait for completion of shell script or for execution of trap
CHILD_PID="$!"
echo "waiting for completion of child process with pid $CHILD_PID"
wait "$CHILD_PID"

exit $?
Explanation:
- Line 6 - 50: a sample shell script $HOME/test-trap.sh is created by the job shell script. It is started later for background execution in line 54.
  - Line 11 - 23: the exitOnSigterm() function is defined. It is called if the trap is triggered, see line 36.
  - Line 25 - 33: the cleanupTemporaryFiles() function is defined. It removes a temporary file that has previously been created, see line 41.
    - This function is called by the exitOnSigterm() function of the trap in line 19. The intention is to perform a cleanup if a SIGTERM signal triggers the trap.
    - This function is also called in line 47. This call is intended for normal termination of the sample shell script when no trap is triggered.
  - Line 36: a trap is defined which is triggered by a SIGTERM signal and calls the exitOnSigterm() function.
  - Line 39 - 41: the sample shell script starts a sleep command in the background and touches a temporary file which is to be removed by the cleanupTemporaryFiles() function.
  - Line 44: the sample shell script waits for termination of the sleep command.
  - Line 47: the cleanupTemporaryFiles() function is called to remove temporary files in case of normal termination without the trap being triggered.
- Line 53 - 54: the sample shell script is made executable and is started in the background.
- Line 59: the job shell script waits for termination of the sample shell script.
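The cleanup pattern explained above can be condensed into a single script for experimenting outside of a job. The following is a minimal sketch, not part of the JS7 example: the names TEMP_FILE, CLEANED_BY and cleanup are hypothetical, and the script sends SIGTERM to itself to simulate a cancel. It assumes that it is run as a script with Bash.

```shell
#!/bin/bash
# Sketch only: TEMP_FILE, CLEANED_BY and cleanup are illustrative names.
TEMP_FILE="$(mktemp)"
CLEANED_BY=""

cleanup() {
    # remove the temporary file once; later calls find nothing to do
    if [ -f "$TEMP_FILE" ]; then
        rm -f "$TEMP_FILE"
        CLEANED_BY="$1"
    fi
}

# the trap performs the cleanup when SIGTERM arrives
trap 'cleanup "trap"' TERM

# run the long-running command in the background so that wait can be interrupted
sleep 30 &
CHILD_PID="$!"

# simulate a cancel: send SIGTERM to this script after one second
( sleep 1; kill -TERM "$$" ) &

# returns early if the trapped signal arrives
wait "$CHILD_PID" || true

# normal termination path; nothing is left to do if the trap already cleaned up
cleanup "shell script"

# stop the background command that is no longer needed
kill "$CHILD_PID" 2>/dev/null || true
echo "temporary file removed by: $CLEANED_BY"
```

Because the cleanup function checks for the file before removing it, it is safe to call it from both the trap and the normal termination path, as the example above does in lines 19 and 47.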
Agent Operations for Termination
In Unix environments jobs receive the following signals from the Agent:
- When a job is to be terminated, the Agent sends a SIGTERM signal.
  - This signal can be ignored or can be handled by a job. For shell scripts a trap can be defined, for example to perform cleanup tasks such as disconnecting from a database or removing temporary files.
- The job configuration includes the Grace Timeout setting:
  - The Grace Timeout duration is applied after a SIGTERM signal (corresponding to kill -15) has been sent by the Agent. This allows the job to terminate on its own, for example after a cleanup has been performed.
  - If the job is still running after the specified Grace Timeout duration then the Agent sends a SIGKILL signal (corresponding to kill -9) that aborts the OS process.
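The sequence of SIGTERM, Grace Timeout and SIGKILL can be tried out from a shell. The following is a rough sketch of that sequence only: it is not the Agent's implementation, and the names terminate_with_grace and GRACE_RESULT are hypothetical.

```shell
#!/bin/bash
# Sketch only: terminate_with_grace and GRACE_RESULT are hypothetical names,
# this is not how the Agent is implemented.
GRACE_RESULT=""

terminate_with_grace() {
    local pid="$1" grace="$2" waited=0

    # step 1: request termination (SIGTERM corresponds to kill -15)
    if ! kill -TERM "$pid" 2>/dev/null; then
        GRACE_RESULT="not running"
        return 0
    fi

    # step 2: allow up to $grace seconds for the process to end on its own
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            # step 3: grace period expired, abort (SIGKILL corresponds to kill -9)
            kill -KILL "$pid" 2>/dev/null
            GRACE_RESULT="killed"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    GRACE_RESULT="terminated"
}

# demo: a process that ignores SIGTERM is aborted after a 2 second grace period
( trap '' TERM; exec sleep 30 ) &
STUBBORN_PID="$!"
sleep 1    # give the subshell time to start ignoring SIGTERM
terminate_with_grace "$STUBBORN_PID" 2
echo "stubborn process: $GRACE_RESULT"
```

A process that handles SIGTERM and ends within the grace period is reported as terminated instead, which corresponds to a job that completes its cleanup in time.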
Job scripts frequently spawn child processes that have to be terminated in line with their parent process.
- By default the OS terminates child processes if the parent process is terminated. However, this mechanism is not applicable for all situations, depending on the way the child processes have been spawned.
- For details see JS7 - FAQ - How does JobScheduler terminate Jobs.
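Where a job script cannot rely on child processes being terminated for it, one option is to record the process IDs of its children and to forward the signal explicitly. The following is a minimal sketch under that assumption: CHILD_PIDS and forward_term are hypothetical names, and the script sends SIGTERM to itself to simulate a cancel.

```shell
#!/bin/bash
# Sketch only: CHILD_PIDS and forward_term are hypothetical names.
CHILD_PIDS=()

forward_term() {
    local pid
    for pid in "${CHILD_PIDS[@]}"; do
        # ignore children that have already ended
        kill -TERM "$pid" 2>/dev/null || true
    done
}

# forward SIGTERM to every recorded child process
trap forward_term TERM

# spawn two background children and remember their process IDs
sleep 30 &
CHILD_PIDS+=("$!")
sleep 30 &
CHILD_PIDS+=("$!")

# simulate a cancel: send SIGTERM to this script after one second
( sleep 1; kill -TERM "$$" ) &

# the first wait is interrupted by the trap, the second one collects the children
wait || true
wait || true

# count the recorded children that are still alive
STILL_RUNNING=0
for pid in "${CHILD_PIDS[@]}"; do
    if kill -0 "$pid" 2>/dev/null; then
        STILL_RUNNING=$((STILL_RUNNING + 1))
    fi
done
echo "children still running: $STILL_RUNNING"
```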
Trap Operations on Termination
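It is important to keep in mind that a trap does not terminate the script: the trap is executed and the script then resumes with the next command. In Bash, the wait builtin returns immediately when a trapped signal is received, whereas a command running in the foreground is completed before the trap is executed. This is why the example runs its commands in the background and waits for them.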
- When the relevant OS signal is received then the current command of the job shell script is interrupted and the trap is executed instead.
  - This is why we find two trap definitions in the above example:
    - When cancelling a job then the SIGTERM signal is sent by the Agent to the process running the job shell script.
    - As the job shell script process spawns another shell script to be executed in the background, the trap in line 4 of the example above is added:
      - The job shell script's trap forwards the SIGTERM signal to the sample shell script.
      - The sample shell script defines its own trap in line 36. This trap then receives the signal forwarded by the job shell script.
- After execution of the trap the sample shell script is resumed with the next command after line 44.
  - The assumption is that the SIGTERM signal is received while waiting for the sleep command to be completed in line 44.
  - With the wait command being interrupted, the cleanupTemporaryFiles() function is called by the sample shell script in line 47.
- As a result the sample shell script is completed with line 49 and control is returned to the job shell script. The job shell script continues with the line following the wait command in line 59, which exits the job shell script with the exit code of the most recently executed command.
  - This exit code is only informational as the Agent will set the job's exit code to the value 1 to indicate failure of the job, independently of whether the trap has been completed successfully or not.
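The order of events described above can be observed with a small script. The following is a minimal sketch, assuming Bash; EVENTS and WAIT_STATUS are hypothetical names and the script sends SIGTERM to itself to simulate a cancel. It records that the trap runs first, that the wait command reports an exit status greater than 128, and that the script then resumes with the next command.

```shell
#!/bin/bash
# Sketch only: EVENTS and WAIT_STATUS are illustrative names.
EVENTS=""

# the trap only records that it has been executed
trap 'EVENTS="$EVENTS trap"' TERM

# background command to wait for
sleep 30 &
CHILD_PID="$!"

# simulate a cancel: send SIGTERM to this script after one second
( sleep 1; kill -TERM "$$" ) &

# wait is interrupted by the trapped signal and reports a status above 128
WAIT_STATUS=0
wait "$CHILD_PID" || WAIT_STATUS="$?"

# the script is not terminated: it resumes here after the trap has run
EVENTS="$EVENTS resumed"

# stop the background command that is no longer needed
kill "$CHILD_PID" 2>/dev/null || true
echo "wait status: $WAIT_STATUS, order of events:$EVENTS"
```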