Info

This article is not intended for public use. Users are not required to implement individual traps, and the example below is too specific for general use.

Instead, the generic solution described in the article JS7 - FAQ - Does JS7 reliably kill running jobs is sufficient for most users.

Introduction

  • Traps are used in shell jobs when a job should not be aborted immediately but should terminate after performing a cleanup operation such as:
    • removing temporary files created by the job,
    • disconnecting from a database.
  • Traps are available for the Unix Shell, not for JVM Jobs and not for Windows Shell Jobs.
  • Feature availability: starting from release 2.1.1
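
The cleanup idea described above can be illustrated with a minimal sketch. It is not taken from the product: the names TEMP_FILE, CLEANUP_DONE and cleanup are hypothetical, and the script sends SIGTERM to itself only to simulate a termination request.

```bash
#!/bin/bash

# hypothetical temporary file, created for this illustration only
TEMP_FILE="$(mktemp)"

cleanup()
{
    # remove the temporary file and remember that the cleanup was performed
    rm -f "$TEMP_FILE"
    CLEANUP_DONE="yes"
}

# call the cleanup function on receipt of the SIGTERM signal
trap 'cleanup' TERM

# simulate a termination request by sending SIGTERM to this shell
kill -TERM "$$"

# the trap has been executed by now, the script continues here
echo "cleanup done: ${CLEANUP_DONE:-no}"
```

The trap is executed once the kill command has completed, after which the script continues with the next command instead of being aborted.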

...

Example for a cleanup trap (bash; line numbers referenced in the text below count from the first line of the script):
#!/bin/bash

# define trap to forward receipt of the SIGTERM signal to a child process
trap 'kill -TERM "$CHILD_PID"' SIGTERM

# create shell script for background execution
TRAP_SCRIPT="$HOME/test-trap.sh"
cat << 'EOF' > "$TRAP_SCRIPT"
#!/bin/bash

exitOnSigterm()
{
    exec &> /dev/tty
    local signal="$1"
    echo "($(date +%T.%3N)) $(basename $0): trap received signal \"$signal\", cleaning up..."
    if [[ "$CHILD_PID" != "" && -d /proc/$CHILD_PID ]]; then
        procInfo="$(ps -ef | /bin/grep -e PPID -e "$CHILD_PID" | /bin/grep -v /bin/grep)"
        echo -e "($(date +%T.%3N)) $(basename $0): trap found child process $CHILD_PID:\n$procInfo\n"
        cleanupTemporaryFiles "trap" "$CHILD_PID"
        # traps are not required to terminate child processes: the Agent terminates child processes
        # /bin/kill -TERM "$CHILD_PID"
    fi
}

cleanupTemporaryFiles()
{
    exec &> /dev/tty
    tempFile="/tmp/temporary_file.$2"
    if [ -f "$tempFile" ]; then
        rm -f "$tempFile"
        echo "($(date +%T.%3N)) $(basename $0): cleanup performed by $1 for temporary file: $tempFile"
    fi
}

# add trap to call function on receipt of SIGTERM signal
trap 'exitOnSigterm "SIGTERM"' SIGTERM

# run sleep command in background and create a temporary file for later removal
sleep 120 &
CHILD_PID="$!"
touch /tmp/temporary_file.$CHILD_PID

# wait for completion of shell script or for execution of trap
wait "$CHILD_PID"

# cleanup is performed by trap or by shell script
cleanupTemporaryFiles "shell script" "$CHILD_PID"

exit
EOF

# run shell script in background
chmod +x "$TRAP_SCRIPT"
"$TRAP_SCRIPT" &

# wait for completion of shell script or for execution of trap
CHILD_PID="$!"
echo "waiting for completion of child process with pid $CHILD_PID"
wait "$CHILD_PID"

exit $?

...

In Unix environments jobs receive the following signals from the Agent:

  • When a job is to be terminated, the Agent sends a SIGTERM signal.
    • This signal can be ignored or can be handled by a job. For shell scripts a trap can be defined to, for example, perform cleanup tasks such as disconnecting from a database or removing temporary files.
  • The job configuration includes the Grace Timeout setting:
    • The Grace Timeout duration is applied after a SIGTERM signal (corresponding to kill -15) has been sent by the Agent. This allows the job to terminate on its own, for example after a cleanup has been performed.
    • If the job is still running after the specified Grace Timeout duration then the Agent sends a SIGKILL signal (corresponding to kill -9) that aborts the OS process.
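
The sequence of SIGTERM, Grace Timeout and SIGKILL can be sketched as follows. This is an illustration only and not the Agent's implementation: GRACE_TIMEOUT, JOB_PID and RESULT are hypothetical names, the timeout of 2 seconds is arbitrary, and a sleep process that ignores SIGTERM stands in for a job that does not terminate on its own.

```bash
#!/bin/bash

# hypothetical grace timeout in seconds, chosen for illustration only
GRACE_TIMEOUT=2

# stand-in for a job that ignores SIGTERM and keeps running
( trap '' TERM; exec sleep 30 ) &
JOB_PID="$!"

# give the stand-in job time to start before signalling it
sleep 1

# step 1: ask the job to terminate (corresponds to kill -15)
kill -TERM "$JOB_PID"

# step 2: wait for the job to end on its own, at most for the grace timeout
elapsed=0
while kill -0 "$JOB_PID" 2>/dev/null && [ "$elapsed" -lt "$GRACE_TIMEOUT" ]; do
    sleep 1
    elapsed=$((elapsed + 1))
done

# step 3: abort the job if it is still running (corresponds to kill -9)
if kill -0 "$JOB_PID" 2>/dev/null; then
    kill -KILL "$JOB_PID"
    RESULT="killed after grace timeout"
else
    RESULT="terminated within grace timeout"
fi

# collect the exit status of the job, ignoring the failure status
wait "$JOB_PID" 2>/dev/null || true
echo "$RESULT"
```

A job that handles SIGTERM with a trap and exits before the timeout expires would take the other branch and never receive SIGKILL.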

Job scripts frequently spawn child processes that have to be terminated in line with their parent process.

  • By default the OS terminates child processes if the parent process is terminated. However, this mechanism is not applicable for all situations, depending on the way the child processes have been spawned.
  • To reliably kill child processes the Agent makes use of the kill_task.sh script from its var_<port>/work directory.
    • This script retrieves the process tree of the job shell script and tries to kill any child processes.
  • Though the Agent is platform independent, retrieving a process tree does not necessarily use the same command (ps) and options on all Unix platforms.
    • The Agent therefore allows an individual kill script to be specified from a command line option if the built-in kill_task.sh script is not applicable to your Unix platform.
  • For details see JS7 - FAQ - How does JobScheduler terminate Jobs.
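
How a process tree can be retrieved and terminated is sketched below. This is not the content of the kill_task.sh script: the function listDescendants and the variables PARENT_PID, SNAPSHOT and DESCENDANTS are hypothetical, and the sketch assumes a ps command that supports the options -e and -o pid= -o ppid=, which might not hold for every Unix platform.

```bash
#!/bin/bash

# print the PIDs of all descendants of a process
# $1: PID of the parent process, $2: snapshot of "pid ppid" pairs
listDescendants()
{
    local child
    for child in $(echo "$2" | awk -v p="$1" '$2 == p { print $1 }'); do
        echo "$child"
        listDescendants "$child" "$2"
    done
}

# stand-in for a job shell script that spawns a child process
( sleep 30 & wait ) &
PARENT_PID="$!"

# give the stand-in job time to spawn its child process
sleep 1

# take a single snapshot of the process table and derive the process tree
SNAPSHOT="$(ps -e -o pid= -o ppid=)"
DESCENDANTS="$(listDescendants "$PARENT_PID" "$SNAPSHOT")"

# terminate the descendants first, then the parent process
for pid in $DESCENDANTS "$PARENT_PID"; do
    kill -TERM "$pid" 2>/dev/null || true
done

# collect the exit status of the parent process, ignoring the failure status
wait "$PARENT_PID" 2>/dev/null || true
echo "terminated process $PARENT_PID and descendants: $DESCENDANTS"
```

Working on a single snapshot avoids listing the short-lived ps and awk processes that the retrieval itself creates.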

Trap Operations on Termination

...

  • When the relevant OS signal is received, the current command of the job shell script is cancelled and the trap is executed instead.
    • This is why we find two trap definitions in the above example:
      • When a job is cancelled, the Agent sends the SIGTERM signal to the process running the job shell script.
      • As the job shell script process spawns another shell script to be executed in background, the trap in line 4 of the example above is added:
        • the job shell script's trap forwards the SIGTERM signal to the sample shell script.
        • the sample shell script defines its own trap with line 36. This then receives the job shell script's signal.
  • After execution of the trap the sample shell script is resumed with the next command after line 44. 
    • The assumption is that the SIGTERM signal is received while waiting for the sleep command to be completed with line 44.
    • With the wait command being interrupted, the cleanupTemporaryFiles() function is called by the sample shell script with line 47.
  • As a result the sample shell script is completed with line 49 and control is returned to the job shell script. The job shell script continues with the line following the wait command in line 59, which exits the job shell script and provides the exit code of the most recently executed command.
    • This exit code is only informational as the Agent will set the job's exit code to the value 1 to indicate failure of the job independently of whether the trap has been completed successfully or not.
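
The interruption of the wait command described above can be observed with a minimal sketch. It assumes a standalone script outside of the Agent: onTerm, TRAP_CALLED and WAIT_STATUS are hypothetical names, and a background subshell sends the SIGTERM signal that the Agent would otherwise send.

```bash
#!/bin/bash

onTerm()
{
    # remember that the trap was executed
    TRAP_CALLED="yes"
}

# add trap to call function on receipt of SIGTERM signal
trap 'onTerm' TERM

# run sleep command in background
sleep 30 &
CHILD_PID="$!"

# simulate the Agent: send SIGTERM to this shell after one second
( sleep 1; kill -TERM "$$" ) &

# wait is interrupted by the signal, the trap is executed,
# then the script resumes with the next command
WAIT_STATUS=0
wait "$CHILD_PID" || WAIT_STATUS="$?"

echo "trap called: ${TRAP_CALLED:-no}, wait returned: $WAIT_STATUS"

# without the Agent the background process has to be terminated explicitly
kill -TERM "$CHILD_PID" 2>/dev/null || true
wait "$CHILD_PID" 2>/dev/null || true
```

When interrupted by a trapped signal, wait returns an exit status greater than 128, which is how a script can tell an interruption from the regular completion of the background process.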

...

Resources