Scope
The functions used by the JobScheduler Master and Universal Agent to terminate task processes have been extended to allow the use of SIGTERM in addition to SIGKILL on Unix servers. This allows an orderly termination of task processes to take place over a limited period of time.
Feature History
This feature has been implemented stepwise between Release 1.9.0 and 1.10.0 (see the table of issues below for more detailed information).
Issues
Support of this feature is subject to the following issues:
Use Case
This article draws together detailed information from a range of issues. It should primarily be of interest to engineering staff and, to a lesser extent, to operating staff.
Implementation
- The use of both SIGTERM and SIGKILL on Unix servers has the following advantages:
  - Sending SIGTERM before SIGKILL means that there is a greater chance of data being saved after the kill command has been issued.
  - In contrast to SIGKILL, the SIGTERM signal can be monitored, i.e. a pre-/postprocessing script can be carried out. This means that the ending of a task by the JobScheduler can be reacted to and the sudo user process itself can be ended.
- The implementation of SIGTERM allows post-processing methods such as spooler_process_after to complete within the timeout period.
- The time allowed between the SIGTERM and the SIGKILL can be specified in the command using the timeout attribute (the default is 15 seconds):
  <kill_task … timeout=".."/>
- This feature can also be applied to:
  - remote processes, i.e. processes started by SSH and those started by an agent,
  - child processes started by a process running on an agent (JS-1468).
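The escalation from SIGTERM to SIGKILL described above can be pictured with a short shell sketch. This is not JobScheduler's own code: the function name terminate_with_timeout and its return codes are assumptions made for illustration, and only the 15-second default is taken from the text.

```shell
#!/bin/sh
# Illustrative sketch only, not JobScheduler's implementation.
# terminate_with_timeout is a hypothetical helper name.
# Usage: terminate_with_timeout PID [TIMEOUT_SECONDS]
# Returns 0 if the process ended after SIGTERM, 1 if SIGKILL was needed.
terminate_with_timeout() {
    pid=$1
    timeout=${2:-15}   # 15 seconds mirrors the documented default

    # Ask the process to end in an orderly way; nothing to do if it is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    elapsed=0
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            # Timeout expired: SIGKILL cannot be trapped or ignored.
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A process that handles SIGTERM promptly never sees the SIGKILL, which is what gives it the chance to save data within the timeout period.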
JobScheduler Commands
The following operations can be carried out from the JobScheduler Operating Center interface (JOC):
- Operation: kill immediately
  - JOC sends
    <kill_task immediately="yes"/>
  - The process is killed immediately using the SIGKILL signal.
- Operation: terminate with timeout
  - JOC sends
    <kill_task immediately="yes" timeout="15"/>
  - The process receives a SIGTERM signal. If the process does not terminate within the specified timeout period, it is killed with a SIGKILL signal.
- Operation: terminate
  - JOC sends
    <kill_task immediately="yes" timeout="never"/>
  - The process receives a SIGTERM signal. Unlike the terminate with timeout operation, the termination of the process is not monitored.
Delimitation
- This feature is intended for Unix platforms that implement the SIGTERM and SIGKILL signals. It is not intended for Windows platforms, to which only the kill immediately operation applies.
- When using traps, please consider that the process created by the <shell> element receives the signal. Subsequent scripts that are called within the <shell> element will not receive the signal. You could therefore:
  - configure traps directly within the <shell> element. The shell process will then receive and handle the signal.
  - configure traps in a shell script that is added by an <include> element instead of being stated within the <shell> element. The included shell script will receive and handle the signal.
  - forward signals to subsequent shell scripts that are called within a <shell> element.
- This feature has been fully implemented on the Universal Agent. It has been implemented for classic JobScheduler Agents using TCP (JS-1420).
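The last of the trap options listed above, forwarding signals to a script called within a <shell> element, might look roughly like the following sketch. The helper name forward_term is an assumption made for illustration, and the sketch does not pass on the exit status of the called script.

```shell
#!/bin/sh
# Illustrative sketch: forward_term is a hypothetical helper name.
# Usage inside the shell script of a job: forward_term ./called_script.sh args...
forward_term() {
    child=
    # Install the trap first so that no SIGTERM is missed; it passes the
    # signal on to the called script once that script has been started.
    trap '[ -n "$child" ] && kill -TERM "$child" 2>/dev/null' TERM

    # Run the called script in the background: a shell only runs its traps
    # between commands, so a foreground call would delay the handler.
    "$@" &
    child=$!

    # wait returns early when the trapped SIGTERM arrives ...
    wait "$child" || true
    # ... so wait a second time to let the called script finish its own handler.
    wait "$child" 2>/dev/null || true

    trap - TERM
}
```

The called script can then configure its own trap and react to the SIGTERM as if it had received the signal directly.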
Workaround
A monitor (i.e. a pre-/postprocessing script) has to be configured for shell jobs that have a timeout set (JS-1463).
For example (workaround for shell jobs with a timeout):
<job name="shell_with_javascript_monitor">
    <script language="shell">
        <![CDATA[
echo hello world!
sleep 45
        ]]>
    </script>
    <monitor name="process0" ordering="0">
        <script language="java:javascript">
            <![CDATA[
function spooler_process_before(){
    return true;
}
            ]]>
        </script>
    </monitor>
    <run_time />
</job>
Example
Download the Example
Description
This example contains a job that uses a SIGTERM trap to show the difference between the kill_task and terminate_task commands provided by JOC.
The job job_trap_sigterm.job.xml shows how to trap the terminate command provided by JOC.
- Start the job
- Terminate the task in JOC
- You will see the log message "sigterm will be ignored" and the task will continue.
<?xml version="1.0" encoding="ISO-8859-1"?>
<job title="test test">
    <script language="shell">
        <![CDATA[
# Signal 15 is SIGTERM: log a message and carry on instead of ending
trap 'echo sigterm will be ignored' 15
for i in 1 2 3 4 5 6 7 8 9 0
do
    date
    sleep 10
done
sleep 60
        ]]>
    </script>
    <run_time />
</job>
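By way of contrast with the job above, which ignores SIGTERM, a trap can also use the signal to save its progress and end in an orderly way. The following is a minimal sketch under stated assumptions: the function name run_job, the state file argument and the log messages are hypothetical and are not part of the downloadable example.

```shell
#!/bin/sh
# Illustrative sketch: run_job and the state file are hypothetical names.
# Usage: run_job STATE_FILE
run_job() {
    state_file=$1
    step=0
    sleeper=

    # On SIGTERM: record progress, stop the pending sleep and exit cleanly
    # before the timeout expires and SIGKILL is sent.
    trap 'echo "sigterm received, saving state"
          echo "interrupted at step $step" > "$state_file"
          [ -n "$sleeper" ] && kill "$sleeper" 2>/dev/null
          exit 0' TERM

    for step in 1 2 3 4 5 6 7 8 9 10; do
        date
        # Sleep in the background and wait for it: wait is interrupted as soon
        # as SIGTERM arrives, whereas a foreground sleep would delay the trap.
        sleep 10 &
        sleeper=$!
        wait "$sleeper" || true
    done
}
```

Because the handler calls exit, the whole shell process ends. With the terminate with timeout operation the task would therefore finish on the SIGTERM and no SIGKILL would be needed.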
Additional resources
References
- Change Management References
- JIRA Issues
- Documentation
- XML element in the reference documentation