Mode Of Operation
The SSH Session Management adds the possibility to end orphaned remote processes started by SSH jobs or orphaned JobScheduler tasks from SSH jobs.
Use Case
What happens if the connection to your remote host breaks while a script is still running? How can the JobScheduler job which started the remote script know about that?
What happens if a remote script has finished, but the JobScheduler task which started the process remotely cannot know about that? (e.g. because of a temporarily broken network connection).
The SSH Session Management provides a solution for this type of issue: an additional job chain can be configured to check for orphaned processes on the remote host as well as for orphaned tasks.
You can configure your existing SSH job chain to start the monitoring job chain, which then will monitor the task of your original job chain as well as the processes started on the remote host via SSH.
To use the SSH Session Management you have to configure your SSH job and define a second job chain for the cleanup work.
The feature requires the use of the JSch implementation by JCraft. See How To - Usage of the SSH Job (JobSchedulerSSHJob) with JCraft's JSch for more information about configuring your SSH job to use the JSch implementation.
The SSH Session Management will carry out one of the following actions after checking the remote processes and JobScheduler tasks:
- if a remote process is running and the JobScheduler task is still alive:
  - nothing to do, the cleanup job chain goes to a setback condition and waits for another start
- if a remote process is running but the JobScheduler task is no longer available:
  - the cleanup job tries to end the process on the remote host
- if a remote process is no longer available but the JobScheduler task is still running:
  - the cleanup job ends the task immediately
- if neither the remote process nor the JobScheduler task is available any longer:
  - nothing to do, the cleanup job ends
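The four cases above can be read as a small decision table. The following is a minimal sketch of that table only, not the actual implementation of the cleanup job: the function name `decide_action`, the `yes`/`no` flags and the action names it prints are hypothetical and chosen for illustration.

```shell
#!/bin/sh
# Sketch of the decision table above. decide_action is a hypothetical helper,
# not part of JobScheduler. Both arguments are "yes" or "no".
decide_action() {
    remote_running=$1   # is the remote process still running?
    task_running=$2     # is the JobScheduler task still alive?

    if [ "$remote_running" = "yes" ] && [ "$task_running" = "yes" ]; then
        echo "setback"              # nothing to do yet, check again later
    elif [ "$remote_running" = "yes" ]; then
        echo "kill-remote-process"  # task is gone, the remote process is orphaned
    elif [ "$task_running" = "yes" ]; then
        echo "end-task"             # remote process is gone, the task is orphaned
    else
        echo "done"                 # both are gone, the cleanup job ends
    fi
}

decide_action yes no   # prints: kill-remote-process
```

Only the first case leads to a repeated check (via the setback); the other three cases end the cleanup work.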
Configuration of the SSH job to be monitored
A number of additional parameters have to be added to the job configuration before the SSH job can be monitored:
runWithWatchdog
- format:
- boolean
- default value:
false
- description:
- If this parameter is set to true, the job itself is aware that it is monitored and generates a new order for the second job chain with all parameters of the job and the pid of the connected shell. If the parameter is set to false or is not present, the SSH Session Management will not be processed.
cleanupJobchain
- format:
- string
- default value:
- empty
- description:
- This parameter defines the path to the job chain to be used for the cleanup work.
...
- format:
<command> \${pid}
- default value:
kill -9 \${pid}
- description:
- The command to kill a process on the remote machine. The command depends on the OS of the remote host. If the command is not set, the cleanup job checks whether the remote host is running on a Linux or on a Windows system and uses the appropriate default command.
The ${pid} placeholder will be substituted by the cleanup job. Note that the leading $ character has to be escaped with "\".
ssh_job_terminate_pid_command
- format:
<command> \${pid}
- default value:
kill -15 \${pid}
- description:
- The command for terminating a process on the remote machine. This command depends on the OS of the remote host. If the command is not set, the cleanup job checks whether the remote host is running on a Linux or on a Windows system and uses the appropriate default command.
The ${pid} placeholder will be substituted by the cleanup job. Note that the leading $ character has to be escaped with "\".
ssh_job_get_pid_command
- format:
<command>
- default value:
echo $$
- description:
- A command or script that writes the pid of the connected shell to stdout of the remote host.
ssh_job_get_child_processes_command
- format:
<command> \${pid}
- default value:
/bin/ps -ef | pgrep -P\${pid}
- description:
- The command or script determines the child processes of the given pid. The command or script depends on the OS of the remote host. If the command is not set, the default command for Linux is used.
The ${pid} placeholder will be substituted by the cleanup job. Note that the leading $ character has to be escaped with "\".
ssh_job_get_active_processes_command
- format:
<command> \${pid} \${user}
- default value:
/bin/ps -ef | grep \${pid} | grep \${user} | grep -v grep
- description:
- The command or script checks if the process on the remote host is still running. The cleanup job expects an exit code of 0 if the process is still running and a non-zero exit code if the process is no longer available. The command or script depends on the OS of the remote host. If the command is not set, the default command for Linux is used.
The ${pid} and ${user} placeholders will be substituted by the cleanup job. Note that the leading $ character has to be escaped with "\".
- Example commands:
/bin/ps -ef
- writes all running processes to stdout on the remote host
| grep \${pid}
- filters the result to show only lines containing the given ${pid}
| grep \${user}
- filters the result to show only lines containing the given ${user}
| grep -v grep
- filters out the grep command itself
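The exit code convention of ssh_job_get_active_processes_command can be tried out locally. The sketch below applies the same chain of grep filters to a made-up process listing instead of a live `ps` call, so that the result is reproducible. The listing, the pids, the user names and the helper `is_active` are assumptions for illustration and are not part of the job.

```shell
#!/bin/sh
# Hypothetical "ps -ef" style listing; pids and user names are made up.
sample_listing='UID        PID  PPID  C STIME TTY          TIME CMD
test      4711     1  0 10:00 ?        00:00:00 /bin/sh ./test_sleep_90s.sh
root      4800     1  0 10:01 ?        00:00:00 /usr/sbin/sshd'

# is_active PID USER: applies the same filters as the default command.
# The exit code of the pipeline is that of the last grep:
# 0 if at least one line is left, non-zero if nothing is left.
is_active() {
    printf '%s\n' "$sample_listing" | grep "$1" | grep "$2" | grep -v grep > /dev/null
}

if is_active 4711 test; then echo "4711 is still running"; fi
if ! is_active 9999 test; then echo "9999 is no longer available"; fi
```

Note that a plain grep matches substrings, so a pid such as 471 would also match the line of process 4711. A custom command with a stricter pattern may be preferable where this matters.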
Example: Configuration with JOE
Job
...
Parameters
...
Example: XML Configuration
```xml
<job order="yes" stop_on_error="false" title="Launch commands or executable files by SSH">
    <description>
        <include file="jobs/JobSchedulerSSHJob.xml"/>
    </description>
    <params>
        <param name="host" value="[HOST]"/>
        <param name="port" value="[SSHPORT]"/>
        <param name="user" value="[USERNAME]"/>
        <param name="password" value="[PASSWORD]"/>
        <param name="auth_method" value="password"/>
        <param name="command_script_file" value="[PATH_TO_SCRIPTFILE]\test_sleep_90s.sh"/>
        <param name="runWithWatchdog" value="true"/>
        <param name="cleanupJobchain" value="kill_jobs/remote_cleanup_test"/>
        <param name="ssh_job_kill_pid_command" value="kill -9 \${pid}"/>
        <param name="ssh_job_terminate_pid_command" value="kill -15 \${pid}"/>
        <param name="ssh_job_get_pid_command" value="echo $$"/>
        <param name="ssh_job_get_active_processes_command" value="/bin/ps -ef | grep \${pid} | grep \${user} | grep -v grep"/>
    </params>
    <script java_class="sos.scheduler.job.SOSSSHJob2JSAdapter" language="java"/>
    <run_time/>
</job>
```
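The placeholder substitution applied to these command parameters can be illustrated with a short shell sketch. It is not the cleanup job's actual code: the helper `substitute_placeholders` is hypothetical, and the sketch assumes that the escaping backslash has already been removed, so that the template contains the literal text ${pid} and ${user}. It also assumes values without `/` or `&` characters, which would need escaping in a sed replacement.

```shell
#!/bin/sh
# substitute_placeholders TEMPLATE PID USER
# Hypothetical helper: replaces the literal text ${pid} and ${user} in TEMPLATE.
# [$] matches a literal dollar sign in a sed pattern.
substitute_placeholders() {
    printf '%s\n' "$1" | sed -e 's/[$]{pid}/'"$2"'/g' -e 's/[$]{user}/'"$3"'/g'
}

# Single quotes keep ${pid} and ${user} literal instead of expanding them here.
template='/bin/ps -ef | grep ${pid} | grep ${user} | grep -v grep'
substitute_placeholders "$template" 4711 test
# prints: /bin/ps -ef | grep 4711 | grep test | grep -v grep
```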
Configuration Of The Cleanup Jobchain
A cleanup job chain with two jobs has to be configured to process the cleanup of the remote processes or the JobScheduler task.
The cleanup job chain consists of two jobs: one to read the pid of the connected shell from a temporary file on the remote host and one to check if the process or the JobScheduler task is still running. The temporary file will be generated automatically and deleted after processing. The second job also ends either the remote process or the JobScheduler task as appropriate.
Job 1: The read-pid-from-temporary-file job
The first job reads the pid from the temporary file on the remote host. If your SSH job is configured as described above, the temporary file will have been created automatically on the remote host.
Configure the class of the job in JOE like this:
After choosing the relevant class name from the list, configure a setback for the job. The setback will be used to restart the job according to the conditions found by the job (described above).
Job 2: The check-and-kill job
The second job checks:
- if the pid determined by the first job is still active
- if the JobScheduler task is still active
It then kills the remote process or the JobScheduler task depending on the conditions found. The actions taken are described in the Mode of Operation chapter above.
Configure the class of the job in JOE as follows:
Example: XML Configuration
```xml
<job order="yes" stop_on_error="no" title="Launch read pid file command by SSH">
    <description>
        <include file="jobs/SOSSSHReadPidFileJob.xml"/>
    </description>
    <script java_class="sos.scheduler.job.SOSSSHReadPidFileJobJSAdapter" language="java"/>
    <delay_order_after_setback delay="30" is_maximum="no" setback_count="1"/>
    <delay_order_after_setback delay="0" is_maximum="yes" setback_count="3"/>
    <run_time/>
</job>
```
```xml
<job order="yes" stop_on_error="no" title="Kills orphaned PIDs on the Remote Host for clean up by SSH">
    <description>
        <include file="jobs/SOSSSHKillJob.xml"/>
    </description>
    <script java_class="sos.scheduler.job.SOSSSHKillJobJSAdapter" language="java"/>
    <delay_order_after_setback delay="30" is_maximum="no" setback_count="1"/>
    <delay_order_after_setback delay="0" is_maximum="yes" setback_count="3"/>
    <run_time/>
</job>
```
```xml
<job_chain orders_recoverable="yes" visible="yes">
    <job_chain_node error_state="ERROR" job="readPidFile" next_state="CheckTaskAndRemoteProcessesAndKillIfNeeded" on_error="setback" state="ReadPidFile"/>
    <job_chain_node error_state="ERROR" job="CheckAndKill" next_state="SUCCESS" on_error="setback" state="CheckTaskAndRemoteProcessesAndKillIfNeeded"/>
    <job_chain_node state="ERROR"/>
    <job_chain_node state="SUCCESS"/>
</job_chain>
```