File Watching
- The JobScheduler Universal Agent can be used to watch incoming files and to trigger a job start for each file.
- The mechanisms for starting job chains apply as stated in the section about File Watching.
- Subsequent jobs can be executed on the JobScheduler Master or on any Agent involved.
- Any number of jobs can be executed in sequence or in parallel for incoming files.
- Incoming files can be removed or moved to a different location
  - by any of the jobs involved or
  - at the end of processing by a file order sink that is specified with the job chain.
- If an incoming file is to be processed by jobs, e.g. parsed or otherwise accessed, then the file has to be transferred to the host on which the respective job is executed.
- Use the YADE JITL Jobs to transfer files between locations with a number of protocols such as FTP, FTPS, SFTP, HTTP, WebDAV etc.
- YADE can be executed on any host and can be configured for YADE Server-to-Server File Transfer.
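As an illustration of the removal option mentioned above, a file order sink can delete an incoming file instead of moving it. The following fragment is a minimal sketch that would sit inside a `<job_chain>` element; the state names and the error directory are taken from the samples on this page, and the choice to remove files on success but keep them on error is an assumption made for illustration only.

```xml
<!-- Sketch: remove the incoming file once the order reaches the "success" state -->
<file_order_sink state="success" remove="yes"/>
<!-- Keep failed files for inspection by moving them instead of removing them -->
<file_order_sink state="error" move_to="/tmp/jobscheduler/file/error" remove="no"/>
```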
Configuration
- File watching is configured on the JobScheduler Master. No configuration is required on the JobScheduler Agent.
- The JobScheduler Master holds the configuration of a job chain with a file order source that is assigned to the Agent.
The configuration for the examples below can be downloaded here: jua_file_watching.zip
File watching for job execution on the JobScheduler Master
Let's assume the following configuration for a job chain with a file order source:
Process Class agent_in_dmz.process_class.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<process_class max_processes="10" remote_scheduler="http://dmzhost:4445">
</process_class>
Job Chain remote_files_local_processing
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain file_watching_process_class="agent_in_dmz">
    <file_order_source directory="/srv/files/1/in"/>
    <job_chain_node state="start" job="job1" next_state="continue" error_state="error"/>
    <job_chain_node state="continue" job="job2" next_state="success" error_state="error"/>
    <file_order_sink state="success" move_to="/tmp/jobscheduler/file/success" remove="no"/>
    <file_order_sink state="error" move_to="/tmp/jobscheduler/file/error" remove="no"/>
</job_chain>
Explanations
- A process class is configured in a separate file, e.g. agent_in_dmz.process_class.xml, as stated with the above sample. The process class specifies the protocol, host and port at which the Agent is operated.
- The job chain references the above process class with the file_watching_process_class attribute, which causes the subsequent configuration for file order sources and file order sinks to be applied to the respective JobScheduler Agent.
- The <file_order_source> is configured as explained for a JobScheduler Master.
- Subsequent jobs are executed on the JobScheduler Master.
- The <file_order_sink> specifies incoming files to be moved to different directories on the JobScheduler Agent host depending on the execution result of the job chain.
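Since the file order source is configured as for a Master, the watched files can presumably be narrowed down in the same way. The sketch below adds a `regex` attribute to the file order source of the sample job chain so that only files ending in `.csv` create orders. The pattern is hypothetical, and it is an assumption that `regex` behaves for Agent-side file watching exactly as it does on a Master.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain file_watching_process_class="agent_in_dmz">
    <!-- Assumed pattern: only file names ending in .csv trigger an order -->
    <file_order_source directory="/srv/files/1/in" regex="^.*\.csv$"/>
    <job_chain_node state="start" job="job1" next_state="continue" error_state="error"/>
    <job_chain_node state="continue" job="job2" next_state="success" error_state="error"/>
    <file_order_sink state="success" move_to="/tmp/jobscheduler/file/success" remove="no"/>
    <file_order_sink state="error" move_to="/tmp/jobscheduler/file/error" remove="no"/>
</job_chain>
```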
File watching for job execution on the JobScheduler Agent
The following configuration applies to a job chain that specifies file watching and job execution on the Agent:
Job Chain remote_files_remote_processing
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain process_class="agent_in_dmz">
    <file_order_source directory="/srv/files/2/in"/>
    <job_chain_node state="start" job="job1" next_state="continue" error_state="error"/>
    <job_chain_node state="continue" job="job2" next_state="success" error_state="error"/>
    <file_order_sink state="success" move_to="/tmp/jobscheduler/file/success" remove="no"/>
    <file_order_sink state="error" move_to="/tmp/jobscheduler/file/error" remove="no"/>
</job_chain>
Explanations
- A process class is configured in a separate file, e.g. agent_in_dmz.process_class.xml, as stated with the above sample. The process class specifies the protocol, host and port at which the Agent is operated.
- The job chain references the above process class with the process_class attribute, which causes the subsequent configuration for file order sources, file order sinks and jobs to be applied to the respective JobScheduler Agent.
- The <file_order_source> is configured as explained for a JobScheduler Master and applies to the JobScheduler Agent.
- Subsequent jobs are executed on the JobScheduler Agent.
- The <file_order_sink> specifies incoming files to be moved to different directories on the JobScheduler Agent host depending on the execution result of the job chain.
- For more detailed explanations see JS-1301.
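Because the jobs in this job chain run on the Agent host, they can access the incoming file directly. The following is a minimal sketch of what a job such as job1 might look like; the job body is hypothetical and not part of the downloadable sample. It assumes that the order parameter scheduler_file_path is exposed to shell jobs as the environment variable `SCHEDULER_PARAM_SCHEDULER_FILE_PATH` and that the Agent runs on a Unix host.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<job title="Hypothetical job1: count the lines of the incoming file" order="yes" stop_on_error="no">
    <script language="shell"><![CDATA[
# Assumption: the order parameter scheduler_file_path is available
# as the environment variable SCHEDULER_PARAM_SCHEDULER_FILE_PATH
echo "processing file: $SCHEDULER_PARAM_SCHEDULER_FILE_PATH"
wc -l "$SCHEDULER_PARAM_SCHEDULER_FILE_PATH"
    ]]></script>
    <run_time/>
</job>
```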
File watching with file transfer for job execution on the JobScheduler Master
The following configuration applies to a job chain that specifies file watching on the Agent, transfer of incoming files and job execution on the Master:
Job Chain remote_files_local_transfer
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain file_watching_process_class="agent_in_dmz">
    <file_order_source directory="/srv/files/3/in"/>
    <job_chain_node state="transfer" job="jade" next_state="success" error_state="error"/>
    <job_chain_node state="success"/>
    <file_order_sink state="error" move_to="/tmp/jobscheduler/file/error" remove="no"/>
</job_chain>
The job for the file transfer is configured as follows:
File Transfer Job
<?xml version="1.0" encoding="ISO-8859-1"?>
<job title="API Job for JobScheduler Advanced Data Exchange" order="yes" stop_on_error="no" name="jade">
    <description>
        <include file="jobs/jadeJob.xml"/>
    </description>
    <params>
        <param name="operation" value="move"/>
        <param name="source_host" value="dmzhost"/>
        <param name="source_protocol" value="sftp"/>
        <param name="source_ssh_auth_method" value="password"/>
        <param name="source_user" value="foo"/>
        <param name="source_password" value="bar"/>
        <param name="target_dir" value="c:\temp"/>
        <param name="target_protocol" value="local"/>
        <param name="file_path" value="%scheduler_file_path%"/>
    </params>
    <script language="java" java_class="sos.scheduler.jade.JadeJob"/>
    <run_time/>
</job>
Explanations
- A process class is configured in a separate file, e.g. agent_in_dmz.process_class.xml, as stated with the above sample. The process class specifies the protocol, host and port at which the Agent is operated.
- The job chain references the above process class with the file_watching_process_class attribute, which causes the subsequent configuration for file order sources to be applied to the respective JobScheduler Agent.
- The <file_order_source> is configured as explained for a JobScheduler Master and applies to the JobScheduler Agent.
- The YADE file transfer job is executed on the JobScheduler Master.
  - It receives the triggered file, specified in the file_path parameter with %scheduler_file_path%, from the Agent host.
    - At runtime %scheduler_file_path% is substituted with the actual path of the triggered file.
  - Alternatively, a YADE job could be run on the JobScheduler Universal Agent to send the file to the JobScheduler Master host.
- If the transfer is successful, a file_order_sink is not required, as the YADE job is configured with the move operation, which removes the input file from the Agent host after transfer.
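The alternative mentioned above, a YADE job that runs on the Agent and sends the file to the Master host, could be sketched roughly as follows. This is not part of the downloadable sample. The host name masterhost, the target directory and the credentials are placeholders, the target_* parameters are assumed to mirror the source_* parameters of the job above, and the sketch further assumes that the YADE JITL job is available on the Agent and that a `process_class` attribute on the job routes its execution to the Agent.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Sketch only: process_class on the job is assumed to execute it on the Agent -->
<job title="Hypothetical YADE job executed on the Agent" order="yes" stop_on_error="no" process_class="agent_in_dmz">
    <params>
        <param name="operation" value="move"/>
        <!-- The triggered file is local to the Agent host -->
        <param name="source_protocol" value="local"/>
        <param name="file_path" value="%scheduler_file_path%"/>
        <!-- Assumed target_* counterparts of the source_* parameters; all values are placeholders -->
        <param name="target_host" value="masterhost"/>
        <param name="target_protocol" value="sftp"/>
        <param name="target_ssh_auth_method" value="password"/>
        <param name="target_user" value="foo"/>
        <param name="target_password" value="bar"/>
        <param name="target_dir" value="/tmp/in"/>
    </params>
    <script language="java" java_class="sos.scheduler.jade.JadeJob"/>
    <run_time/>
</job>
```

With this variant the transfer direction is reversed: the Agent pushes the file, so the Master host needs to accept incoming SFTP connections from the Agent host.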