Description of the JobSchedulerExistsFile Job - check whether a file exists

Checks for the existence of a file, a directory or for specific files inside of a directory. 

The polling shown in the graphic above is provided by a file order source setting in the job chain.

Example Job to Create a Result Set.

This job creates a result set. A result set contains the names of all files that are selected by the filter criteria. The content of the result set is returned as a parameter and can also be written to a file.
The following parameters are useful for creating a result set:

| Parameter Name | Description |
| --- | --- |
| raise_error_if_result_set_is | Raise error on expected size of result-set |
| result_list_file | Name of the result-list file |
| expected_size_of_result_set | Number of expected hits in result-list |
| on_empty_result_set | Set next node on empty result set |
| scheduler_sosfileoperations_resultset | The result of the operation as a list of items |
| scheduler_sosfileoperations_resultsetsize | The number of hits in the result set of the operation |

An example for a job-xml file:

Code Block

  <job order='no' >
     <params>
       <param name="file" value="." />
       <param name="file_spec" value="" />
       <param name="gracious" value="false" />
       <param name="max_file_age" value="0" />
       <param name="min_file_age" value="0" />
       <param name="max_file_size" value="-1" />
       <param name="min_file_size" value="-1" />
       <param name="skip_first_files" value="0" />
       <param name="skip_last_files" value="0" />
       <param name="count_files" value="false" />
       <param name="create_order" value="false" />
       <param name="create_orders_for_all_files" value="false" />
       <param name="order_jobchain_name" value="" />
       <param name="next_state" value="" />
       <param name="on_empty_result_set" value="empty" />
       <param name="expected_size_of_result_set" value="0" />
       <param name="raise_error_if_result_set_is" value="0" />
       <param name="result_list_file" value="empty" />
     </params>
     <script language="java" java_class="sos.scheduler.file.JobSchedulerExistsFile" />
  </job>
 

This job can be used standalone, as a single job, or as an order-driven job at a node of a job chain. Parameters are accepted as job parameters or as order parameters respectively.
A job can process multiple parameters that are analyzed when the job starts. Parameters are defined in the configuration of the job or of the order and can also be submitted by API methods. Parameters are either optional or mandatory and may have default values.

This job can create file-orders. It can be specified that a file-order is created for the first file of the result set only or for all files of the result set.

Parameter Definitions

Parameters Used by JobSchedulerExistsFile

| Name | Description | Mandatory | Default |
| --- | --- | --- | --- |
| file | File or Folder to watch for | true | . |
| file_spec | Regular Expression for filename filtering | false | |
| gracious | Specify error message tolerance | false | false |
| max_file_age | Maximum age of a file | false | 0 |
| min_file_age | Minimum age of a file | false | 0 |
| max_file_size | Maximum size of a file | false | -1 |
| min_file_size | Minimum size of one or multiple files | false | -1 |
| skip_first_files | Number of files to remove from the top of the result-set | false | 0 |
| skip_last_files | Number of files to remove from the bottom of the result-set | false | 0 |
| count_files | Return the size of resultset | false | false |
| create_order | Activate file-order creation | false | false |
| create_orders_for_all_files | Create a file-order for every file in the result-list | false | false |
| create_orders_for_new_files | Create a file-order for every new file in the result-list | false | false |
| param_name_file_path | The name of the parameter that contains the name of the file to be transferred | false | --- |
| order_jobchain_name | The name of the jobchain which belongs to the order | false | |
| next_state | The first node to execute in a jobchain | false | |
| merge_order_parameter | Merge actual order parameter into new created order | false | false |
| on_empty_result_set | Set next node on empty result set | false | empty |
| expected_size_of_result_set | Number of expected hits in result-list | false | 0 |
| raise_error_if_result_set_is | Raise error on expected size of result-set | false | 0 |
| result_list_file | Name of the result-list file | false | empty |
| check_steady_state_of_files | Check the completeness of a file (steady state) | false | false |
| steady_state_count | Maximum Number of Checkpoints | false | 30 |
| check_steady_state_interval | Temporal distance between checkpoints | false | 1 |

Parameter file: File or Folder to watch for

Checked file or directory.
Supports masks for substitution in the file name and directory name with format strings that are enclosed by [ and ]. The following format strings are supported:

Code Block

 [date: date format]
 
 date format must be a valid Java date format string, e.g. yyyyMMddHHmmss, yyyy-MM-dd.HHmmss etc.
 

An example:

Code Block

 <param name="file" value="sample/hello[date:yyyyMMdd].txt" />  
 

On 2050-12-31 the parameter file contains the value "sample/hello20501231.txt".
This parameter supports substitution of job parameter names with their values if the job parameter name is enclosed by % and %.
An example: <param name="file" value="%scheduler_file_path%" />
At job runtime the parameter file then contains the value of the job parameter scheduler_file_path. When Directory Monitoring with File Orders is used, the job parameter scheduler_file_path automatically contains the path of the file that triggered the order.
Data-Type: SOSOptionString
The default value for this parameter is "." (a single dot).
This parameter is mandatory.

Parameter file_spec: Regular Expression for filename filtering

Regular expression for file filtering. Matching is case-insensitive (CASE_INSENSITIVE).
Only effective if the parameter file is a directory.
Some remarks on regular expressions as used in JobScheduler:

  • A regular expression is not a wildcard. To get an impression of the difference, consider the wildcard *.txt, which selects all file names with the extension ".txt". A regular expression that works the same way must look like "^.*\.txt$". That looks a little strange, but it is much more flexible and powerful than wildcards when filtering for more complex names or patterns.
  • The general syntax of a regular expression, also referred to as regex or regexp, is described here. It differs from other RegExp definitions, e.g. the one used by Perl.

Data-Type: SOSOptionRegExp
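
As a minimal sketch of how file and file_spec work together, the fragment below points file at a directory and filters its content with a regular expression for files with the extension ".txt". The directory name /tmp/inbound is a made-up placeholder and is not taken from this documentation.

```xml
<params>
  <!-- hypothetical directory to check; file_spec is only effective for directories -->
  <param name="file" value="/tmp/inbound" />
  <!-- regular expression (not a wildcard): any file name ending in .txt -->
  <param name="file_spec" value="^.*\.txt$" />
</params>
```

Because matching is case-insensitive, this filter should also select names such as REPORT.TXT.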

Parameter gracious: Specify error message tolerance

Enables or disables error messages that are caused by an empty result set returned by an operation of the job. This parameter can therefore control the sequence of nodes or states in a job chain.
Valid values:

Code Block

 false, 0, off, no, n, nein, none, true, 1, on, yes, y, ja, j and all.
 

The following rules apply when the result set is empty:

| gracious | Standalone Job | Order Job |
| --- | --- | --- |
| false, 0, off, no, n, nein, none | error log, Task error | error log, set_state error |
| true, 1, on, yes, y, ja, j | no error log, Task success | no error log, set_state error |
| all | no error log, Task success | no error log, set_state success |

For example, the setting "gracious=all" suppresses all errors regarding an empty result set and lets the job (standalone or inside a job chain) terminate as if no error had occurred.
Data-Type: SOSOptionGracious
The default value for this parameter is false.
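
The following fragment is a sketch of a configuration that tolerates an empty result set. The directory /tmp/inbound and the file filter are assumptions chosen for illustration only.

```xml
<params>
  <!-- hypothetical directory and filter -->
  <param name="file" value="/tmp/inbound" />
  <param name="file_spec" value="^.*\.csv$" />
  <!-- "all": no error log and a successful end, even if no file matches -->
  <param name="gracious" value="all" />
</params>
```

With the value "true" instead of "all", an order-driven job would still end with set_state error, only without an entry in the error log.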

Parameter max_file_age: Maximum age of a file

Specifies the maximum age of a file. If a file is older, it is deemed not to exist and is not included in the result list.
Data-Type: SOSOptionTime
The default value for this parameter is 0.

Parameter min_file_age: Minimum age of a file

Specifies the minimum age of a file. If a file is newer, it is classified as non-existing and is not included in the result list.
Data-Type: SOSOptionTime
The default value for this parameter is 0.

Parameter max_file_size: Maximum size of a file

Specifies the maximum size of a file in bytes. If the size of a file exceeds this value, the file is classified as non-existing.
Valid values for file size are

...

Data-Type: SOSOptionFileSize
The default value for this parameter is -1.

Parameter min_file_size: Minimum size of one or multiple files

Specifies the minimum size of one or multiple files in bytes. If the size of a file falls below this value, the file is not included in the result list of the operation.
Valid values for file size are

...

Data-Type: SOSOptionFileSize
The default value for this parameter is -1.

Parameter skip_first_files: Number of files to remove from the top of the result-set

The given number of files is removed from the beginning of the set that results from min_file_size, min_file_age etc. These files are excluded from further operations.
The result set is sorted according to the filter parameters used:

...

Only one of skip_first_files and skip_last_files may be set at the same time.
Data-Type: SOSOptionInteger
The default value for this parameter is 0.

Parameter skip_last_files: Number of files to remove from the bottom of the result-set

The given number of files is removed from the end of the set that results from min_file_size, min_file_age etc. These files are excluded from further operations.

The result set is sorted according to the constraining parameters used:

  • min_file_age, max_file_age: in ascending order by date of last modification, the newest file first.
  • min_file_size, max_file_size: in ascending order by file size, the smallest file first.

If parameters for file age as well as file size are given, the set is sorted by file age.

Only one of skip_first_files and skip_last_files may be set at the same time.
Data-Type: SOSOptionInteger
The default value for this parameter is 0.
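
The interaction between a size filter and skip_last_files can be sketched as follows. The directory /tmp/inbound is a hypothetical placeholder, and the fragment assumes that a plain number of bytes is accepted as a file size value.

```xml
<params>
  <!-- hypothetical directory -->
  <param name="file" value="/tmp/inbound" />
  <!-- assumed: plain byte count; excludes empty files -->
  <param name="min_file_size" value="1" />
  <!-- with a size filter the set is sorted by size, smallest first,
       so this removes the largest file from the result set -->
  <param name="skip_last_files" value="1" />
</params>
```

skip_first_files is deliberately left at its default here, because only one of the two skip parameters may be set.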

Parameter count_files: Return the size of resultset

If this parameter is set to "true", the number of matches is returned in the order parameter "scheduler_SOSFileOperations_file_count".
Valid values: true, 1, on, yes, y, ja, j and false, 0, off, no, n, nein
This parameter is available for order-driven jobs only, e.g. jobs that run in a job chain. In standalone jobs this parameter is ignored without further notice.
Data-Type: SOSOptionBoolean
The default value for this parameter is false.

Parameter create_order: Activate file-order creation

This parameter specifies that a file-order is created and launched, either for all file names in the result list or for the first file only (see create_orders_for_all_files).
Valid values: true, 1, on, yes, y, ja, j and false, 0, off, no, n, nein
Data-Type: SOSOptionBoolean
The default value for this parameter is false.
Use together with parameters:

  • create_orders_for_all_files - Create a file-order for every file in the result-list
  • order_jobchain_name
  • next_state

Parameter create_orders_for_all_files: Create a file-order for every file in the result-list

Valid values: true, 1, on, yes, y, ja, j and false, 0, off, no, n, nein
Data-Type: SOSOptionBoolean
The default value for this parameter is false.
Use together with parameters:

  • create_order - Activate file-order creation
  • order_jobchain_name
  • next_state
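
A sketch of how the order-related parameters might be combined is shown below. The job chain path is the one used as an example in the description of order_jobchain_name; the directory /tmp/inbound and the node name process_file are hypothetical.

```xml
<params>
  <!-- hypothetical directory -->
  <param name="file" value="/tmp/inbound" />
  <!-- activate file-order creation ... -->
  <param name="create_order" value="true" />
  <!-- ... for every file of the result set, not only the first one -->
  <param name="create_orders_for_all_files" value="true" />
  <!-- job chain "Test" located in live/sample/FileOperations/ -->
  <param name="order_jobchain_name" value="/sample/FileOperations/Test" />
  <!-- hypothetical node at which the created orders start -->
  <param name="next_state" value="process_file" />
</params>
```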

Parameter create_orders_for_new_files: Create a file-order for every new file in the result-list

If this parameter is set to "true", a file-order is created and started for each new file in the result set.

This parameter is in effect only if the create_orders parameter is not set or has the value "true".

Example 1: create a file-order

    create_orders_for_new_files=true

Valid values: true, 1, on, yes, y, ja, j and false, 0, off, no, n, nein.

DataType: SOSOptionBoolean

Default: false

Parameter param_name_file_path: The name of the parameter containing the name of the file to be transferred

This parameter sets the name of the parameter that contains the name of the transferred file. The default value is scheduler_file_path. The name should be changed from the default if the file-orders to be created should not have to handle a file sink.

DataType: SOSOptionString

Default: ---

Parameter order_jobchain_name: The name of the jobchain which belongs to the order

The value of this parameter is the name of the job chain that is to be launched by the order.
Note that the name of the job chain must include its subfolder structure if the job chain is not located directly in the folder "live". An example: the job chain "Test" is located in "live/sample/FileOperations/". The value to be specified is then "/sample/FileOperations/Test".
Data-Type: SOSOptionString
Use together with parameters:

  • create_order - Activate file-order creation
  • next_state

Parameter next_state: The first node to execute in a jobchain

The value of this parameter is the name of the job chain node at which execution of the chain is to start.
Data-Type: SOSOptionJobChainNode
Use together with parameters:

  • create_order - Activate file-order creation
  • order_jobchain_name

Parameter merge_order_parameter: Merge actual order parameter into new created order

This parameter specifies that the order to be created is extended by the parameters of the current order.

DataType: SOSOptionBoolean
Default: false

Parameter on_empty_result_set: Set next node on empty result set

The next node (step, job) to execute in a job chain can be set with this parameter. The value of the parameter is a valid node name of the current job chain. In case of an empty result set, e.g. due to non-existent files, the current job ends without an error and the job chain continues with the node whose name is given as the value of this parameter.
Data-Type: SOSOptionJobChainNode
The default value for this parameter is empty.
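
As an illustration, the fragment below sends the order to a dedicated node when no file matches. The directory /tmp/inbound and the node name nothing_to_do are hypothetical; the node would have to exist in the current job chain.

```xml
<params>
  <!-- hypothetical directory and filter -->
  <param name="file" value="/tmp/inbound" />
  <param name="file_spec" value="^.*\.txt$" />
  <!-- hypothetical node name: continue here if the result set is empty -->
  <param name="on_empty_result_set" value="nothing_to_do" />
</params>
```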

Parameter expected_size_of_result_set: Number of expected hits in result list

Data-Type: SOSOptionInteger
The default value for this parameter is 0.
Use together with parameter:

  • raise_error_if_result_set_is - Raise error on expected size of result-set

Parameter raise_error_if_result_set_is: Raise error on expected size of result set

With this parameter it is possible to raise an error if the number of hits in the result list relates to the expected size in the way specified by the value of this parameter (a relational operator).
An example:
Assume that the parameter "raise_error_if_result_set_is=ne" is defined and the parameter "expected_size_of_result_set=1" is specified as well. If the number of hits is not equal to 1, an error will be raised.

Data-Type: SOSOptionRelOp
The default value for this parameter is 0.
Use together with parameter:

  • expected_size_of_result_set - Number of expected hits in result-list
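
The example above can be written as job parameters roughly as follows. This is a sketch only; the directory /tmp/inbound and the file filter are assumptions added for illustration.

```xml
<params>
  <!-- hypothetical directory and filter -->
  <param name="file" value="/tmp/inbound" />
  <param name="file_spec" value="^.*\.txt$" />
  <!-- exactly one matching file is expected -->
  <param name="expected_size_of_result_set" value="1" />
  <!-- "ne": raise an error if the number of hits is not equal to the expected size -->
  <param name="raise_error_if_result_set_is" value="ne" />
</params>
```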

Parameter result_list_file: Name of the result list file

If the value of this parameter specifies a valid filename, the result list is written to this file.
Data-Type: SOSOptionFileName
The default value for this parameter is empty.
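
A minimal sketch of writing the result list to a file might look like this. Both paths are hypothetical placeholders.

```xml
<params>
  <!-- hypothetical directory to check -->
  <param name="file" value="/tmp/inbound" />
  <!-- hypothetical file that receives the names of all selected files -->
  <param name="result_list_file" value="/tmp/inbound_result_list.txt" />
</params>
```

The same list is also returned in the parameter scheduler_sosfileoperations_resultset, so the file is only needed if the list is to be processed outside of the job chain.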

Parameter check_steady_state_of_files: Check the completeness of a file (steady state)

In some file transfer scenarios the receiver of a file does not know when the sender creates the file. With a (very) large file, the receiver may try to read the file before the sender has finished writing it. If the receiver fetches the file while the sender is still writing, it gets a corrupted, incomplete file.

If this parameter is set to "true", the receiver checks the file for completeness before it starts the transfer.

This is not a very reliable approach, because the receiver only checks the date of last modification and the size of the file. If neither changes within a time interval, which is defined by the parameters ..., the file is assumed to be complete. If the sender is terminated without writing the complete file, or the network is down, or the file is written slowly, the receiver will still get a corrupted file.

A better approach for avoiding corrupt files is the atomic method: the file is written under a temporary name and renamed after writing has completed. For more details about this method see parameter atomic_suffix or atomic_prefix.

If more than one file is to be transferred, the transactional approach is the first choice. See parameter transactional.

DataType: SOSOptionBoolean

Default: false

Parameter steady_state_count: Maximum Number of Checkpoints

The value of this option specifies the number of retries for checking the steady state of a file.

DataType: SOSOptionInteger

Default: 30

Parameter check_steady_state_interval: Temporal distance between checkpoints

The value of this option defines the temporal distance in seconds between two checkpoints.

DataType: SOSOptionTime

Alias: Steady_State_Interval

Default: 1
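
The three steady state parameters might be combined as in the following sketch, which simply spells out the documented defaults for the count and the interval. The directory /tmp/inbound is a hypothetical placeholder.

```xml
<params>
  <!-- hypothetical directory -->
  <param name="file" value="/tmp/inbound" />
  <!-- check that size and modification date no longer change -->
  <param name="check_steady_state_of_files" value="true" />
  <!-- at most 30 checkpoints (the documented default) -->
  <param name="steady_state_count" value="30" />
  <!-- 1 second between two checkpoints (the documented default) -->
  <param name="check_steady_state_interval" value="1" />
</params>
```

With these values the job would presumably wait for roughly 30 seconds at most before it gives up on a file that keeps changing.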

Return Parameters from JobSchedulerExistsFile

The order parameters described below are returned by the job to the JobScheduler.

| Name | Title | Mandatory | Default |
| --- | --- | --- | --- |
| scheduler_file_path | File to process for a file-order | false | empty |
| scheduler_file_parent | Pathname of the file to process for a file-order | false | empty |
| scheduler_file_name | Name of the file to process for a file-order | false | empty |
| scheduler_sosfileoperations_resultset | The result of the operation as a list of items | false | empty |
| scheduler_sosfileoperations_resultsetsize | The number of hits in the result set of the operation | false | empty |
| scheduler_sosfileoperations_file_count | Return the size of the result set after a file operation | false | 0 |


Parameter scheduler_file_path: File to process for a file-order

When Directory Monitoring with File Orders is used, the job parameter scheduler_file_path automatically contains the path of the file that triggered the order.
Data-Type: SOSOptionFileName
The default value for this parameter is empty.

Parameter scheduler_file_parent: Pathname of the file to process for a file-order

Data-Type: SOSOptionFileName
The default value for this parameter is empty.

Parameter scheduler_file_name: Name of the file to process for a file-order

Data-Type: SOSOptionFileName
The default value for this parameter is empty.

Parameter scheduler_sosfileoperations_resultset: The result of the operation as a list of items

Data-Type: SOSOptionString
The default value for this parameter is empty.

Parameter scheduler_sosfileoperations_resultsetsize: The number of hits in the result set of the operation

Data-Type: SOSOptionInteger
The default value for this parameter is empty.

Parameter scheduler_sosfileoperations_file_count: Return the size of the result set after a file operation

Data-Type: SOSOptionInteger
The default value for this parameter is 0.