
Scope

  • API jobs are jobs that make use of the JobScheduler API interface.
  • Such jobs can be optimized for the requirements and capabilities of the specific environment in which they are operated.
  • This article explains the impact of quantity structure, resource consumption and parallelism on the overall performance.
  • The explanations given in this article are intended to help users understand the factors that impact performance and possible measures. Performance optimization for individual environments is part of the SOS Consulting Services.

Performance Goals

  • Performance is about the effective use of resources such as memory and CPU. Resources are shared across jobs; you can therefore speed up the execution of certain processes, but you will have to accept a performance degradation for other processes.
  • Performance optimization requires a clear goal to be achieved. There is no such thing as an "overall speed improvement"; instead, performance improvements are oriented towards balancing the use of resources for specific use cases.

Performance Factors

  • To start with, let's define the exact meaning of the following terms:
    • low number: 1 - 1000 objects
    • medium number: 1000 - 4000 objects
    • high number: 4000 - 10000 objects
  • Then keep in mind the difference between a JobScheduler single instance environment and a distributed environment:
    • In a single instance environment all job related objects are managed and executed within the same JobScheduler Master instance.
    • In a distributed environment all job related objects are managed by the same JobScheduler Master instance, but are executed by distributed JobScheduler Agent instances at run-time, i.e. resource consumption is distributed across different servers.
  • Consider the key factors for performance: quantity structure, resource consumption and concurrency.

Quantity Structure

The quantity structure is about the number of job related objects in use:

  • Number of jobs, job chains and orders
    • This is about the number of job-related objects that are available in the system, independently of whether they are running or not.
    • JobScheduler has to track events for jobs, e.g. when to start and stop jobs. Therefore a high number of job-related objects creates some performance impact. Common scenarios in enterprise-level environments include up to 20000 jobs and 10000 job chains in a single JobScheduler instance.
  • Number of job nodes
    • This is about the number of jobs that are used in job nodes of job chains. Jobs can be re-used in any number of job chains (see the configuration sketch following this list):
      • You could operate e.g. 1000 job chains, each using 5 job nodes with individual jobs, which results in a total of 5000 jobs.
      • You could operate e.g. 100 individual jobs that are used in 1000 job chains, each job chain using an individual sequence of 5 out of the 100 jobs.
    • The length of a job chain, i.e. the number of job nodes, is important:
      • In average scenarios job chains with up to 30 job nodes are used.
      • You can operate a single job chain with a medium number of e.g. 4000 job nodes. In fact this will have a slight effect on performance as JobScheduler has to check predecessor and successor nodes for each job node.
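
For illustration, a minimal job chain configuration could look like the following sketch (the job names extract, transform and load are hypothetical). The same jobs could be referenced by the job nodes of any number of other job chains:

    <!-- job chain with three job nodes referencing individually configured jobs -->
    <job_chain name="transfer_chain">
        <job_chain_node state="100" job="extract"   next_state="200"     error_state="error"/>
        <job_chain_node state="200" job="transform" next_state="300"     error_state="error"/>
        <job_chain_node state="300" job="load"      next_state="success" error_state="error"/>
        <job_chain_node.end state="success"/>
        <job_chain_node.end state="error"/>
    </job_chain>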

Resource Consumption

Consider what resources are consumed when running jobs:

  • System Resources are consumed depending on the nature of your jobs and include resources such as
    • an instance of a JVM,
    • storage, memory and CPU as required by the job implementation,
    • individual resources such as access to objects in a database or file system. 
  • JobScheduler Resources are provided by the JobScheduler instance and are shared by jobs, such as
    • objects and methods of the JobScheduler API that are served by the JobScheduler Master,
    • Locks that are accessed to prevent or restrict concurrent access to resources.

Concurrency

Concurrent access to resources has the potential to slow down performance. The key factors are:

  • Total number of running jobs
    • This number has less impact on JobScheduler than you might expect, but it affects the available resources like memory and CPU.
    • Consider the information from the article How to determine the sizing of a JobScheduler environment for memory and CPU consumption.
    • A common observation is that a system performs well as long as its capacity is not used up. Exceeding e.g. the memory limit of a server will result in the operating system swapping memory and will cause unacceptable performance penalties.
  • Race Conditions
    • Multiple jobs accessing the same resources, e.g. shared Locks or objects in a database or file system, tend to cause race conditions.
    • Analyze the resources used by your jobs, e.g. the use of exclusive vs. shared Locks or access to database tables, to identify possible bottlenecks that would force serialized execution of processes that are assumed to run in parallel (see the sketch following this list).
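
As an illustration, the following configuration sketch (with hypothetical lock and job names) declares a Lock and uses it non-exclusively for a reading job and exclusively for a writing job. Tasks holding the lock non-exclusively can run in parallel, whereas the exclusive lock serializes access:

    <!-- lock declaration, e.g. in a hot folder file my_db.lock.xml -->
    <lock name="my_db" max_non_exclusive="5"/>

    <!-- reading job: up to 5 tasks can hold the lock non-exclusively in parallel -->
    <job name="read_job">
        <lock.use lock="my_db" exclusive="no"/>
        <script language="shell"><![CDATA[
            echo "reading from the database"
        ]]></script>
    </job>

    <!-- writing job: the exclusive lock forces serialized execution -->
    <job name="write_job">
        <lock.use lock="my_db" exclusive="yes"/>
        <script language="shell"><![CDATA[
            echo "writing to the database"
        ]]></script>
    </job>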

Measures for Performance Optimization

Parallelism

JobScheduler is designed for parallelism as the most effective means to improve performance.

Parallel Orders

  • Using <job_chain max_orders="number"> restricts the number of parallel orders in a job chain to number (see the example following this list).
  • By default this attribute is not set, which allows an unlimited number of parallel orders in a job chain.
  • Using <job_chain max_orders="1"> results in strict serialization of orders: the next order will enter the job chain only after the previous order has completed the final node of the job chain.
  • Consider if your business requirements force orders to be serialized. Better performance is achieved by enabling orders to be processed in parallel.
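
A minimal configuration sketch (with hypothetical job chain and job names) that allows up to 10 orders to be processed in the job chain at the same time:

    <!-- up to 10 orders can be active in this job chain in parallel -->
    <job_chain name="parallel_chain" max_orders="10">
        <job_chain_node state="100" job="process" next_state="success" error_state="error"/>
        <job_chain_node.end state="success"/>
        <job_chain_node.end state="error"/>
    </job_chain>

Note that parallel orders additionally require the jobs used in the job nodes to allow parallel tasks, see the next section.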

Parallel Tasks

  • Using <job tasks="number"> allows up to number tasks of a job to run in parallel. By default a job runs at most one task at a time (tasks="1"), i.e. orders arriving at the respective job node are processed sequentially (see the example below).
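
A minimal sketch (with a hypothetical job name), assuming the job implementation is safe for parallel execution:

    <!-- up to 10 tasks of this job can run in parallel -->
    <job name="process" tasks="10">
        <script language="shell"><![CDATA[
            echo "processing an order"
        ]]></script>
    </job>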

Process Classes
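
  • Jobs can be assigned to a process class by use of <job process_class="name">. A process class limits the number of processes that run in parallel by use of <process_class max_processes="number"> and can delegate the execution of tasks to a JobScheduler Agent by use of its remote_scheduler attribute.
  • Process classes are a means to partition resources: e.g. resource-intensive jobs can be assigned to a process class with a small number of processes without limiting parallelism for other jobs.

A minimal sketch with hypothetical names and a hypothetical Agent host/port:

    <!-- process class limiting the number of parallel processes -->
    <process_class name="limited" max_processes="10"/>

    <!-- process class delegating execution to a JobScheduler Agent -->
    <process_class name="agent_1" remote_scheduler="agent-host:4445"/>

    <!-- job assigned to a process class -->
    <job name="process" process_class="limited" tasks="10">
        <script language="shell"><![CDATA[
            echo "processing an order"
        ]]></script>
    </job>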

Pre-loading of tasks

  • Using <job min_tasks="number"> causes JobScheduler to keep at least number tasks started for a job. Tasks are thus pre-loaded before orders arrive, which saves start-up time, e.g. the start-up of a JVM for Java API jobs.
  • Using <job idle_timeout="duration"> specifies how long an idle task waits for a next order before it terminates; the default is 5 seconds. A higher value increases the re-use of tasks at the price of idle tasks occupying resources.
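
A minimal sketch (with a hypothetical job name) that keeps at least 2 tasks pre-loaded and lets idle tasks wait 60 seconds for a next order:

    <!-- at least 2 tasks are kept started, idle tasks wait 60s before terminating -->
    <job name="process" tasks="10" min_tasks="2" idle_timeout="60">
        <script language="shell"><![CDATA[
            echo "processing an order"
        ]]></script>
    </job>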

Performance Measurement

  • Measurement of Tasks vs. Orders
    • The default behavior for jobs in a job chain is to let the task of the respective job continue for another 5 seconds after an order has passed the job node (this corresponds to the default idle_timeout explained above).
    • This behavior is intended as an optimization that allows the same task (system process) to be re-used for the next order entering the job node.
    • Therefore it is pointless to measure the duration of individual tasks; instead, the time consumption of orders has to be considered, i.e. the time required to pass an individual job node or to complete the job chain.
  • Use of Profilers
    • For Java API jobs the use of profilers is an appropriate means of checking the time consumption of an individual job execution.
    • However, such tools are often unable to cope with the complexity of parallel processes in a system.
    • Last but not least, such tools cause a performance impact of their own.
    • Therefore we recommend using profilers for frequency analysis of code in individual job implementations, but not for the measurement of JobScheduler performance.
  • Recommendations
    • For performance measurement use the timestamps provided by the JobScheduler database for orders:
      • table SCHEDULER_ORDER_HISTORY: the columns START_TIME and END_TIME provide the time required to complete the job chain,
      • table SCHEDULER_ORDER_STEP_HISTORY: the columns START_TIME and END_TIME provide the time required to complete the respective job node.
    • In addition you can use your own logging by use of the spooler_log.info() method that is available from the JobScheduler API, or by logging timestamps to individual files.

 

More information about the SOS Consulting Services is available from the company web site.
