How to optimize the performance of API jobs?

Scope

API jobs are jobs that make use of the JobScheduler API Interface.
Such jobs can be optimized by considering the requirements and capabilities of the specific environment in which they are operated.
This article explains the impact of quantity structure, resource consumption and parallelism on the overall performance.
The explanations given in this article are intended to help users understand the factors that impact performance and the measures that can be taken. Performance optimization for individual environments is part of the SOS Consulting Services.

Performance is about the effective use of resources such as memory and CPU. Resources are shared across jobs, therefore you can speed up the execution of certain processes, but you will have to accept a performance degradation for other processes.
Performance optimization requires a clear goal to be achieved. There is no such thing as an "overall speed improvement"; instead, performance improvements are oriented towards balancing the use of resources for specific use cases.

To start with, let's define the exact meaning of the following terms:
- low number: 1 - 1000 objects
- medium number: 1000 - 4000 objects
- high number: 4000 - 10000 objects
Then consider the difference between a JobScheduler single instance environment and a distributed environment:
- In a single instance environment all job related objects are managed and executed within the same JobScheduler Master instance.
- In a distributed environment all job related objects are managed by the same JobScheduler Master instance, but are executed in distributed JobScheduler Agent instances at run-time, i.e. resource consumption is distributed across different servers.

The quantity structure is about the number of job related objects in use:

Number of jobs, job chains and orders
- This is about the number of job-related objects that are available in the system, regardless of whether they are running or not.
- JobScheduler has to track events for jobs, e.g. when to start and to stop jobs. Therefore a high number of job-related objects has some impact on performance. Common scenarios used in enterprise-level environments include up to 20000 jobs and 10000 job chains in a single JobScheduler instance.
Number of job nodes
- This is about the number of jobs that are used in job nodes for job chains. Jobs can be re-used for any number of job chains.
  - You could operate e.g. 1000 job chains, each using 5 job nodes with individual jobs, which results in a total of 5000 jobs.
  - You could operate e.g. 100 individual jobs that are used in 1000 job chains, each using an individual sequence of 5 out of the 100 jobs.
- The length of a job chain, i.e. the number of job nodes, is important:
  - In average scenarios job chains with up to 30 job nodes are used.
  - You can operate a single job chain with a medium number of e.g. 4000 job nodes. However, this will have a slight effect on performance as JobScheduler has to check predecessor and successor nodes for each job node.
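
The two examples above can be compared with a few lines of arithmetic. The following is a minimal sketch only: the class `QuantityStructure` and its field names are hypothetical and not part of JobScheduler. It shows that both setups have the same number of job nodes, while the number of jobs to be managed differs considerably.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityStructure:
    """Hypothetical helper for illustration, not a JobScheduler API."""
    job_chains: int       # number of job chains
    nodes_per_chain: int  # job nodes in each job chain
    distinct_jobs: int    # individual jobs referenced by the job nodes

    @property
    def job_nodes(self) -> int:
        # Every job chain contributes its own job nodes,
        # even if the underlying jobs are re-used.
        return self.job_chains * self.nodes_per_chain


# 1000 job chains, each with 5 job nodes using individual jobs
individual = QuantityStructure(job_chains=1000, nodes_per_chain=5,
                               distinct_jobs=1000 * 5)

# 1000 job chains, each using a sequence of 5 out of 100 shared jobs
reused = QuantityStructure(job_chains=1000, nodes_per_chain=5,
                           distinct_jobs=100)

for label, qs in (("individual jobs", individual), ("re-used jobs", reused)):
    print(f"{label}: {qs.job_nodes} job nodes, {qs.distinct_jobs} jobs")
```

In both cases JobScheduler handles 5000 job nodes, but with re-used jobs only 100 jobs are counted in the quantity structure instead of 5000.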

Consider what resources are consumed when running jobs:

Resources used per job
- Depending on the nature of your job it will consume:
  - an instance of a JVM
  - memory and CPU as required by the job implementation
Total of running jobs
- This number has less impact on JobScheduler than you might expect, but it affects the available resources like memory and CPU.
- For memory and CPU consumption, consider the information from the article How to determine the sizing of a JobScheduler environment.
- A common observation is that a system performs well as long as its capacity is not used up. Exceeding e.g. the memory limit of a server will result in the operating system swapping memory and degrading performance to an unacceptable level.
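
The memory limit mentioned above can be turned into a rough headroom estimate. This is a sketch under stated assumptions: the function names are hypothetical, and the figures (64 MB per job, a 16 GB server, 2 GB reserved for the operating system and other processes) are invented for illustration and are not taken from JobScheduler documentation. Measure the real consumption of your jobs before relying on such a calculation.

```python
def max_parallel_jobs(mb_per_job: int, server_mb: int, reserved_mb: int = 0) -> int:
    """Largest number of parallel jobs that fits into memory without swapping.

    Hypothetical helper: assumes every job needs the same amount of memory.
    """
    if mb_per_job < 1:
        raise ValueError("mb_per_job must be at least 1")
    usable_mb = server_mb - reserved_mb
    return max(0, usable_mb // mb_per_job)


def fits_in_memory(parallel_jobs: int, mb_per_job: int,
                   server_mb: int, reserved_mb: int = 0) -> bool:
    """True if the given number of parallel jobs stays within the memory limit."""
    return parallel_jobs * mb_per_job <= server_mb - reserved_mb


# Assumed example figures, not measured values
limit = max_parallel_jobs(mb_per_job=64, server_mb=16384, reserved_mb=2048)
print(f"parallel jobs before swapping is likely: {limit}")
print(fits_in_memory(limit + 1, 64, 16384, 2048))
```

With these assumed figures, one job more than the calculated limit already exceeds the usable memory, which is the point at which swapping would start.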

When using <job_chain max_orders="number"> you restrict the number of parallel orders in the job chain to the given number.
By default this attribute is not in effect, which allows an unlimited number of parallel orders in a job chain.
Using <job_chain max_orders="1"> results in strict serialization of orders. The next order will enter the job chain only after the previous order has completed the final node of the job chain.
Consider whether your business requirements force orders to be serialized. Better performance is achieved by enabling orders to be processed in parallel.
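
The effect of max_orders on elapsed time can be illustrated with a small simulation. This is a simplified model and not JobScheduler code: the function `total_elapsed` is hypothetical, all orders are assumed to be available at the same time, and contention for memory and CPU between parallel orders is ignored, so real gains from parallelism will be smaller than the model suggests.

```python
import heapq
from typing import Optional, Sequence


def total_elapsed(order_durations: Sequence[float],
                  max_orders: Optional[int] = None) -> float:
    """Elapsed time until all orders have left the job chain.

    order_durations: time each order needs to pass the complete job chain.
    max_orders: maximum number of orders in the chain at once (None = unlimited).
    """
    if max_orders is not None and max_orders < 1:
        raise ValueError("max_orders must be at least 1")
    if not order_durations:
        return 0.0

    slots = len(order_durations) if max_orders is None \
        else min(max_orders, len(order_durations))

    # Min-heap of the times at which each slot in the job chain becomes free.
    free_at = [0.0] * slots
    for duration in order_durations:
        # The next order enters as soon as the earliest slot is free.
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    return max(free_at)


durations = [10, 10, 10, 10]  # assumed: four orders, 10 seconds each
print(total_elapsed(durations, max_orders=1))  # strict serialization
print(total_elapsed(durations, max_orders=2))  # two parallel orders
print(total_elapsed(durations))                # no limit
```

In this model four orders of 10 seconds each take 40 seconds with max_orders="1", 20 seconds with max_orders="2" and 10 seconds without a limit.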