How to optimize the performance of API jobs?

Scope

API jobs make use of the JobScheduler API Interface.
Such jobs can be optimized by taking into account the requirements and capabilities of the environment in which they are operated.

Performance Goals

Performance is about the effective use of resources. We can speed up the execution of certain processes, but in return we have to accept a performance degradation for other processes.
Performance optimization requires a clear goal. There is no such thing as an "overall speed improvement"; instead, performance improvements are oriented towards specific use cases.

Resource Consumption

To start with, let's define what the following terms mean in this article:
- low number: 1 - 1000 objects
- medium number: 1000 - 4000 objects
- high number: 4000 - 10000 objects
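When sizing an environment, the terms above can be expressed as a small helper. This is an illustrative sketch only: the function name `classify_object_count` is hypothetical and not part of JobScheduler. Because the ranges as stated share their boundary values (1000 and 4000), the sketch assumes that a boundary value belongs to the lower category.

```python
def classify_object_count(count: int) -> str:
    """Map a number of objects to the size terms defined in this article.

    Assumption: the shared boundary values 1000 and 4000 are counted
    towards the lower category.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count <= 1000:
        return "low"
    if count <= 4000:
        return "medium"
    if count <= 10000:
        return "high"
    # The article does not name a category above 10000 objects.
    return "above the ranges defined here"


if __name__ == "__main__":
    for n in (30, 4000, 5000, 15000):
        print(n, "->", classify_object_count(n))
```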
Next, let's consider the difference between a single instance environment and a distributed environment:
- In a single instance environment all job-related objects are managed and executed within the same instance (JobScheduler Master).
- In a distributed environment all job-related objects are managed by the same instance (JobScheduler Master), but are executed on distributed instances (JobScheduler Agents) at run-time.
As a first step, let's check which resources are consumed when operating jobs:
- Number of job-related objects
  - This is the number of job-related objects, i.e. jobs, job chains and orders, that are available in the system, regardless of whether they are running or not.
  - JobScheduler has to track events for jobs, e.g. when to start and stop them. Therefore a high number of job-related objects has an impact on performance. Some environments in common use include up to 15000 jobs and 5000 job chains.
- Number of job nodes
  - This is the number of jobs that are used in the job nodes of job chains.
    - Jobs can be re-used for any number of job chains.
      - You could operate 1000 job chains, each using 5 job nodes with individual jobs, which results in a total of 5000 jobs.
      - Alternatively, you could operate e.g. 100 different jobs that are organized in 1000 job chains, each using a different sequence of 5 jobs.
    - The length of a job chain, i.e. the number of job nodes, is important:
      - In common environments job chains with up to 30 job nodes are used.
      - You can operate a job chain with a high number of job nodes, e.g. 4000. However, this will have a slight effect on performance, as JobScheduler has to check predecessor and successor job nodes.
- Resources used per job
  - Depending on its nature, a job will consume:
    - an instance of a JVM
    - memory and CPU as required by your job implementation
- Total of running jobs
  - This number has less impact on JobScheduler than you would expect, but it does affect the available resources such as memory and CPU.
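The two job chain layouts described under "Number of job nodes" can be compared with a little arithmetic. The sketch below is a rough illustration, not a JobScheduler API: the function names are hypothetical, and it counts only jobs and job chains, ignoring orders, which also count as job-related objects.

```python
def jobs_with_individual_jobs(job_chains: int, nodes_per_chain: int) -> int:
    """Number of jobs needed when every job node uses its own job."""
    return job_chains * nodes_per_chain


def objects_to_track(jobs: int, job_chains: int) -> int:
    """Jobs plus job chains; orders are deliberately left out (assumption)."""
    return jobs + job_chains


if __name__ == "__main__":
    chains = 1000
    nodes_per_chain = 5

    # Layout 1: each job node has an individual job.
    individual_jobs = jobs_with_individual_jobs(chains, nodes_per_chain)

    # Layout 2: 100 different jobs are re-used across all job chains.
    reused_jobs = 100

    print("individual jobs:", individual_jobs,
          "objects:", objects_to_track(individual_jobs, chains))
    print("re-used jobs:", reused_jobs,
          "objects:", objects_to_track(reused_jobs, chains))
```

Both layouts have the same number of job nodes (1000 job chains with 5 job nodes each), but re-using jobs greatly reduces the number of jobs that have to be tracked.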
