
Scope

  • API jobs make use of the JobScheduler API Interface
  • Such jobs should be optimized for the requirements and capabilities of the specific environment in which they are operated.
  • This article explains the impact of the quantity structure, resource consumption and parallelism on overall performance.
  • The explanations in this article are intended to help users understand the factors that impact performance. Performance optimization for individual environments is part of our Consulting Services.

Performance Goals

  • Performance is about the effective use of resources such as memory and CPU. Resources are shared across jobs, therefore we can speed up the execution of certain processes, but we have to accept a performance degradation for other processes.
  • Performance optimization requires a clear goal. There is no such thing as an "overall speed improvement"; instead, performance improvements are oriented towards a balanced use of resources for specific use cases.

Performance Factors

  • To begin with, let's define what is meant by the following terms:
    • low number: 1 - 1000 objects
    • medium number: 1000 - 4000 objects
    • high number: 4000 - 10000 objects
  • Then consider the difference between a single instance and a distributed environment:
    • In a single instance environment all job related objects are managed and executed within the same instance (JobScheduler Master).
    • In a distributed environment all job related objects are managed by the same instance (JobScheduler Master), but are executed on distributed instances (JobScheduler Agents) at run-time.

Quantity Structure

The quantity structure is about the number of job-related objects in use:

  • Number of jobs, job chains and orders
    • This is about the number of job-related objects that are available in the system, regardless of whether they are currently running.
    • JobScheduler has to track events for jobs, e.g. when to start and stop them. Therefore a high number of job-related objects has a performance impact. Some commonly used environments include up to 15000 jobs and 5000 job chains.
  • Number of job nodes
    • This is about the number of jobs that are used in job nodes for job chains.
      • Jobs can be re-used for any number of job chains (see the configuration sketch after this list):
        • You could operate 1000 job chains, each using 5 job nodes with individual jobs, which results in a total of 5000 jobs.
        • Alternatively, you could operate e.g. 100 different jobs that are organized into 1000 job chains, each using a different sequence of 5 jobs.
      • The length of a job chain, i.e. the number of job nodes, is important:
        • In common environments job chains with up to 30 job nodes are used.
        • You can operate a job chain with a high number of e.g. 4000 job nodes. However, this has a slight effect on performance, as JobScheduler has to check predecessor and successor job nodes.
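
To illustrate the re-use of jobs, the following is a minimal sketch of a job chain definition with three job nodes that reference existing jobs. The job names (import_files, transform_data, export_files) and the state values are made-up examples; any number of other job chains could reference the same jobs in a different sequence.

    <job_chain>
        <!-- each node references a job; the same jobs could be re-used by any number of other job chains -->
        <job_chain_node  state="100"  job="import_files"    next_state="200"      error_state="error"/>
        <job_chain_node  state="200"  job="transform_data"  next_state="300"      error_state="error"/>
        <job_chain_node  state="300"  job="export_files"    next_state="success"  error_state="error"/>
        <!-- nodes without a job attribute denote the end states of the chain -->
        <job_chain_node  state="success"/>
        <job_chain_node  state="error"/>
    </job_chain>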

Resource Consumption

Consider what resources are consumed when operating jobs:

  • Resources used per job
    • Depending on the nature of your job, it will consume
      • an instance of a JVM
      • memory and CPU as required by your job implementation
  • Total number of running jobs
    • This number has less impact on JobScheduler itself than you might expect, but it does affect the available resources such as memory and CPU.

Parallelism

  • The degree of parallelism, i.e. the number of orders and tasks that are executed at the same time, determines the peak resource consumption. The measures described below can be used to control parallelism.

Measures for Performance Optimization

Parallelism

Parallel Orders

  • When using <job_chain max_orders="number"> you restrict the number of orders that are processed in parallel in the job chain to the given number.
  • By default this attribute is not set, which allows an unlimited number of parallel orders in a job chain.
  • When using <job_chain max_orders="1"> this results in strict serialization of orders: the next order will enter the job chain only after the previous order has completed.
  • Consider whether your business requirements really call for orders to be serialized. Better performance is achieved by enabling orders to be processed in parallel (see the sketch below).
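
As a sketch of the above, the following job chain restricts parallel processing to 10 orders. The value 10 and the job name process_order are examples only and would have to be chosen according to the resources available in your environment.

    <job_chain  max_orders="10">
        <!-- at most 10 orders are processed in this job chain at the same time; further orders wait -->
        <job_chain_node  state="100"  job="process_order"  next_state="success"  error_state="error"/>
        <job_chain_node  state="success"/>
        <job_chain_node  state="error"/>
    </job_chain>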

Parallel Tasks

  • When using <job tasks="number"> up to number tasks of the job can run in parallel. By default a job runs a single task, i.e. orders arriving at the same job node are processed one after the other (see the sketch below).
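
As a sketch, assuming a simple shell-based order job (the value 5 is an example), the tasks attribute could be used like this:

    <job  order="yes"  tasks="5">
        <!-- up to 5 tasks of this order-controlled job can run in parallel -->
        <script  language="shell"><![CDATA[
            echo "processing the current order"
        ]]></script>
    </job>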

Process Classes

Pre-loading of tasks

  • min_tasks: when using <job min_tasks="number"> JobScheduler keeps at least the given number of tasks running, i.e. tasks are pre-loaded and do not have to be started when an order arrives.
  • idle_timeout: when using <job idle_timeout="duration"> an idle task waits for the given duration for the next order before it terminates. A higher value allows tasks, including their JVM, to be re-used for subsequent orders instead of starting a new task per order (see the sketch below).
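
The following sketch combines these attributes for an API job. The Java class name and the values are examples only; with this configuration at least 2 tasks are kept pre-loaded, up to 5 tasks can run in parallel, and an idle task waits 600 seconds for a next order before it ends.

    <job  order="yes"  tasks="5"  min_tasks="2"  idle_timeout="600">
        <!-- at least 2 tasks are kept running (pre-loaded), up to 5 tasks can run in parallel -->
        <!-- an idle task waits 600 seconds for the next order before it terminates -->
        <script  language="java"  java_class="com.example.MyApiJob"/>
    </job>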
