Introduction
- Consider the information available from the JS7 - Impact of a Controller outage article.
- For information about the behavior in case of outages see JS7 - FAQ - What happens to workflows in case of outage of a Controller?
- Use of a Controller cluster mitigates an outage situation as the standby Controller instance will pick up operations after a fail-over that takes 3-5s. The failed, previously active Controller instance can be started later on and will automatically synchronize with the currently active Controller instance.
Troubleshooting
The Controller is the component in JS7 that holds job-related configurations and orchestrates Agents. The outage of a Controller instance does not prevent the execution of workflows whose jobs all run on the same Agent. However, it does affect, for example, the execution of workflows that include jobs running on different Agents, because the Controller performs the switch between Agents during workflow execution.
Troubleshooting starts with reproducing and locating a problem in order to better understand its nature:
- As a first step check the Controller's log files controller.log and watchdog.log, see JS7 - Log Files and Locations.
  - Warnings and errors can be found from the output qualifiers WARN and ERROR in a log file.
  - Example:
    2021-10-10T09:53:04,939 WARN js7.base.session.SessionApi - HttpControllerApi(https://apmacwin:4344): HTTP 401 Unauthorized: POST https://apmacwin:4344/controller/api/session => InvalidLogin: Login: unknown user or invalid password
- Due to log rotation log files of previous days are available in a compressed .tar.gz format on a daily basis, see JS7 - Log Rotation.
  - For Unix the zcat command can be used to directly access compressed log files.
  - For Windows compressed files have to be extracted, for example by use of 7-zip.
- Consider that a Controller instance can report problems related to other components such as Agents and JOC Cockpit. In this situation it is recommended to check the component's log files.
- If warnings or error messages are not self-explanatory, users should do some research: the Product Knowledge Base and the Change Management System offer a search box, and browsers offer access to search engines.
- Having completed analysis of a problem and being certain that the problem is related to a product defect and not to resources of the IT environment:
  - customers of a commercial license should use the Support Resources including the SOS ticket system.
  - users of the open source license are invited to use Community Resources.
- Should the controller.log file not provide sufficient information to reproduce a problem then the log level should be increased, see JS7 - Log Levels and Debug Options.
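The log check described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not part of JS7: the function names parse_line and find_problems are hypothetical, the line layout (timestamp, qualifier, logger, " - ", message) is inferred only from the single example line shown above, and a compressed file is assumed to be a plain gzip stream of one log file, as the use of zcat suggests. A true tar archive would need different handling.

```python
import gzip
import re
from pathlib import Path

# Assumed layout, inferred from the example log line in this article:
# "<timestamp> <QUALIFIER> <logger> - <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+-\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of a log line as a dict, or None if the layout does not match."""
    match = LINE_PATTERN.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def find_problems(path, levels=("WARN", "ERROR")):
    """Yield parsed entries with one of the given qualifiers.

    Files ending in .gz are read through gzip (assumption: a single
    gzip-compressed log file); all other files are read as plain text.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            entry = parse_line(line)
            # Lines that do not match the layout (for example stack traces) are skipped.
            if entry and entry["level"] in levels:
                yield entry
```

Iterating over find_problems("controller.log") and printing each entry's timestamp, level and message gives a quick overview of warnings and errors. Lines that do not follow the assumed layout are silently skipped, so the original log file remains the reference for a full analysis.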
In some situations, for example if computer memory is not sufficient for the heap size of the Controller instance's Java Virtual Machine, the outage of a Controller instance can be handled by restarting the instance. However, problems indicating insufficient resources typically require better sizing of resources.
If the problem is related to server resources and operation of the Controller cannot be continued on the same server, then relocating the Controller instance can be a last resort to overcome an outage. Relocation involves copying or moving the Controller instance's JS7_CONTROLLER_DATA/state directory, which holds the instance's journal, to a Controller instance on a new server. Refer to the JS7 - How to relocate a Controller article for the steps to apply.
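The copy step itself can be illustrated with a short Python sketch. It is an assumption-laden example rather than the documented procedure: the function name copy_journal is hypothetical, both data directories are passed in by the caller, and the sketch assumes the target data directory is reachable as a local or mounted path. It does not check whether either Controller instance is running, so the referenced article remains the authority on prerequisites and the order of steps.

```python
import shutil
from pathlib import Path


def copy_journal(source_data_dir, target_data_dir):
    """Copy the state directory from one Controller data directory to another.

    Both arguments stand for the respective JS7_CONTROLLER_DATA directories
    (hypothetical parameters chosen for this sketch).
    """
    source_state = Path(source_data_dir) / "state"
    target_state = Path(target_data_dir) / "state"

    if not source_state.is_dir():
        raise FileNotFoundError(f"no state directory found at {source_state}")

    # Refuse to mix journal files of two instances in one directory.
    if target_state.exists() and any(target_state.iterdir()):
        raise FileExistsError(f"{target_state} is not empty, refusing to overwrite")

    # copytree uses shutil.copy2 by default, which tries to preserve file metadata.
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(source_state, target_state, dirs_exist_ok=True)

    # Return the copied file names so the caller can compare them with the source.
    return sorted(p.name for p in target_state.iterdir())
```

The function copies rather than moves, so the original journal stays in place until the relocated Controller instance has been verified.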