What options are there for scaling KCM?
Kofax Communications Manager (KCM) consists of various components. When the underlying hardware is scaled up, KCM can take advantage of the extra processing power if you install extra instances of certain KCM components and configure them to allow parallel processing.
This whitepaper will describe 3 ways in which this is supported. For each of these we will describe how to enable and configure them, as well as the advantages and limitations of each. Some of these options require a 3rd party component or a change in the applications that drive KCM.
There is no particular order in which these 3 options have to be considered, as each has its own merits and weaknesses. They are however closely related to the architecture of KCM, of which we have depicted a simplified model below:
Looking at the picture above, from the bottom up, scaling can be applied on the Document Processor (DP) level, on the instance level, and on the contract manager level. We will describe each of these levels below, from the bottom up, as with each increased level additional KCM configuration will need to be considered.
Please note that this paper addresses scaling for performance only. It does not address replication of components to increase the robustness of a KCM system.
Document Processor Scaling
At the lowest level, additional Document Processors can be added to a KCM instance to take maximum advantage of the processing resources of a single server. This is often referred to as vertical scaling.
The advantage of adding additional Document Processors is that it is easy to implement. In KCM you can specify the desired number of Document Processors during installation, or you can use the Core Administrator application after installation to adjust the number of Document Processors of a KCM instance.
The main downside of this form of scaling is that it is bound by the physical processing resources of the server. In general it makes sense to have as many Document Processors as the server has processing cores. Beyond that, adding more Document Processors will not increase performance, and may actually decrease it as the Document Processors start to compete for the limited resources of the server.
By default, the KCM package installer will install a number of Document Processors that depends on your KCM license. Please consider changing this number during or after installation. If you have a capacity-based license, the maximum number of Document Processors that you can run simultaneously is bound by your license.
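The sizing rule above (roughly one Document Processor per processing core, capped by a capacity-based license where one applies) can be sketched as a small calculation. This is an illustration only: the function name `recommended_dp_count` and the `license_cap` parameter are assumptions made for this example and are not part of KCM. The resulting number would still have to be entered during installation or in the Core Administrator application.

```python
import os
from typing import Optional


def recommended_dp_count(cpu_cores: int, license_cap: Optional[int] = None) -> int:
    """Suggest a Document Processor count: one per core, capped by the license.

    Both parameters are hypothetical inputs for this sketch; KCM does not
    expose a function like this.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    count = cpu_cores
    if license_cap is not None:
        if license_cap < 1:
            raise ValueError("license_cap must be at least 1")
        # A capacity-based license bounds the number of simultaneous DPs.
        count = min(count, license_cap)
    return count


if __name__ == "__main__":
    # os.cpu_count() reports logical cores, which may exceed physical cores.
    cores = os.cpu_count() or 1
    print(recommended_dp_count(cores))
```

Treat the result as a starting point and confirm it by measuring throughput on the actual server.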
Instance Scaling
Once the possibilities of vertical scaling have been exhausted, horizontal scaling should be considered. In KCM this is possible by installing additional instances on additional servers and registering these at an existing contract manager. The contract manager and the instances will need to be configured in such a way that all instances are functionally equivalent, while the client application will need to be reconfigured so that it divides the work over this farm of instances. We will explain this below.
Realizing this form of performance scaling requires a number of steps:
- Each additional instance will need to be installed on its own server using a KCM predefined setup that excludes the contract manager.
- Each additional instance will need to be functionally equivalent to the existing instances in the processing farm. As each instance in KCM has its own repository database this means that the content of all repository databases in the farm must be synchronized. Also, if these instances depend on custom core scripting, this will need to be synchronized as well. Please see the section on instance synchronization below.
- Each instance will need to be registered at the common contract manager. The contract of each additional instance must be configured to support the same interfaces as the other instances in the instance farm. As a result of this, the contract manager will manage a collection of contracts, C1 to Cn that all offer exactly the same functionality.
- The client application will need to be configured to submit requests via one of these contracts, in such a way that work is distributed as evenly as possible.
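The last step, distributing work evenly over the contracts C1 to Cn, is left to the client application. A minimal sketch of one possible approach, a thread-safe round-robin rotation, is shown below. The class name `ContractRotation` and the contract names are assumptions for this example. How a request is actually submitted via a contract depends on the KCM interface the client uses and is not shown.

```python
import itertools
import threading
from typing import Iterable


class ContractRotation:
    """Hand out contract names in round-robin order (illustrative only)."""

    def __init__(self, contracts: Iterable[str]) -> None:
        names = list(contracts)
        if not names:
            raise ValueError("at least one contract is required")
        self._cycle = itertools.cycle(names)
        # The lock keeps the rotation consistent when several client
        # threads ask for a contract at the same time.
        self._lock = threading.Lock()

    def next_contract(self) -> str:
        with self._lock:
            return next(self._cycle)


# Hypothetical contract names, following the C1..Cn naming used above.
rotation = ContractRotation(["C1", "C2", "C3"])
assigned = [rotation.next_contract() for _ in range(7)]
print(assigned)
```

Round-robin spreads requests evenly only when jobs are of similar size. A client with very uneven jobs might instead track the number of outstanding jobs per contract and pick the least loaded one.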
Compared to the alternative solution that we will describe below, this solution has the following advantages:
- Only 1 contract manager needs to be configured, and this forms the single point of entry for KCM.
- No additional (3rd party) load balancer component is needed to distribute the work over all instances.
The main disadvantage of this scaling method, however, is that the client application needs to be aware of the available contracts in the farm and has to take care of distributing the work over them. Another potential disadvantage is that the single contract manager may become a bottleneck.
The next section will describe an alternative horizontal scaling solution that does not have these disadvantages.
Contract Manager Scaling
As an alternative to scaling up the number of instances that a single contract manager handles, it is possible to create a processing farm of contract managers, each managing a single instance. This requires an additional (3rd party) component for dividing the work over all contract managers.
Setting this up requires the following:
- Each additional contract manager will need to be installed on its own server, together with a single instance. This means that each additional server basically is set up as a complete KCM installation using default installation options.
- Each instance will need to be functionally equivalent to the other ones in the farm. This requires the same form of synchronization as described in the section on Instance Scaling. The only difference is that the instances are not registered at a single common contract manager, but that each instance is registered at its own contract manager, which resides on the same server.
- Each contract manager in the farm will need to have the same contract configured for its instance.
- An additional (3rd party) load balancer component will need to be configured that divides requests over the available contract managers. This load balancer needs to be session-aware: when the load balancer has started a session on a particular contract manager, it needs to ensure that this session is completed at that contract manager.
- As each contract manager uses its own application key for controlling access, either the load balancer must be able to provide each with its own key, or this KCM feature must be turned off. In the latter case one must take care to restrict access to the contract managers by other means so that only the load balancer can access these.
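The two load balancer requirements above, session awareness and a separate application key per contract manager, can be illustrated with a small routing sketch. It is not a real load balancer and does not use any KCM API. The class `StickyRouter`, the host names, and the key values are all assumptions for this example. In practice this behavior would be configured in the chosen 3rd party product.

```python
from typing import Dict, Tuple


class StickyRouter:
    """Session-aware round-robin over contract managers (illustrative only)."""

    def __init__(self, app_keys: Dict[str, str]) -> None:
        if not app_keys:
            raise ValueError("at least one contract manager is required")
        # Hypothetical mapping: contract manager address -> its application key.
        self._app_keys = dict(app_keys)
        self._backends = list(self._app_keys)
        self._sessions: Dict[str, str] = {}
        self._next = 0

    def route(self, session_id: str) -> Tuple[str, str]:
        """Return (contract manager, application key) for a session."""
        backend = self._sessions.get(session_id)
        if backend is None:
            # New session: pick the next contract manager in rotation
            # and remember the choice until the session ends.
            backend = self._backends[self._next % len(self._backends)]
            self._next += 1
            self._sessions[session_id] = backend
        return backend, self._app_keys[backend]

    def end_session(self, session_id: str) -> None:
        """Forget a completed session so its entry does not accumulate."""
        self._sessions.pop(session_id, None)
```

A usage example: after `router = StickyRouter({"cm-a": "key-a", "cm-b": "key-b"})`, every call to `router.route("s1")` returns the same contract manager until `router.end_session("s1")` is called, while a new session id is assigned to the next contract manager in the rotation.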
Compared to the Instance Scaling solution, this one has the advantage that the client application does not need to be aware of the farm. It can send all requests to the load balancer, which will take care of distributing the load. In addition, this solution prevents a single contract manager from becoming a bottleneck.
The main disadvantage is that this solution not only requires synchronization of the instances, but also of all the contract managers involved, and that a (3rd party) load balancer will need to be deployed and configured. As there are now more entrances to KCM, securing access to KCM may also be more involved.
Instance Synchronization
Synchronizing instances involves the synchronization of repository content, and possibly also the synchronization of bespoke core scripting functionality. We will discuss both below.
For full details on how to synchronize two instances, refer to this knowledge base article.
Repository Database Synchronization
KCM features Core Scripting commands that allow repository content to be exported and imported in a controlled manner. Using these commands it is possible to keep a farm of KCM instances synchronized in an automated way.
We recommend the following approach to update the content on a running KCM system:
- Separate releases of the content will be stored in the KCM repository as separate projects that are named in a unique manner that identifies their release. A simple way would be to make sure that each project name is extended with its release number.
- The client application requests jobs for a specific project as usual. Following the above release-based naming scheme, the client application will request jobs for a specific release. We advise designing the client application in such a way that it is easy to switch from one release to another. After switching, the client application must use the new release for new jobs, while completing existing jobs with the old release.
- When upgrading the instance farm to a new release of the content, all instances will import the new project for that release, next to their existing projects. This can be done while the system is generating output based on earlier releases. It is important that the new release does not replace any existing project as this may result in a temporary inconsistent state for running jobs.
- Once the content of all instances has been upgraded, the client application can be switched to address the newly imported release for new jobs.
- After all jobs that were still based on the old release have ended, the old release projects can be removed. This can be delayed for some time if it is deemed useful to be able to switch back to the old release when the need arises.
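The client-side part of this approach (release-based project names, new jobs on the new release, running jobs completing on the old one) can be sketched as follows. All names here are assumptions for illustration: the `<base>_<release>` naming format, the `ReleaseSwitch` class, and the job ids are hypothetical, and the sketch does not call any KCM interface.

```python
from typing import Dict, Set


def project_name(base: str, release: str) -> str:
    """Build a release-specific project name, e.g. 'Letters_R2'.

    The '<base>_<release>' format is an assumed convention for this sketch.
    """
    return f"{base}_{release}"


class ReleaseSwitch:
    """Pin each job to the release that was current when the job started."""

    def __init__(self, base: str, release: str) -> None:
        self._base = base
        self._current = release
        self._jobs: Dict[str, str] = {}  # job id -> pinned project name

    def switch_to(self, release: str) -> None:
        # Call this only after every instance has imported the new project.
        self._current = release

    def start_job(self, job_id: str) -> str:
        project = project_name(self._base, self._current)
        self._jobs[job_id] = project
        return project

    def project_for(self, job_id: str) -> str:
        # Running jobs keep using the project they started with.
        return self._jobs[job_id]

    def finish_job(self, job_id: str) -> None:
        del self._jobs[job_id]

    def projects_in_use(self) -> Set[str]:
        # An old release project can be removed once it no longer appears here.
        return set(self._jobs.values())
```

For example, a job started before `switch_to("R2")` continues to resolve to `Letters_R1` through `project_for`, while jobs started afterwards use `Letters_R2`. Once `projects_in_use()` no longer contains `Letters_R1`, that project is a candidate for removal.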
Core Scripting Synchronization
Bespoke core scripting functionality can be synchronized using the ConfigureInstance management tool. Please note that activating new scripting functionality in an instance requires a restart of the Document Processors.
When scaling up, we advise starting with the vertical scaling solution (scaling up Document Processors) and, if that is not sufficient, choosing the more suitable of the two proposed horizontal scaling solutions.
This may raise the question of whether it makes sense to combine both horizontal scaling solutions, so that one obtains a farm of contract managers, each managing multiple instances. While this is architecturally viable, we advise against it: it does not introduce performance gains over choosing one of the horizontal scaling solutions, while it combines most of their disadvantages.
Level of Complexity
|Kofax Communications Manager||all versions||n/a||n/a||n/a|