Configuring Kyvos Manager High Availability for YARN-based deployments (On-prem)
Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, Kyvos Single Node Installation (Kyvos SNI)
For an on-prem cluster running in YARN mode, you can configure Kyvos Manager high availability manually. This section describes the backup and restore procedures involved.
Manual backup of current Kyvos Manager instance
Creating a backup of Kyvos Manager involves backing up the following components. Each of these can be created as a separate tar file.
| Component | When to back up | Command for creating backup |
|---|---|---|
| Kyvos Manager binaries (complete folder kyvosmanager_war/kyvosmanager/) | At the time of deployment, and thereafter on each upgrade, rollback, or TLS/LDAP configuration change | tar -zcvf KM_Binaries.tar.gz <KM_Installation-Path>/kyvosmanager_war/kyvosmanager |
| Kyvos Manager data (kyvosmanagerdata folder, excluding the database folder) | On completion of each operation initiated from the Kyvos Manager portal | tar -zcvf KM_Data.tar.gz kyvosmanagerdata --exclude=server/db NOTE: Execute this command on the Kyvos Manager data path |
| Kyvos Manager database (only the database folder inside the kyvosmanagerdata folder) | Created automatically at /user/engine_work/setup/binaries. To view it, go to HDFS, navigate to /user/engine_work/setup/binaries, and check for km_db_snapshot.tar.gz | |
| JRE folder | | tar -zcvf KM_JRE.tar.gz <KM_Installation-Path>/kyvosmanager_war/jre |
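The manual tar commands in the table can be wrapped in a small script so that they are run the same way each time. The following is a minimal sketch, not Kyvos tooling: the function name `backup_km` and its three arguments (installation path, the directory that contains kyvosmanagerdata, and an output directory) are assumptions made for illustration. Unlike the commands in the table, it uses `-C` so that archive members are stored relative to the given directories, and it places `--exclude` before the folder name.

```shell
#!/bin/sh
# Sketch only: backup_km and its arguments are hypothetical names.

# backup_km <KM installation path> <directory containing kyvosmanagerdata> <output dir>
backup_km() {
    km_install_path=$1
    km_data_parent=$2
    backup_dir=$3

    mkdir -p "$backup_dir" || return 1

    # Kyvos Manager binaries (complete kyvosmanager_war/kyvosmanager folder).
    tar -zcf "$backup_dir/KM_Binaries.tar.gz" \
        -C "$km_install_path" kyvosmanager_war/kyvosmanager || return 1

    # JRE folder.
    tar -zcf "$backup_dir/KM_JRE.tar.gz" \
        -C "$km_install_path" kyvosmanager_war/jre || return 1

    # kyvosmanagerdata folder without the database folder (server/db).
    tar -zcf "$backup_dir/KM_Data.tar.gz" \
        -C "$km_data_parent" --exclude=server/db kyvosmanagerdata || return 1
}

# Example call (paths are placeholders):
# backup_km "<KM_Installation-Path>" "<parent of kyvosmanagerdata>" /tmp/km_backup
```

The database folder is left out on purpose, because its snapshot (km_db_snapshot.tar.gz) is created automatically in HDFS.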
Restoring Kyvos Manager
To manually restore the Kyvos Manager, perform the following steps.
Identify a new node for restoring Kyvos Manager (either by creating a new node or using an existing node, which should be ready in advance).
Important
If you use an existing node for restoration instead of creating a new node, no further Kyvos Manager-related failure can be tolerated, and other services running on that node will also be impacted.
This new Kyvos Manager node should have the same configuration and permissions as the current node in terms of:
User Account: New node should have the same user account and credentials as that of the current Kyvos Manager node.
Resources: Primarily disks, mount points, folders, and permission on folders (for Kyvos Manager installation path, Kyvos installation path, and QE semantic model local path if applicable).
NOTE: Take note of all these paths in advance, as they cannot be checked on the source Kyvos Manager node once it is shut down.
Access to the Hadoop cluster
Firewall/port access rules
Restore the backups in the order binaries > kyvosmanagerdata > database, to the same paths they occupied on the original node.
On this new Kyvos Manager node, start the Kyvos Manager process using the same user account under which it was running on the previous Kyvos Manager node. Then perform the following steps.
Log in to Kyvos Manager and navigate to the Settings page.
Change the Kyvos Manager server access IP/Port to the values for the new node. This ensures that the IP/Port of the new Kyvos Manager node, where the restored Kyvos Manager was started, is updated in all the components.
Check whether AppMaster needs to be restarted after this change, because the binaries/configuration pushed when AppMaster started may still contain information about the previous Kyvos Manager node.
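The node preparation and restore order in the steps above can be sketched as a shell script. This is an illustration under stated assumptions, not a Kyvos-provided tool: the function names `check_km_node`, `restore_archive`, and `restore_km` are hypothetical, the archive names KM_Binaries.tar.gz and KM_Data.tar.gz assume the backups were created with those names, and the target directories depend on how the archives were created, so they are passed in by the caller. Hadoop cluster access and firewall rules are not checked here.

```shell
#!/bin/sh
# Sketch only: function names and arguments are hypothetical, not Kyvos tooling.

# check_km_node <expected user> <dir>...
# Succeeds only if the current user matches and every directory exists
# and is writable by that user.
check_km_node() {
    expected_user=$1
    shift
    status=0
    current_user=$(id -un)
    if [ "$current_user" != "$expected_user" ]; then
        echo "user mismatch: running as $current_user, expected $expected_user" >&2
        status=1
    fi
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing or not writable: $dir" >&2
            status=1
        fi
    done
    return "$status"
}

# restore_archive <archive> <target dir>
restore_archive() {
    archive=$1
    target=$2
    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi
    mkdir -p "$target" || return 1
    # -p keeps the permissions recorded in the archive.
    tar -zxpf "$archive" -C "$target"
}

# restore_km <backup dir> <binaries target> <data target> <database target>
# Restores in the order binaries, kyvosmanagerdata, database, and stops at
# the first failure. The database snapshot may first need to be copied out
# of HDFS, for example:
#   hdfs dfs -get /user/engine_work/setup/binaries/km_db_snapshot.tar.gz "$backup_dir/"
restore_km() {
    backup_dir=$1
    restore_archive "$backup_dir/KM_Binaries.tar.gz" "$2" &&
    restore_archive "$backup_dir/KM_Data.tar.gz" "$3" &&
    restore_archive "$backup_dir/km_db_snapshot.tar.gz" "$4"
}
```

Running `check_km_node` with the original user name and the recorded paths before shutting down the source node, and again on the new node, gives a quick comparison of the two environments.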
Note
Features that depend on Kyvos Manager stop working once the current Kyvos Manager instance goes down, and they remain unavailable until Kyvos Manager is up again, either restarted on the same node or restored on a new node. The affected features include:
Services HA/failover
Postgres High Availability functionality
AppMaster Scheduled restart in YARN-based clusters
It is recommended to set up the primary Postgres instance on a separate node (it should not be on the same node as the Kyvos Manager instance).
Warning
If two instances of Kyvos Manager are using the same database or a backup of the same database, ensure that:
They are not started at the same time
No operation is performed on both at the same time.
Failure to ensure these may result in undesirable product behavior and an unpredictable state.
Before each upgrade, ensure that you have a state to roll back to (an extra backup of the previous state).
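One way to keep such an extra backup is to copy the existing backup archives into a separate, labeled folder before the upgrade, so that later backups do not overwrite them. The sketch below assumes the backups are stored as .tar.gz files in a single directory; the function name `keep_pre_upgrade_copy`, the label argument, and the `pre-upgrade-<label>` folder name are hypothetical.

```shell
#!/bin/sh
# Sketch only: keep_pre_upgrade_copy and the folder naming are hypothetical.

# keep_pre_upgrade_copy <backup dir> <label>
# Copies every .tar.gz file in <backup dir> into <backup dir>/pre-upgrade-<label>.
# Fails if that folder already exists or if there was nothing to copy.
keep_pre_upgrade_copy() {
    backup_dir=$1
    label=$2
    dest="$backup_dir/pre-upgrade-$label"
    if [ -e "$dest" ]; then
        echo "refusing to overwrite: $dest" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    copied=0
    for archive in "$backup_dir"/*.tar.gz; do
        [ -f "$archive" ] || continue
        # -p preserves timestamps and permissions of the archive files.
        cp -p "$archive" "$dest/" || return 1
        copied=$((copied + 1))
    done
    [ "$copied" -gt 0 ]
}

# Example call (directory and label are placeholders):
# keep_pre_upgrade_copy /tmp/km_backup "before-upgrade-1"
```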
Roll Back:
Kyvos Manager > Settings page > Reconfigure the IP of the latest Kyvos Manager node
Service status: Stopped
Restart AppMaster
Note
Configuring the High Availability of the High Availability node is not supported.