Execution Engine Configuration
Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, Kyvos Single Node Installation (Kyvos SNI)
MapReduce is the default execution engine for Hive. Kyvos also supports Spark for running queries on Hive. Configure the execution engine in this area according to your requirements.
Note
From Kyvos 2025.3.2 onwards, Kyvos supports Apache Iceberg to enhance compatibility and performance when working with external data sources on AWS EMR with Spark.
The fields shown in the following figure are displayed only if you select the Spark option.
Note
In the case of Azure (Databricks) deployment, only the Spark Version parameter is available for selection, and other fields are not displayed.
To configure execution engine properties for the cluster:
Enter details as:
| Area | Parameter/Field | Comments/Description |
|---|---|---|
| | Execution Engine Name | Select the Execution engine from the list. |
| | Deployment Mode | Select the yarn-cluster option in case your Spark deployment mode is YARN cluster; else, select the yarn-client option. |
| Node and Authentication | Spark Source Node | To use the Hive Source Node, select the Same As Name Node option. Else, select the Other Node option. |
| | Spark Node Host Name | If you selected the Other Node option above, enter your node IP here. |
| | Use different user account for accessing Spark Node | Select the checkbox to use a user account other than the Hadoop Node authentication user for accessing the Spark node. |
| Paths and Version | Spark Version | Select the Spark version from the list. |
| | Spark Home Directory | Provide Spark home directory. |
| | Spark Library Path | Enter library files path for Spark. Refer to the Appendix for details. |
| | Spark Configuration Path | Enter the configuration files path for Spark. Refer to the Appendix for details. |
| | Configure Iceberg | Click this link to configure Iceberg in Kyvos. |
| Spark Parameters | Spark Parameters | Use this to add custom Spark parameters for your cluster. |
Click Validate Spark file paths. The system validates user authentication and connectivity for the specified paths.
Note
The Validate Hive File Paths button is not displayed for the Azure (Databricks) environment.
Click the Save button at the top-right of the page to save your changes.
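For orientation, the two Deployment Mode values in the table correspond to how Spark itself is launched on YARN. In Spark 2.0 and later, the legacy master strings yarn-cluster and yarn-client are expressed as --master yarn combined with --deploy-mode cluster or --deploy-mode client. The Python sketch below shows only that mapping. The function name spark_submit_args is hypothetical and is not part of Kyvos; Kyvos Manager applies the selected mode for you.

```python
def spark_submit_args(deployment_mode: str) -> list[str]:
    """Translate a deployment mode label into standard spark-submit flags.

    Hypothetical helper for illustration; Kyvos does not expose this function.
    """
    # Both options run on YARN; they differ only in where the driver runs.
    modes = {"yarn-cluster": "cluster", "yarn-client": "client"}
    if deployment_mode not in modes:
        raise ValueError(f"unsupported deployment mode: {deployment_mode!r}")
    return ["--master", "yarn", "--deploy-mode", modes[deployment_mode]]


if __name__ == "__main__":
    print(" ".join(spark_submit_args("yarn-cluster")))
```

In cluster mode the Spark driver runs inside the YARN application master, while in client mode it runs on the node that submits the job, which is why the choice must match how your Spark installation is deployed.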
Post-Deployment Steps to Enable Iceberg in Kyvos
Kyvos has introduced support for Apache Iceberg to enhance compatibility and performance when working with external data sources on AWS EMR with Spark.
Previously, non-Iceberg semantic model tables registered in AWS Glue had limited capabilities, especially for scalable analytics and advanced metadata management. These limitations prevented Kyvos from fully leveraging features such as time travel, schema evolution, and efficient snapshot isolation, which are critical for modern data operations. To address this, existing datasets must be converted to the Iceberg table format. Iceberg support modifies the table structure to ensure compatibility with Iceberg standards. Once enabled, Kyvos can manage and query large-scale data more effectively, with improved consistency and performance.
This strategic integration allows Kyvos to leverage Spark’s high-performance query engine alongside Iceberg’s robust features, delivering better query performance, data governance, and scalability for enterprise-scale analytics.
Prerequisites to enable Iceberg in Kyvos
Perform the following steps before you enable Iceberg in Kyvos Manager.
After completing the deployment, clone EMR 6.15.0 and add the following configuration in the EMR settings:
{ "Classification": "iceberg-defaults", "Properties": { "iceberg.enabled": "true" } }
Then, synchronize AWS (EMR) via Kyvos Manager to apply the updated EMR version and configuration.
Download the required JAR files from an EMR 6.15.0 cluster to your local machine:
/usr/share/aws/iceberg/lib/iceberg-spark-runtime-3.4_2.12-1.4.0-amzn-0.jar
/usr/share/aws/aws-java-sdk-v2/aws-sdk-java-bundle-2.20.160-SNAPSHOT.jar
NOTE: These JAR files will be required when enabling Iceberg through Kyvos Manager.
Create an S3 location. In your AWS S3 bucket, create a folder to serve as the Iceberg warehouse location.
Example: s3://kyvos-output-769691/iceberg_location/
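The prerequisites above can be sanity-checked locally before you open Kyvos Manager. The following Python sketch is an assumption-laden illustration, not a Kyvos tool: the helper names iceberg_emr_configuration, split_s3_uri, and missing_jars are hypothetical. It wraps the iceberg-defaults classification in a list (the shape the EMR configurations JSON generally takes; adjust it if your console expects a single object), checks that the warehouse location is a well-formed S3 URI, and reports which of the two JAR files are missing from a local download folder.

```python
import json
from pathlib import Path
from urllib.parse import urlparse

# JAR file names exactly as listed in the prerequisites above.
REQUIRED_JARS = (
    "iceberg-spark-runtime-3.4_2.12-1.4.0-amzn-0.jar",
    "aws-sdk-java-bundle-2.20.160-SNAPSHOT.jar",
)


def iceberg_emr_configuration() -> list[dict]:
    """Return the iceberg-defaults classification as a configurations list."""
    return [
        {
            "Classification": "iceberg-defaults",
            "Properties": {"iceberg.enabled": "true"},
        }
    ]


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split s3://bucket/prefix/ into (bucket, prefix); reject anything else."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def missing_jars(download_dir: str) -> list[str]:
    """List the required JAR names that are not present in download_dir."""
    folder = Path(download_dir)
    return [name for name in REQUIRED_JARS if not (folder / name).is_file()]


if __name__ == "__main__":
    print(json.dumps(iceberg_emr_configuration(), indent=2))
    print(split_s3_uri("s3://kyvos-output-769691/iceberg_location/"))
    print(missing_jars("."))
```

An empty list from missing_jars means both files are ready to upload in the Configure Iceberg dialog box.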
Configuring Iceberg Support through Kyvos Manager
To configure Iceberg support, perform the following steps.
In the Paths and Version section, click the Configure Iceberg link. The Configure Iceberg dialog box is displayed.
Select the Enable Iceberg checkbox. The fields in the Configure Iceberg dialog box become editable.
To set the Warehouse Location, specify the S3 path created in the prerequisites. For example, s3://kyvos-output-769691/iceberg_location/
Provide a Catalog Name for the Iceberg catalog. For example, Catalog Name = iceberg_catalog
Upload the aws-sdk-java-bundle-2.20.160-SNAPSHOT.jar file.
Upload the Iceberg JAR file, iceberg-spark-runtime-3.4_2.12-1.4.0-amzn-0.jar.
Click Save to apply the configuration. This enables Iceberg in Kyvos.
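The Warehouse Location and Catalog Name values map naturally onto the catalog settings that Apache Iceberg documents for Spark. As an illustration only, the Python sketch below builds those properties for a Glue-backed catalog on S3. It assumes a Glue catalog because the tables discussed above are registered in AWS Glue; Kyvos Manager may apply different or additional properties internally. The function name iceberg_spark_properties is hypothetical, while the property keys and class names are the ones published in the Apache Iceberg Spark and AWS integration documentation.

```python
def iceberg_spark_properties(catalog_name: str, warehouse: str) -> dict[str, str]:
    """Build Spark properties for an Iceberg catalog backed by AWS Glue.

    Hypothetical helper for illustration; not the Kyvos implementation.
    """
    if not catalog_name:
        raise ValueError("catalog name must not be empty")
    if not warehouse.startswith("s3://"):
        raise ValueError(f"warehouse must be an S3 URI: {warehouse!r}")

    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Enables Iceberg SQL extensions such as CALL procedures.
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the named catalog as an Iceberg Spark catalog.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Assumption: table metadata is tracked in the AWS Glue Data Catalog.
        f"{prefix}.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
        # Reads and writes table files directly on S3.
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        f"{prefix}.warehouse": warehouse,
    }


if __name__ == "__main__":
    props = iceberg_spark_properties(
        "iceberg_catalog", "s3://kyvos-output-769691/iceberg_location/"
    )
    for key, value in props.items():
        print(f"--conf {key}={value}")
```

Properties of this kind could also be supplied through the Spark Parameters field if you need to override a default, though whether that is necessary depends on your deployment.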