Configuring an LLM connection

An LLM connection is the configuration that enables a system to communicate with a Large Language Model (LLM), such as those offered by OpenAI, Azure OpenAI, AWS Bedrock, and Vertex AI. An LLM connection is used to:

  • Translate natural language into analytical queries.

  • Power chat-based or voice-based interfaces.

  • Enhance user experience through intelligent, language-based interaction with data.

An LLM connection allows:

  • Sending natural language inputs (e.g., user queries) to the LLM.

  • Receiving generated outputs like MDX/SQL queries, summaries, or explanations.

  • Enabling features such as conversational analytics, natural language querying, or smart recommendations.
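As a hedged illustration of this round trip, the sketch below packages a natural-language question into an OpenAI-style chat request. The model name, system prompt, and function name are illustrative placeholders, not Kyvos internals, and no network call is made.

```python
# Illustrative only: shows how a connection might package a natural-language
# question for an OpenAI-style chat-completions endpoint. The model name and
# system prompt are assumptions for this sketch.

def build_llm_request(question: str, model: str = "gpt-4o") -> dict:
    """Build a chat-completions payload asking the model to translate a
    natural-language question into an analytical (SQL) query."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Translate the user's question into a SQL query."},
            {"role": "user", "content": question},
        ],
    }

payload = build_llm_request("Total sales by region for 2024")
```

The connection's configured URL, model, and authentication details determine where and how such a payload is actually sent.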

From Kyvos 2025.10 onwards, AWS Bedrock is available as a supported provider for LLM and embedding connections. You can create and manage Bedrock-based connections just as you do for other supported providers, such as Azure OpenAI and OpenAI.

Configuring a connection

To configure the GenAI LLM connection, perform the following steps: 

  1. From the Toolbox, click Connections.

  2. From the Actions menu, click Add Connection.

  3. Select the name of the GenAI provider from the Provider list. The system uses this provider to generate output.

  4. To create an LLM connection, enter the details described in the provider-specific tables below.

  5. After you finish configuring the settings, click Save.

OpenAI

Parameter/Field

Description

Name

A unique name that identifies your GenAI connection.

Category

Select the required category from the list.

Provider

Select the OpenAI provider from the list. The system will use this provider to generate output.

URL

The LLM Service URL of the provider-specific endpoint for generating output.

API EndPoint

Specify the endpoint to be used to generate AI-powered conversational responses.

Authentication Type

Select the Authentication Type as Authentication Key or OAuth 2.0.

  • When you select Authentication Key, enter a unique key for authenticating and authorizing requests to the provider's endpoint.
    Note: If the key is not specified, the last provided Authentication Key is used. To change it, enter a new Authentication Key.

  • When you select the OAuth 2.0 option, complete the following fields for authenticating and authorizing requests to the provider's endpoint.

    • Token URL: The token URL is needed to generate or refresh the bearer token.

    • Client ID: Provide the client ID of the User-Assigned Managed Identity for authenticating and authorizing requests to the provider's endpoint.

    • Secret ID: Provide the secret ID for authenticating and authorizing requests to the provider’s endpoint.
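For OAuth 2.0, the Token URL is typically called with a client-credentials grant to obtain the bearer token. The sketch below only builds the standard form-encoded request body (no network call); the URL and credential values are illustrative, not real endpoints.

```python
from urllib.parse import urlencode

def build_token_request(token_url: str, client_id: str, client_secret: str):
    """Build the form-encoded body for an OAuth 2.0 client-credentials
    grant, as posted to the configured Token URL to obtain a bearer token."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return token_url, body

# Illustrative values; substitute the Token URL, Client ID, and Secret ID
# configured on the connection.
url, body = build_token_request(
    "https://login.example.com/oauth2/token",
    "my-client-id", "my-secret")
```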

Model

The name of the GenAI LLM model used to generate the output.

Is Model Fine Tuned

Select one of the following:
Yes: Select this option if the model is fine-tuned.
No: Select this option if the model is not fine-tuned.

Usage

Select the usage for which this connection generates output:

  • MDX Generations: Select this option to use the connection for MDX calculations or MDX queries.

  • Natural Language Querying: Select this option to use the connection for natural language querying.

  • Natural Language Summary: Select this option to use the connection for natural language summary.

  • Metadata Generation (Beta): Select this option to use the connection for generating metadata.

Note: When you select the required usage, you can set it as the default connection for that usage. A checkbox is displayed for this purpose.
For example, if you select the Natural Language Querying option from the Usage list, the Default Connection for Natural Language Querying checkbox is displayed.

Allow Sending Data for LLM

Select Yes or No to specify whether the generated questions should include data values.

Max Data Points for Summary

Enter a value to configure the maximum number of data points involved in generating the summary.

NOTE: The default value is 5000.

Input Prompt Token Limit

Specify the maximum number of tokens allowed for a prompt in a single request for the current provider.

NOTE: The default value is 16000.

The minimum value is 0.

Output Prompt Token Limit

Specify the maximum number of tokens shared between the prompt and output, which varies by model. One token is approximately four characters for English text.

NOTE: The default value is 2048.

The minimum value is 0.
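The four-characters-per-token rule above can be turned into a rough budget check. This is only an approximation (real tokenizers vary by model), and the function names here are illustrative, not Kyvos settings.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb
    for English text mentioned above."""
    return max(1, len(text) // 4)

def fits_prompt_limit(text: str, input_prompt_token_limit: int = 16000) -> bool:
    """Check a prompt against the documented Input Prompt Token Limit
    default of 16000 tokens."""
    return estimate_tokens(text) <= input_prompt_token_limit
```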

Max Retry Count

Specify the maximum number of retries attempted to obtain a correct query.

NOTE: The default value is 2.
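A retry loop consistent with this setting might look like the sketch below, where `generate` and `is_valid` are hypothetical stand-ins for the LLM call and the query validator; this is not the product's actual implementation.

```python
def generate_with_retries(generate, is_valid, max_retry_count=2):
    """Call `generate()` once, then retry at most `max_retry_count` times
    (the documented default is 2) while the result fails validation."""
    result = generate()
    for _ in range(max_retry_count):
        if is_valid(result):
            break
        result = generate()  # ask the LLM again for a corrected query
    return result
```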

Summary Records Threshold

Specify the similarity threshold for query autocorrection.

NOTE: The default value is 0.1.

The minimum value is 2.

LLM Temperature

Specify the LLM temperature, which controls the level of randomness in the output. Lowering the temperature results in less random completions. The responses of the model become increasingly deterministic and repetitive as it approaches zero. It is recommended to adjust either the temperature or top-p, but not both simultaneously.
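The "adjust temperature or top-p, but not both" guidance can be sketched as a small validation helper. The parameter names follow the common OpenAI-style request options and are assumptions for this sketch, not Kyvos field names.

```python
def sampling_options(temperature=None, top_p=None) -> dict:
    """Build request sampling options, enforcing the recommendation to
    adjust either temperature or top_p, but not both simultaneously."""
    if temperature is not None and top_p is not None:
        raise ValueError("Adjust either temperature or top_p, not both")
    opts = {}
    if temperature is not None:
        opts["temperature"] = temperature  # lower = more deterministic
    if top_p is not None:
        opts["top_p"] = top_p              # nucleus sampling cutoff
    return opts
```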

Is Reasoning Model

Specify whether the model used is a reasoning model.

Properties

Click Properties to view or set properties.

Azure OpenAI

Parameter/Field

Description

Connection Name

A unique name that identifies your GenAI connection.

Category

Select the required category from the list.

Provider

Select the Azure OpenAI provider from the list. The system will use this provider to generate output.

URL

The LLM Service URL of the provider-specific endpoint for generating output.

API EndPoint

Specify the endpoint to be used to generate AI-powered conversational responses.

Authentication Type

You can select Authentication Key, Client ID, Bearer Token, or OAuth 2.0. Based on your selection, the following fields are displayed.

  • Authentication Key: Enter a unique key for authenticating and authorizing requests to the provider's endpoint.

  • Client ID: Provide the client ID of the User-Assigned Managed Identity for authenticating and authorizing requests to the provider's endpoint.

  • Bearer Token: Provide an access token used to securely authenticate and authorize requests to the Azure OpenAI Service.
    NOTE: When using Bearer Token authentication, connections for MDX generation are not supported.

  • OAuth 2.0: Provide the Token URL, Client ID, and Secret ID for authenticating and authorizing requests to the provider's endpoint.

Model

The name of the GenAI LLM model used to generate the output.

Is Model Fine Tuned

Select one of the following:
Yes: Select this option if the model is fine-tuned.
No: Select this option if the model is not fine-tuned.

Embedding Connection

Specify the name of the GenAI embedding provider that the system will use to generate embeddings.

Usage

Select the usage for which this connection generates output:

  • MDX Generations: Select this option to use the connection for MDX calculations or MDX queries.

  • Natural Language Querying: Select this option to use the connection for natural language querying.

  • Natural Language Summary: Select this option to use the connection for natural language summary.

  • Metadata Generation (Beta): Select this option to use the connection for generating metadata.

Note: When you select the required usage, you can set it as the default connection for that usage. A checkbox is displayed for this purpose.
For example, if you select the Natural Language Querying option from the Usage list, the Default Connection for Natural Language Querying checkbox is displayed.

Allow Sending Data for LLM

Select Yes or No to specify whether the generated questions should include data values.

Max Data Points for Summary

Enter a value to configure the maximum number of data points involved in generating the summary.

NOTE: The default value is 5000.

Input Prompt Token Limit

Specify the maximum number of tokens allowed for a prompt in a single request for the current provider.

NOTE: The default value is 16000.

The minimum value is 0.

Output Prompt Token Limit

Specify the maximum number of tokens shared between the prompt and output, which varies by model. One token is approximately four characters for English text.

NOTE: The default value is 2048.

The minimum value is 0.

Max Retry Count

Specify the maximum number of retries attempted to obtain a correct query.

NOTE: The default value is 2.

Summary Records Threshold

Specify the similarity threshold for query autocorrection.

NOTE: The default value is 0.1.

The minimum value is 2.

LLM Temperature

Specify the LLM temperature, which controls the level of randomness in the output. Lowering the temperature results in less random completions. The responses of the model become increasingly deterministic and repetitive as it approaches zero. It is recommended to adjust either the temperature or top-p, but not both simultaneously.

Top P

This property manages diversity through nucleus sampling. Setting it to 0.5 indicates that half of all likelihood-weighted options will be considered. It is recommended to adjust either this parameter or the temperature parameter, but not both simultaneously.

Frequency Penalty

This property specifies a number between -2.0 and 2.0. Positive values penalize new tokens based on their frequency in the existing text. This reduces the likelihood of the model repeating the same line verbatim.

Presence Penalty

This property specifies a number between -2.0 and 2.0. Positive values penalize new tokens based on their appearance in the text so far, thereby increasing the model's likelihood of discussing new topics.
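The documented -2.0 to 2.0 range for both penalties can be sketched as a validation helper. The parameter names follow the common OpenAI-style request options and are assumptions for this sketch.

```python
def penalty_options(frequency_penalty=0.0, presence_penalty=0.0) -> dict:
    """Validate penalty values against the documented -2.0..2.0 range.
    Positive frequency_penalty discourages verbatim repetition; positive
    presence_penalty encourages new topics."""
    for name, value in (("frequency_penalty", frequency_penalty),
                        ("presence_penalty", presence_penalty)):
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")
    return {"frequency_penalty": frequency_penalty,
            "presence_penalty": presence_penalty}
```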

Is Reasoning Model

Specify whether the model used is a reasoning model.

Properties

Click Properties to view or set properties.

AWS Bedrock

Parameter/Field

Description

Connection Name

A unique name that identifies your GenAI connection.

Category

Select the LLM category from the list.

Provider

Select the AWS Bedrock provider from the list. The system will use this provider to generate output.

Access ID

Enter the access ID for accessing AWS services.

Secret ID

Enter the secret ID for accessing AWS services.

AWS Region

Enter the AWS region where the model is hosted.

Model

The name of the GenAI LLM model used to generate the output.

Is Model Fine Tuned

Select one of the following:
Yes: Select this option if the model is fine-tuned.
No: Select this option if the model is not fine-tuned.

Usage

Select the usage for which this connection generates output:

  • MDX Generations: Select this option to use the connection for MDX calculations or MDX queries.

  • Natural Language Querying: Select this option to use the connection for natural language querying.

  • Natural Language Summary: Select this option to use the connection for natural language summary.

  • Metadata Generation (Beta): Select this option to use the connection for generating metadata.

Note: When you select the required usage, you can set it as the default connection for that usage. A checkbox is displayed for this purpose.
For example, if you select the Natural Language Querying option from the Usage list, the Default Connection for Natural Language Querying checkbox is displayed.

Allow Sending Data for LLM

Select Yes or No to specify whether the generated questions should include data values.

Max Data Points for Summary

Enter a value to configure the maximum number of data points involved in generating the summary.

NOTE: The default value is 5000.

Input Prompt Token Limit

Specify the maximum number of tokens allowed for a prompt in a single request for the current provider.

NOTE: The default value is 16000.

The minimum value is 0.

Output Prompt Token Limit

Specify the maximum number of tokens shared between the prompt and output, which varies by model. One token is approximately four characters for English text.

NOTE: The default value is 2048.

The minimum value is 0.

Max Retry Count

Specify the maximum number of retries attempted to obtain a correct query.

NOTE: The default value is 2.

Summary Records Threshold

Specify the similarity threshold for query autocorrection.

NOTE: The default value is 0.1.

The minimum value is 2.

LLM Temperature

Specify the LLM temperature, which controls the level of randomness in the output. Lowering the temperature results in less random completions. The responses of the model become increasingly deterministic and repetitive as it approaches zero. It is recommended to adjust either the temperature or top-p, but not both simultaneously.

Vertex AI

The Vertex AI connection enables integration with Google’s Vertex AI models for natural language querying, MDX generation, or summarization.

  1. Enter a unique name that identifies the connection.

  2. Select the required LLM Category.

  3. Enter the details as follows:

Parameter

Description

Authentication Type

Select one of the following Authentication Types:

  • ADC: Uses Application Default Credentials available in the environment.

  • Service Account JSON: Uses a service account JSON file.
    Note: Service Account JSON is the only option supported for on-premises clusters; both ADC and Service Account JSON are supported in GCP environments.

Service Account File Name

This property specifies the exact file name of the Service Account key file, including the .json extension (for example, my-project-sa-key.json), placed in the olapengine/bin directory.

Location

Specifies the Google Cloud region (e.g., us-central1) where Vertex AI resources are hosted and processed.

Project ID

Specifies the unique identifier for your Google Cloud Project where Vertex AI is enabled.

Candidate count

Specifies the number of independent, alternative responses (or "candidates") the model should generate for a single prompt. The default candidate count is 1.

Model

Select the model name. For example, Gemini 2.5 Pro or Gemini 3 Pro.

Thinking budget

Applies to Gemini 2.5 Pro.

Specifies the maximum number of tokens the model can use for its internal reasoning process, improving answer quality on complex queries. If the selected model is Gemini 2.5 Pro, the thinking budget must be set to a minimum of 128.
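The minimum-budget rule for Gemini 2.5 Pro can be sketched as a simple check; the function and the model-name matching below are illustrative assumptions, not product code.

```python
def validate_thinking_budget(model: str, budget: int) -> int:
    """Enforce the documented constraint: Gemini 2.5 Pro requires a
    thinking budget of at least 128 tokens."""
    if model.lower().startswith("gemini-2.5") and budget < 128:
        raise ValueError("Thinking budget must be at least 128 for Gemini 2.5 Pro")
    return budget
```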

Thinking Level

Applies to Gemini 3 Pro.

This property controls the depth and complexity of the model's internal reasoning, ranging from a lower value for simple, quick answers to a higher value for more complex problems requiring deeper thought and logical steps. Increasing the thinking level may improve answer quality for challenging queries but will likely increase latency and token usage.

Is model fine tuned

Select one of the following:
Yes: Select this option if the model is fine-tuned.
No: Select this option if the model is not fine-tuned.

Usage

Select the usage for which this connection generates output:

  • MDX Generations: Select this option to use the connection for MDX calculations or MDX queries.

  • Natural Language Querying: Select this option to use the connection for natural language querying.

  • Natural Language Summary: Select this option to use the connection for natural language summary.

  • Metadata Generation (Beta): Select this option to use the connection for generating metadata.

Note: When you select the required usage, you can set it as the default connection for that usage. A checkbox is displayed for this purpose.
For example, if you select the Natural Language Querying option from the Usage list, the Default Connection for Natural Language Querying checkbox is displayed.

Default Connection for Natural Language Querying

Select this checkbox to set the default connection for natural language queries.

Allow sending data to LLM

Select Yes or No to specify whether the generated questions should include data values.

Max data points for summary

Enter a value to configure the maximum number of data points involved in generating the summary.

NOTE: The default value is 5000.

Input prompt token limit

Specify the maximum number of tokens allowed for a prompt in a single request for the current provider.

NOTE: The default value is 16000.

The minimum value is 0.

Output prompt token limit

Specify the maximum number of tokens shared between the prompt and output, which varies by model. One token is approximately four characters for English text.

NOTE: The default value is 4096. The minimum value is 0.

For Gemini 3 Pro with a low thinking level, use 5120. For a high thinking level, use a value between 8,000 and 10,000, based on SQL complexity.

Max retry count

Specifies the maximum number of retries attempted to obtain a correct query.

NOTE: The default value is 2.

Summary records threshold

Specifies the similarity threshold for autocorrecting queries.

NOTE: The default value is 0.1.

LLM Temperature

Specify the LLM temperature, which controls the level of randomness in the output. Lowering the temperature results in less random completions. The responses of the model become increasingly deterministic and repetitive as it approaches zero. It is recommended to adjust either the temperature or top-p, but not both simultaneously.

Top P

Manages diversity through nucleus sampling. Setting it to 0.95 indicates that 95% of the likelihood-weighted options will be considered. It is recommended to adjust either this parameter or the temperature parameter, but not both simultaneously.

Top K

Specifies an integer parameter that limits the model's word choices at each step to a fixed number of the most probable next tokens, discarding all others.
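Top-K sampling as described above can be sketched as a filter over a next-token probability distribution; the toy probabilities here are illustrative only.

```python
def top_k_filter(token_probs: dict, k: int) -> dict:
    """Keep only the k most probable next tokens, discarding all others,
    as Top K sampling does at each generation step."""
    top = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)
```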

Frequency penalty

This property specifies a number between -2.0 and 2.0. Positive values penalize new tokens based on their frequency in the existing text. This reduces the likelihood of the model repeating the same line verbatim.

Copyright Kyvos, Inc. 2025. All rights reserved.