Best Practices for Working with Kyvos

Working with Kyvos, a cloud-native semantic layer for AI and BI, involves both technical and strategic best practices. The practices below are grouped by area and aim at optimal performance, scalability, and usability:

Understanding the Use Case: A clear understanding of the use case is essential for designing optimal solutions within the Kyvos Semantic Model.
Dataset Preparation: Identify and compile a list of datasets (tables/views), along with the corresponding dimensions, attributes, and measures derived from these datasets.
Dataset Classification: Classify datasets as Dimensions if they only contain dimension/attribute fields, and as Facts if they include measure fields or both dimensions and measures.
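The classification rule above can be sketched in a few lines of Python. This is only an illustration of the rule, not part of Kyvos: the `Dataset` class, the `classify` function, and the sample dataset and field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Hypothetical inventory entry for one source table or view."""
    name: str
    dimension_fields: List[str] = field(default_factory=list)
    measure_fields: List[str] = field(default_factory=list)


def classify(dataset: Dataset) -> str:
    """Return 'Fact' if the dataset has any measure field, else 'Dimension'."""
    return "Fact" if dataset.measure_fields else "Dimension"


# Sample inventory (names are made up for illustration).
datasets = [
    Dataset("customer", dimension_fields=["customer_id", "region"]),
    Dataset("sales",
            dimension_fields=["customer_id", "order_date"],
            measure_fields=["amount"]),
]
classification = {d.name: classify(d) for d in datasets}
```

Keeping such an inventory alongside the design makes it easy to review which datasets will act as facts before any relationships are defined.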
Data Modeling Best Practices
1. Star Schema Preference: Design your data in a star schema (fact and dimension tables), as Kyvos performs best with this structure.
2. Flatten Hierarchies Where Possible: Use flattened dimension tables to reduce joins and improve semantic model build performance.
Dataset Design Best Practices
1. SQL-Based Datasets: Prefer SQL-based datasets, which offer the following benefits:
  1. They let you perform calculations at the SQL level.
  2. They let you filter data efficiently on partition columns.
2. Selective Column Inclusion: Only include the necessary columns in the SELECT clause of your SQL query to optimize performance.
3. Date Column Input Format: Ensure that the input format for date columns is clearly specified.
4. Handling Decimal Data: If exact precision in the trailing decimal places is not required, use the double data type for decimal data in the Kyvos dataset.
5. Row Selection: Fetch only the required rows from the SQL query to reduce unnecessary data retrieval.
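As a rough sketch of the dataset guidance above, the query below selects only the needed columns, performs a calculation at the SQL level, and filters rows on a partition column. The table `sales_raw`, its columns, and the choice of `order_date` as the partition column are assumptions made for illustration, and an in-memory SQLite database stands in for the real source system so that the example is runnable.

```python
import sqlite3

# In-memory SQLite database standing in for the real data source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_raw (
    order_id INTEGER, order_date TEXT, customer_id INTEGER,
    quantity INTEGER, unit_price REAL, internal_notes TEXT
);
INSERT INTO sales_raw VALUES
    (1, '2023-12-30', 10, 2, 5.0, 'n/a'),
    (2, '2024-01-05', 11, 1, 20.0, 'n/a'),
    (3, '2024-02-10', 10, 3, 4.0, 'n/a');
""")

# Hypothetical dataset query: no SELECT *, one SQL-level calculation,
# and a row filter on the (assumed) partition column order_date.
DATASET_SQL = """
SELECT order_date,
       customer_id,
       quantity * unit_price AS revenue
FROM sales_raw
WHERE order_date >= '2024-01-01'
ORDER BY order_date
"""

rows = conn.execute(DATASET_SQL).fetchall()
conn.close()
```

The unused `order_id` and `internal_notes` columns never leave the source, and the 2023 row is filtered out before it reaches the dataset.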
Relationship Design Best Practices
1. Single-Direction Relationships: Define each relationship in a single direction only, for example from Fact to Dimension or from Dimension to Fact.
2. Fact-to-Fact Joins: Fact-to-Fact joins are not directly supported, so avoid creating relationships between fact tables.
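A design review can check both relationship rules mechanically before a model is built. The sketch below is a hypothetical helper, not a Kyvos feature: `check_relationships`, the dataset names, and the representation of a relationship as a `(source, target)` pair are all assumptions.

```python
def check_relationships(relationships, dataset_types):
    """Return a list of human-readable problems found in the relationships.

    relationships: list of (source, target) dataset-name pairs.
    dataset_types: dict mapping dataset name to 'Fact' or 'Dimension'.
    """
    problems = []
    seen = set()
    for src, dst in relationships:
        # Rule: avoid relationships between two fact datasets.
        if dataset_types[src] == "Fact" and dataset_types[dst] == "Fact":
            problems.append(f"fact-to-fact join: {src} -> {dst}")
        # Rule: a pair of datasets should be related in one direction only.
        if (dst, src) in seen:
            problems.append(
                f"relationship defined in both directions: {src} <-> {dst}"
            )
        seen.add((src, dst))
    return problems


dataset_types = {"sales": "Fact", "returns": "Fact", "customer": "Dimension"}
relationships = [
    ("sales", "customer"),
    ("customer", "sales"),   # reverse of the first relationship
    ("sales", "returns"),    # joins two facts
]
problems = check_relationships(relationships, dataset_types)
```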
Semantic Model Best Practices
1. Start Small, Scale Gradually: Begin with fewer dimensions and measures, and scale as needed. This keeps builds fast and manageable and makes issues quicker to debug.
2. Partitioning: Use intelligent partitioning (like time-based) on large fact tables for faster processing and querying.
  1. Identify and choose the right fields for partitioning the semantic model. Proper selection of partition fields helps filter data during queries, thereby enhancing query performance.
  2. Choosing the correct partition field also ensures that only the relevant data is processed during incremental builds when replace partition is set to “Auto”.
  3. The partition on the model should match the partition at the source.
  4. For any no-Spark build, we recommend using the same partition strategy for both the fact table and the semantic model. If the strategies differ, you may face issues such as:
    - A high number of cuboids
    - Jobs taking longer during the build to copy these small files
    - Jobs failing due to timeouts
3. Include Only Required Fields: Add only the necessary fields to the semantic model to ensure efficiency and avoid unnecessary complexity.
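To see why a time-based partition field helps incremental builds, consider the conceptual sketch below. It is not Kyvos's implementation. It only shows that when rows are partitioned by month, a batch of new or changed rows touches a small, easily identified set of partitions. The month-level granularity, the `order_date` field, and both function names are assumptions.

```python
from datetime import date


def partition_key(d: date) -> str:
    """Map a date to a month-level partition key such as '2024-03'."""
    return f"{d.year:04d}-{d.month:02d}"


def partitions_touched(new_rows):
    """Return the sorted, de-duplicated partition keys hit by the new rows."""
    return sorted({partition_key(row["order_date"]) for row in new_rows})


# Hypothetical batch of newly arrived fact rows.
new_rows = [
    {"order_date": date(2024, 3, 28), "amount": 40.0},
    {"order_date": date(2024, 3, 30), "amount": 15.5},
    {"order_date": date(2024, 4, 1), "amount": 22.0},
]
touched = partitions_touched(new_rows)
```

Only the two months present in the batch need attention, and all earlier partitions can be left as they are. A partition field unrelated to how data arrives would spread the same batch across many partitions.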
Best Practices for Performance Optimization
1. Build Scheduling: Schedule semantic model builds during off-peak hours or in a staggered manner to reduce resource contention.
2. Incremental Builds: Use incremental builds to update only new or changed data, which saves time and resources.
3. Caching Strategy: Tune caching based on usage patterns. Use warm-up queries post-build to populate the cache for faster user response.
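A staggered build schedule can be planned with simple date arithmetic. The sketch below only computes start times. How the builds are actually triggered depends on your scheduler and is not shown. The model names, the 01:00 start, and the 45-minute gap are hypothetical values.

```python
from datetime import datetime, timedelta


def staggered_schedule(models, first_start, gap):
    """Assign each model a start time, spaced `gap` apart, in list order."""
    return {model: first_start + i * gap for i, model in enumerate(models)}


schedule = staggered_schedule(
    ["sales_model", "inventory_model", "finance_model"],
    first_start=datetime(2024, 1, 1, 1, 0),   # assumed off-peak window
    gap=timedelta(minutes=45),
)

for model, start in schedule.items():
    print(f"{model}: {start:%H:%M}")
```

Listing the largest or most critical model first gives it the most headroom inside the off-peak window.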
Deployment Best Practices
1. Scaling: Use Kyvos’ elastic architecture to dynamically scale based on workload demand in cloud environments.
2. Security Compliance: Leverage Kyvos’ support for cloud IAM (like AWS IAM, Azure AD) for secure access management.
Security and Governance Best Practices
1. Role-Based Access Control (RBAC): Assign data access and visibility rules based on user roles.
2. Data Masking and Row-Level Security: Use these features to ensure sensitive information is protected.
3. Audit Logs: Enable auditing to track user activity and system performance.
Integration with BI Tools Best Practices
1. Optimize SQL Queries: Kyvos translates BI tool queries into optimized semantic model queries. Understand how your tool (e.g., Tableau, Power BI, Excel) generates SQL.
2. Live Connections: For real-time insights, live connections to the Kyvos SQL interface are preferred over data extraction from BI tools.
3. Tool-Specific Tuning: Tune the dashboard and report design (e.g., filters and visuals) to minimize query complexity.
Monitoring and Maintenance Best Practices
1. Use Kyvos Monitoring Tools: Monitor job status, semantic model usage, and query performance through the Kyvos Manager and web portal.
2. Log Rotation and Housekeeping: Regularly clean up old logs and monitor storage usage to prevent bloat.
3. Version Upgrades: Stay current with Kyvos releases to leverage performance improvements and new features.
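For the log housekeeping item above, a small script can remove log files older than a retention period. This is a generic sketch, not a Kyvos utility: the `*.log` file pattern, the 30-day retention, and the function name are assumptions, and the demo runs against a temporary directory rather than a real log location.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def remove_old_logs(log_dir: Path, max_age_days: int,
                    now: Optional[float] = None) -> List[str]:
    """Delete *.log files last modified more than max_age_days ago.

    Returns the names of the deleted files.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for path in sorted(log_dir.glob("*.log")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


# Demo in a throwaway directory: one log aged 40 days, one fresh log.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    old_log = log_dir / "build_old.log"
    new_log = log_dir / "build_new.log"
    old_log.write_text("old")
    new_log.write_text("new")
    forty_days_ago = time.time() - 40 * 86400
    os.utime(old_log, (forty_days_ago, forty_days_ago))

    removed = remove_old_logs(log_dir, max_age_days=30)
    remaining = sorted(p.name for p in log_dir.iterdir())
```

Before running anything like this against a real installation, confirm which directories hold logs that are safe to delete and whether older logs must be archived for audit purposes.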
Collaboration and Documentation
1. Document Semantic Model Definitions: Maintain thorough documentation for all dimensions, hierarchies, and measures.
2. Team Collaboration: Use Git or other version control tools to collaborate on semantic model design and configuration in development environments.

Best Practices for Handling Calculations

Calculated Field Type	Calculation Type	Pros	Cons
Attribute	SQL	Pre-calculated, so it gives better performance.	You need to process the semantic model each time you change the calculation.
Attribute	Tableau	Can be changed at runtime; there is no need to process the semantic model.	Calculated at runtime, so it can impact performance.
Measure	SQL	Pre-calculated, so it gives better performance.	You need to process the semantic model each time you change the calculation.
Measure	MDX	Can be changed at runtime; there is no need to process the semantic model. Supports complex calculations, such as time series calculations.	Calculated at runtime, so it can impact performance.
Measure	Tableau	Can be changed at runtime; there is no need to process the semantic model. Suitable for users who are more familiar with Tableau than with MDX calculations.	Calculated at runtime, so it can impact performance.