diff --git a/docs/pages/product/deployment/cloud.mdx b/docs/pages/product/deployment/cloud.mdx
index 3c602c79f36ba..3cc53cbb87f99 100644
--- a/docs/pages/product/deployment/cloud.mdx
+++ b/docs/pages/product/deployment/cloud.mdx
@@ -35,6 +35,7 @@ In Cube Cloud, you can:
  API endpoints for the source code in the main branch, any other branch, or any user-specific [development mode][ref-dev-mode] branch.
* Assign a [custom domain][ref-domains] to API endpoints of any deployment.
+* Review [performance insights][ref-performance] and fine-tune deployments for better [scalability][ref-scalability].
* Set up account-wide [budgets][ref-budgets] to control resource consumption and use [auto-suspension][ref-auto-sus] to reduce resource consumption of non-production deployments.
@@ -51,4 +52,6 @@ In Cube Cloud, you can:
[ref-dev-mode]: /product/workspace/dev-mode
[ref-domains]: /product/deployment/cloud/custom-domains
[ref-auto-sus]: /product/deployment/cloud/auto-suspension
-[ref-budgets]: /product/workspace/budgets
\ No newline at end of file
+[ref-budgets]: /product/workspace/budgets
+[ref-performance]: /product/workspace/performance
+[ref-scalability]: /product/deployment/cloud/scalability
\ No newline at end of file
diff --git a/docs/pages/product/deployment/cloud/_meta.js b/docs/pages/product/deployment/cloud/_meta.js
index 2baae2b3ca3c8..c3cb7295bdc1b 100644
--- a/docs/pages/product/deployment/cloud/_meta.js
+++ b/docs/pages/product/deployment/cloud/_meta.js
@@ -4,6 +4,7 @@ module.exports = {
  "continuous-deployment": "Continuous deployment",
  "custom-domains": "Custom domains",
  "auto-suspension": "Auto-suspension",
+  "scalability": "Scalability",
  "pricing": "Pricing",
  "support": "Support",
  "limits": "Limits"
diff --git a/docs/pages/product/deployment/cloud/deployment-types.mdx b/docs/pages/product/deployment/cloud/deployment-types.mdx
index 718998df1bada..cb071ff2fde08 100644
--- a/docs/pages/product/deployment/cloud/deployment-types.mdx
+++ b/docs/pages/product/deployment/cloud/deployment-types.mdx
@@ -64,8 +64,8 @@ Production Clusters are designed to support high-availability production
workloads. It consists of several key components, including starting with 2 Cube API instances, 1 Cube Refresh Worker and 2 Cube Store Routers - all of which run on dedicated infrastructure. The cluster can automatically scale to meet the
-needs of your workload by adding more components as necessary; check the
-[Scalability section](#scalability) below.
+needs of your workload by adding more components as necessary; check the page on
+[scalability][ref-scalability] to learn more.
## Production multi-cluster
@@ -96,59 +96,6 @@ Cube Cloud routes traffic between clusters based on
Each cluster is billed separately, and all clusters can use auto-scaling to match demand.
-## Scalability
-
-Cube Cloud also allows adding additional infrastructure to your deployment to
-increase scalability and performance beyond what is available with each
-Production Deployment.
-
-### Cube Store Worker
-
-Cube Store Workers are used to build and persist pre-aggregations. Each Worker
-has a **maximum of 150GB** of storage; [additional Cube Store
-workers][ref-limits] can be added to your deployment to both increase storage
-space and improve pre-aggregation performance. A **minimum of 2** Cube Store
-Workers is required for pre-aggregations; this can be adjusted. For a rough
-estimate, it will take approximately 2 Cube Store Workers per 4 GB of
-pre-aggregated data per day.
-
-
-
-Idle workers will automatically hibernate after 10 minutes of inactivity, and
-will not consume CCUs until they are resumed. Workers are resumed automatically
-when Cube receives a query that should be accelerated by a pre-aggregation, or
-when a scheduled refresh is triggered.
-
-
-
-To change the number of Cube Store Workers in a deployment, go to the
-deployment’s Settings screen, and open the Configuration
-tab. From this screen, you can set the number of Cube Store Workers from the
-dropdown:
-
-
-
-### Cube API Instance
-
-With a Production Deployment, 2 Cube API Instances are included. That said, it
-is very common to use more, and [additional API instances][ref-limits] can be
-added to your deployment to increase the throughput of your queries. A rough
-estimate is that 1 Cube API Instance is needed for every 5-10
-requests-per-second served. Cube API Instances can also auto-scale as needed.
-
-To change how many Cube API instances are available in the Production Cluster,
-go to the deployment’s Settings screen, and open
-the Configuration tab. From this screen, you can set the minimum and
-maximum number of Cube API instances for a deployment:
-
-
-
## Switching between deployment types
To switch a deployment's type, go to the deployment's Settings screen
@@ -161,3 +108,4 @@ and select from the available options:
[ref-conf-ref-ctx-to-app-id]: /reference/configuration/config#contexttoappid
[ref-limits]: /product/deployment/cloud/limits#resources
+[ref-scalability]: /product/deployment/cloud/scalability
\ No newline at end of file
diff --git a/docs/pages/product/deployment/cloud/scalability.mdx b/docs/pages/product/deployment/cloud/scalability.mdx
new file mode 100644
index 0000000000000..10dcfcc944295
--- /dev/null
+++ b/docs/pages/product/deployment/cloud/scalability.mdx
@@ -0,0 +1,55 @@
+# Scalability
+
+Cube Cloud allows you to add infrastructure to your deployment to increase
+scalability and performance beyond what is available with each Production
+Deployment.
+
+## Auto-scaling of API instances
+
+With a Production Cluster, 2 Cube API Instances are included. That said, it
+is very common to use more, and [additional API instances][ref-limits] can be
+added to your deployment to increase the throughput of your queries. A rough
+estimate is that 1 Cube API Instance is needed for every 5-10 requests per
+second served. Cube API Instances can also auto-scale as needed.
+
+To change how many Cube API instances are available in the Production Cluster,
+go to the deployment’s Settings screen, and open
+the Configuration tab. From this screen, you can set the minimum and
+maximum number of Cube API instances for a deployment:
+
+
+
+## Sizing Cube Store workers
+
+Cube Store Workers are used to build and persist pre-aggregations. Each Worker
+has a **maximum of 150GB** of storage; [additional Cube Store
+workers][ref-limits] can be added to your deployment to both increase storage
+space and improve pre-aggregation performance. A **minimum of 2** Cube Store
+Workers is required for pre-aggregations; this can be adjusted. As a rough
+estimate, you will need approximately 2 Cube Store Workers per 4 GB of
+pre-aggregated data per day.
+
+
+
+Idle workers will automatically hibernate after 10 minutes of inactivity, and
+will not consume CCUs until they are resumed. Workers are resumed automatically
+when Cube receives a query that should be accelerated by a pre-aggregation, or
+when a scheduled refresh is triggered.
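+
+For illustration, here is a minimal sketch of a pre-aggregation that Cube
+Store workers would build and serve, written in the JavaScript flavor of the
+data model. The cube, column, and member names are placeholders rather than a
+recommendation for your data model:
+
+```javascript
+// orders.js (an example cube; names are illustrative only)
+cube(`orders`, {
+  sql: `SELECT * FROM public.orders`,
+
+  measures: {
+    count: {
+      type: `count`,
+    },
+  },
+
+  dimensions: {
+    status: {
+      sql: `status`,
+      type: `string`,
+    },
+    createdAt: {
+      sql: `created_at`,
+      type: `time`,
+    },
+  },
+
+  preAggregations: {
+    // Built and persisted by Cube Store workers, then served to matching queries
+    ordersByStatus: {
+      measures: [CUBE.count],
+      dimensions: [CUBE.status],
+      timeDimension: CUBE.createdAt,
+      granularity: `day`,
+    },
+  },
+});
+```
+
+A query that matches this rollup (for example, daily order counts by status)
+is served from Cube Store and is also the kind of query that resumes
+hibernated workers.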
+
+
+
+To change the number of Cube Store Workers in a deployment, go to the
+deployment’s Settings screen, and open the Configuration
+tab. From this screen, you can set the number of Cube Store Workers from the
+dropdown:
+
+
+
+
+[ref-limits]: /product/deployment/cloud/limits#resources
\ No newline at end of file
diff --git a/docs/pages/product/workspace/_meta.js b/docs/pages/product/workspace/_meta.js
index 86a6674ff76f0..300c2ae31e31d 100644
--- a/docs/pages/product/workspace/_meta.js
+++ b/docs/pages/product/workspace/_meta.js
@@ -8,7 +8,7 @@ module.exports = {
  "pre-aggregations": "Pre-Aggregations",
  "performance": "Performance Insights",
  "access-control": "Access Control",
-  "sso": "Single Sign-On",
+  "sso": "Single Sign-on",
  "budgets": "Budgets",
  "preferences": "Preferences",
  "cli": "CLI"
diff --git a/docs/pages/product/workspace/performance.mdx b/docs/pages/product/workspace/performance.mdx
index 49ec87353c369..6672141032fb4 100644
--- a/docs/pages/product/workspace/performance.mdx
+++ b/docs/pages/product/workspace/performance.mdx
@@ -1,11 +1,5 @@ # Performance Insights
-
-
-This page is work-in-progress.
-
-
-
The Performance page in Cube Cloud displays charts that help analyze the performance of your deployment and fine-tune its configuration. It's recommended to review Performance Insights when the workload changes
@@ -13,8 +7,10 @@ or if you face any performance-related issues with your deployment.
-Performance Insights are available in Cube Cloud on
-[all tiers](https://cube.dev/pricing).
+Performance Insights are available in Cube Cloud on [Premium and above
+tiers](https://cube.dev/pricing). Please contact us through the in-product
+chat or check with your dedicated CSM to enable Performance Insights in
+your account.
@@ -25,60 +21,140 @@ Charts provide insights into different aspects of your deployment.
### API instances
The API instances chart shows the number of API instances
-that served queries to the deployment over time.
+that served queries to the deployment.
+
+You can use this chart to **fine-tune the
+[auto-scaling][ref-scalability-api] configuration of API instances**, e.g.,
+increase the minimum and maximum number of API instances.
+
+For example, the following chart shows a deployment with sensible auto-scaling
+limits that don't need adjusting. The deployment only needs to sustain a few
+infrequent load bursts per day, and auto-scaling to 3 API instances handles
+them just fine:
+
+
+
+The next chart shows a deployment with auto-scaling limits that definitely
+need adjustment. The load is high enough that this deployment has to use at
+least 4-6 API instances most of the time, so it would be wise to increase
+the minimum auto-scaling limit to 6 API instances:
+
+
+
+When in doubt, consider using a higher minimum auto-scaling limit: when an
+additional API instance starts, it needs some time to compile the data model
+before it can serve requests. Over-provisioning API instances with a higher
+minimum auto-scaling limit decreases the number of requests that have to
+wait for [data model compilation](#data-model-compilation).
+
+Also, you can use this chart to **fine-tune the
+[auto-suspension][ref-auto-sus] configuration**, e.g., by turning
+auto-suspension off or increasing the auto-suspension threshold.
+
+For example, the following chart shows a [Development
+Instance][ref-dev-instance] deployment that is only accessed a few times
+a day and automatically suspends after a short period of inactivity:
-{/* TODO: Add screenshot */}
+
-You can use this chart to fine-tune the auto-scaling configuration of API
-instances, e.g., increase the minimum and maximum number of API instances.
+The next chart shows a misconfigured [Production Cluster][ref-prod-cluster]
+deployment that serves requests throughout the whole day but is configured
+to auto-suspend with a very low threshold:
-Also, you can use this chart to fine-tune the auto-suspension configuration,
-e.g., by turning auto-suspension off or increasing the auto-suspension
-threshold.
+
-### Data sources
+### Cache type
-The Requests by data source chart shows the number of API
-requests that were fulfilled by using cache or querying the upstream data
-source over time. The Avg. response time by data source
-shows the difference in the response time for
-requests that hit the cache or go to the upstream data source.
+The Requests by cache type chart shows the number of API
+requests that were fulfilled by using pre-aggregations, in-memory cache,
+or no cache (i.e., by querying the upstream data source). For example, the
+following chart shows a deployment that fulfills about 50% of requests by
+using pre-aggregations:
-{/* TODO: Add screenshot (x2) */}
+
+
+The Avg. response time by data source shows the difference
+in response time for requests that hit pre-aggregations, in-memory cache,
+or no cache (i.e., the upstream data source). The next chart shows that
+pre-aggregations usually provide sub-second response times while queries to
+the data source take much longer:
+
+
You can use these charts to see if you'd like to have more queries that hit
-the cache and have lower response time. In that case, consider adding more
-pre-aggregations in Cube Store or fine-tune the existing ones.
+the cache and get lower response times. In that case, **consider adding more
+[pre-aggregations][ref-pre-aggregations] in Cube Store** or fine-tuning the
+existing ones.
### Data model compilation
The Requests by data model compilation chart shows the number of API requests that had or had not to wait for the data model
-compilation. The Wait time for data model compilation chart
+compilation. For example, the following chart shows a deployment where only
+a tiny fraction of requests require the data model to be compiled:
+
+
+
+The Wait time for data model compilation chart
shows the total time requests had to wait for the data model compilation.
+The next chart shows that at certain points in time requests had to wait
+dozens of seconds while the data model was being compiled:
-{/* TODO: Add screenshot */}
+
-You can use these charts to identify multitenancy misconfiguration,
-fine-tune the auto-suspension configuration, or consider using a
-multi-cluster deployment.
+You can use these charts to **fine-tune the [auto-suspension][ref-auto-sus]
+configuration** (e.g., turn it off or increase the threshold so that API
+instances suspend less frequently), **identify [multitenancy][ref-multitenancy]
+misconfiguration** (e.g., suboptimal bucketing via
+[`context_to_app_id`][ref-context-to-app-id]), or
+**consider using a [multi-cluster deployment][ref-multi-cluster]** to
+distribute requests to different tenants across a number of Production
+Cluster deployments.
+
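+As an illustration of bucketing, here is a minimal sketch of the
+[`context_to_app_id`][ref-context-to-app-id] option in its JavaScript flavor
+(`contextToAppId` in the `cube.js` configuration file). The `tenant_id`
+security context attribute is only an example; the idea is to bucket by a
+low-cardinality attribute so that Cube compiles one data model variant per
+tenant rather than, say, one per user:
+
+```javascript
+// cube.js (tenant_id is an example security context attribute)
+module.exports = {
+  // Coarser buckets mean fewer data model compilations and less wait time
+  contextToAppId: ({ securityContext }) =>
+    `CUBE_APP_${securityContext.tenant_id}`,
+};
+```
+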
### Cube Store
The Saturation for queries by Cube Store workers chart
-shows if Cube Store workers are overloaded with serving **queries**. High
-saturation for queries prevents Cube Store workers from fulfilling requests
-and results in wait time displayed at the Wait time for queries
-by Cube Store workers chart.
+shows if Cube Store workers are overloaded with serving **queries**.
+High saturation for queries prevents Cube Store workers from fulfilling
+requests and results in wait time displayed on the Wait time for
+queries by Cube Store workers chart.
+
+For example, the following chart shows a deployment that uses 4 Cube Store
+workers and almost never lets them reach saturation, resulting in no wait
+time for queries:
-{/* TODO: Add screenshot */}
+
Similarly, the Saturation for jobs by Cube Store workers and Wait time for jobs by Cube Store workers charts show if Cube Store Workers are overloaded with serving **jobs**, i.e., building pre-aggregations or performing internal tasks such as data compaction.
-{/* TODO: Add screenshot */}
+For example, the following chart shows a misconfigured deployment that uses
+8 Cube Store workers and keeps them at full saturation during prolonged
+intervals, resulting in significant wait times and, in the case of jobs,
+delayed refreshes of pre-aggregations:
+
+
+
+The next chart shows that oversaturated Cube Store workers might yield
+hours of wait time for queries and jobs:
+
+
+
+You can use these charts to **fine-tune the [number of Cube Store
+workers][ref-scalability-cube-store]** used by your deployment, e.g.,
+increase it until you see that there's no saturation and no wait time
+for queries and jobs.
+
-You can use these charts to consider fine-tuning the number of Cube Store
-workers used by your deployment.
\ No newline at end of file
+[ref-scalability-api]: /product/deployment/cloud/scalability#auto-scaling-of-api-instances
+[ref-scalability-cube-store]: /product/deployment/cloud/scalability#sizing-cube-store-workers
+[ref-auto-sus]: /product/deployment/cloud/auto-suspension
+[ref-dev-instance]: /product/deployment/cloud/deployment-types#development-instance
+[ref-prod-cluster]: /product/deployment/cloud/deployment-types#production-cluster
+[ref-multi-cluster]: /product/deployment/cloud/deployment-types#production-multi-cluster
+[ref-pre-aggregations]: /product/caching/using-pre-aggregations
+[ref-multitenancy]: /product/configuration/advanced/multitenancy
+[ref-context-to-app-id]: /reference/configuration/config#context_to_app_id
\ No newline at end of file
diff --git a/docs/pages/product/workspace/sso.mdx b/docs/pages/product/workspace/sso.mdx
index 1af80dc4bee02..08830b7ac237b 100644
--- a/docs/pages/product/workspace/sso.mdx
+++ b/docs/pages/product/workspace/sso.mdx
@@ -3,7 +3,7 @@ redirect_from:
  - /workspace/sso/
---
-# Single Sign-On
+# Single Sign-on
As an account administrator, you can manage how your team accesses Cube Cloud. There are options to log in using email and password, a GitHub account, or a